Workload: CSV Summary Statistics + plotting #92

Closed
enricorotundo opened this issue Apr 25, 2023 · 0 comments · Fixed by #104
enricorotundo commented Apr 25, 2023

Interesting one here: Parquet doesn't have a registered MIME type yet. I wonder if Tika can parse it?

Metadata | predicate [text/csv | application/parquet] -> Load and produce summaries of data -> Merge
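
A rough sketch of what that pipeline could look like, using only the Python standard library. Everything here is an assumption made for illustration: the names `EXTENSION_TYPES`, `guess_type`, `summarize_csv` and `run` are hypothetical, type detection is a plain extension lookup standing in for a Tika-style detector, and `application/parquet` is just the unregistered label used above. Parquet files pass the predicate but are not loaded, since reading them would need a third-party library.

```python
import csv
import io
import os
import statistics

# Hypothetical extension lookup; "application/parquet" is not a registered type.
EXTENSION_TYPES = {".csv": "text/csv", ".parquet": "application/parquet"}
ACCEPTED_TYPES = {"text/csv", "application/parquet"}


def guess_type(filename):
    """Metadata step: map a file name to a MIME type, or None if unknown."""
    _, ext = os.path.splitext(filename.lower())
    return EXTENSION_TYPES.get(ext)


def summarize_csv(text):
    """Load step: count/mean/min/max for every column that is fully numeric."""
    rows = list(csv.DictReader(io.StringIO(text)))
    summary = {}
    if not rows:
        return summary
    for column in rows[0].keys():
        try:
            values = [float(row.get(column)) for row in rows]
        except (TypeError, ValueError):
            continue  # non-numeric or missing values: skip this column
        summary[column] = {
            "count": len(values),
            "mean": statistics.mean(values),
            "min": min(values),
            "max": max(values),
        }
    return summary


def run(files):
    """Predicate -> load and summarize -> merge into one dict keyed by file name."""
    merged = {}
    for name, text in files.items():
        mime = guess_type(name)
        if mime not in ACCEPTED_TYPES:
            continue  # predicate rejects everything else
        if mime == "text/csv":
            merged[name] = summarize_csv(text)
        # Parquet is accepted by the predicate but not loaded in this sketch.
    return merged
```

Calling `run({"a.csv": "x,y\n1,a\n3,b\n"})` would give one entry for `a.csv` with statistics for the numeric column `x` only.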

Ideas:


From issue #26

@enricorotundo enricorotundo self-assigned this Apr 25, 2023
@enricorotundo enricorotundo changed the title Workload: CSV Summary Statistics Workload: CSV Summary Statistics + plotting Apr 25, 2023
@enricorotundo enricorotundo linked a pull request Apr 27, 2023 that will close this issue
enricorotundo added a commit that referenced this issue Apr 27, 2023
* feat(csv): data profiler for dataset statistics

* feat(csv): workload to validate and profile CSV files - closes #92