More data sources #1575

Open
48 tasks
henrykironde opened this issue Apr 5, 2021 · 5 comments

Comments

@henrykironde
Contributor

henrykironde commented Apr 5, 2021

@pri1311

pri1311 commented Feb 10, 2022

Hey @henrykironde! Would love to start contributing, and I believe adding datasets might be a good place to start. Could I pick one up from the lot or would you be assigning any particular one?

@henrykironde
Contributor Author

Hi @pri1311, feel free to pick any data source. Let me know if you need any clarification.

@pri1311

pri1311 commented Feb 12, 2022

Let me know in case you need any clarification.

I have added a simple dataset for now to get a basic idea of the repository. If the PR is merged or approved, I will move on to more datasets. I am also particularly interested in a separate open issue: adding support for sequence data.

@pri1311

pri1311 commented Feb 14, 2022

Also, I have one small question. I was going through some of the JSON files in the retriever-recipes repository, and a lot of Kaggle datasets are included. But since Kaggle lets you download the test and train data all at once as a single zip file, how will those be added to this package? (I ask because I saw Kaggle mentioned as one of the data sources here.)

@henrykironde
Contributor Author

@pri1311 for sequence data, I have not found suitable sources yet, but you can go for it.

since Kaggle allows downloading test and train data all at once as a zip file,

That is a good case, since we download all the data using one URL. We then either extract all the files or extract a particular file. Check out the JSON files that use extract for some examples: https://github.com/weecology/retriever-recipes/search?q=extract.
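
To make the idea concrete, here is a minimal sketch of the "one URL, then extract everything or just one file" pattern, assuming a plain zip archive and only the Python standard library. It is not the retriever's own code: `download_archive`, `extract_files`, and `make_demo_archive` are hypothetical names used for illustration, and the demo data is made up.

```python
import io
import os
import tempfile
import urllib.request
import zipfile


def download_archive(url):
    """Fetch the whole archive from a single URL and return its bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def extract_files(archive_bytes, dest_dir, members=None):
    """Extract every file, or only the named members, from a zip archive.

    Returns the sorted list of member names that were extracted.
    """
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        names = archive.namelist() if members is None else list(members)
        archive.extractall(path=dest_dir, members=names)
    return sorted(names)


def make_demo_archive():
    """Build an in-memory zip holding train.csv and test.csv (made-up data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("train.csv", "id,label\n1,a\n")
        archive.writestr("test.csv", "id\n2\n")
    return buffer.getvalue()


if __name__ == "__main__":
    # The demo archive stands in for download_archive(url), so no network is needed.
    data = make_demo_archive()
    with tempfile.TemporaryDirectory() as dest:
        extract_files(data, dest, members=["train.csv"])
        print(os.listdir(dest))
```

Passing `members=None` extracts the whole archive, while passing a list of names pulls out only those files, which mirrors the two options described above.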

Let me know in case you have more issues or need clarification.
