Extend dataset source to music datasets #110
Comments
Thanks for bringing this to our attention. We already have some music/noise datasets in our collection, for example GTZAN. So we do not have anything against music. We just have not worked that much with it, therefore the selection is significantly smaller. Speaking for myself: I am open to collaboration. Increasing the number of dependencies is my least favourite option, though. We already have too many and it increases the complexity for us and our users.
cc @lostanlen @magdalenafuentes Hey @aahlenst! We're also open to collaboration. For some context, the goal of mirdata is to act a bit like sklearn.datasets but for music. mirdata is much less standardized than sklearn.datasets or audiomate because we're supporting many different tasks and task definitions. We've converged on supporting
On our side, we don't have any plans to go beyond music datasets, and so far it seems like mirdata and audiomate are quite complementary. @faroit I'm curious to hear how you see these two libraries interacting (if at all), or how we can better support the use cases that audiomate provides and we don't.
@rabitt sorry for the slow response. I do not have a strong opinion about how to collaborate. I am currently only reviewing this package, and to me there is a significant overlap between mirdata and this project that should be noted somehow. I think the minimum solution would be a statement in both projects listing related data-loading Python packages, with each project mentioning the other. It may be worth discussing further steps in the future, but I would encourage the project owners @ynop and @rabitt/@lostanlen @magdalenafuentes to discuss this directly.
If anyone wants to work on this: I'd create a separate module that depends on both libraries. This can either live here or in its own repository. This shields both projects from additional dependencies. If guidance is needed or infrastructure (interfaces, methods, ...) is missing on audiomate's side, please let us know.
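To make the "separate module" idea concrete, here is a minimal sketch of what such a bridge could look like. It is not taken from either project: `bridge_tracks`, `FakeTrack`, and `FakeCorpus` are hypothetical names, and the sketch only assumes that the music-side track objects expose an `audio_path` attribute and that the corpus side offers `new_file(path, track_idx)` and `new_utterance(utterance_idx, track_idx)` methods. Stand-in classes are used so the example runs without installing either library.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Mapping, Optional


def bridge_tracks(tracks: Mapping[str, Any], corpus: Any) -> List[str]:
    """Register every track that has an audio file with the corpus.

    Assumptions (hypothetical, not confirmed by either project):
    - `tracks` maps track IDs to objects with an `audio_path` attribute.
    - `corpus` has `new_file(path, track_idx)` and
      `new_utterance(utterance_idx, track_idx)` methods.
    Returns the IDs that were added, in sorted order.
    """
    added = []
    for track_id in sorted(tracks):
        path = getattr(tracks[track_id], "audio_path", None)
        if not path:
            continue  # metadata-only entry: there is no audio to register
        corpus.new_file(path, track_id)
        # One utterance spanning the whole file, reusing the track ID.
        corpus.new_utterance(track_id, track_id)
        added.append(track_id)
    return added


# Stand-ins so the sketch runs without either library installed.
@dataclass
class FakeTrack:
    audio_path: Optional[str] = None


@dataclass
class FakeCorpus:
    files: Dict[str, str] = field(default_factory=dict)
    utterances: Dict[str, str] = field(default_factory=dict)

    def new_file(self, path: str, track_idx: str) -> None:
        self.files[track_idx] = path

    def new_utterance(self, utterance_idx: str, track_idx: str) -> None:
        self.utterances[utterance_idx] = track_idx


if __name__ == "__main__":
    demo_corpus = FakeCorpus()
    demo_tracks = {"t2": FakeTrack("/data/t2.wav"), "t1": FakeTrack("/data/t1.wav")}
    print(bridge_tracks(demo_tracks, demo_corpus))
```

Because the function only relies on duck typing, the bridge module would be the single place that imports both libraries, so neither project would need the other as a dependency.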
Hey @aahlenst! Thanks for taking the initiative on this, we're happy to help. Also, thanks @faroit for pointing this out. Agreed, we should discuss the best ways for both projects to co-exist and hopefully enhance each other. The idea of a separate module sounds good, though could you explain a little more about what you have in mind?
Given the package name __audio__mate (instead of speechmate ;-)), it would be great if music datasets could also be included here. Many of the music datasets were already listed in #44. For music datasets, another Python package called mirdata already exists. It comes with fewer features (e.g. no processing), as it focuses more on metadata than on audio loading. However, it would be great to find a way for both packages to co-exist.
One way could be, for example, to make mirdata a dependency of audiomate to load these additional datasets without duplicating code.
This issue should just trigger a discussion here, so I would like to include @rabitt @lostanlen @magdalenafuentes here