Missing License #37

Open
salisaresama opened this issue Aug 25, 2020 · 1 comment

Comments

@salisaresama

Dear authors of the annotated corpus,

The data you have shared is quite interesting. Could I ask under what license it is released? I was not able to find any clear statement apart from "released for the purpose of contributing to the research of natural language processing". Is it for research purposes only, or can it also be used to train models that are used commercially?

Thank you in advance for your answers!

V.

@polm

polm commented Aug 27, 2020

I am not a maintainer of this project, but this has come up before, and I believe it's impossible for them to release the corpus under any ordinary license because of how it was collected. From the README:

Since the collected documents are fragmentary, i.e., only the lead three sentences of each Web document, we have not obtained permission from copyright owners of the Web documents and do not provide source information such as URL. If copyright owners of Web documents request addition of source information or deletion of these documents, we will update the corpus and newly release it. In this case, please delete the downloaded old version and replace it with the new version.
