Add MinoanER datasets #10

Open
dobraczka opened this issue Apr 19, 2023 · 2 comments
Comments

@dobraczka
Owner

Presented in this paper, link to datasets

@dobraczka
Owner Author

The dataset has some problems. For each source, the entities have been replaced by ids, but only in the subject position. For example, in the DBpedia part of D5, <dbp:eric_r%c3%bccker_eddison> has the id 1692349, yet triples like 738353 <dbo:author> <dbp:eric_r%c3%bccker_eddison> still exist, with the object left unreplaced. Even more problematic is that the ground truth does not use the URL encoding: the encoded characters are missing entirely and mixed case is used. For D5, this is the respective line in the ground truth: <http://data.archiveshub.ac.uk/id/person/othersource/eddisonericrucker1882-1945author> <dbp:Eric_Rcker_Eddison>.
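A possible workaround, as a minimal sketch only: resolve objects through the same id mapping as subjects, and compare dataset and ground-truth names via a normalised key (percent-decode, drop non-ASCII characters, lower-case). The helper names `match_key` and `resolve_objects` are hypothetical, and the assumption that the ground truth simply dropped every non-ASCII character is based on this single D5 example; dropping characters can also make distinct entities collide.

```python
from urllib.parse import unquote


def match_key(term: str) -> str:
    """Build a comparison key from a prefixed term such as "<dbp:Some_Name>".

    Assumption: the ground truth lost all non-ASCII characters and kept
    mixed case, so we decode, drop non-ASCII, and lower-case both sides.
    """
    local = term.strip("<>").split(":", 1)[-1]   # "<dbp:foo>" -> "foo"
    decoded = unquote(local)                      # "%c3%bc" -> "ü"
    ascii_only = decoded.encode("ascii", errors="ignore").decode("ascii")
    return ascii_only.lower()


def resolve_objects(triples, ids):
    """Replace object terms that have a known id; leave other objects as-is."""
    return [(s, p, ids.get(o, o)) for s, p, o in triples]


# Values taken from the D5 example in this comment.
ids = {"<dbp:eric_r%c3%bccker_eddison>": "1692349"}
triples = [("738353", "<dbo:author>", "<dbp:eric_r%c3%bccker_eddison>")]

resolved = resolve_objects(triples, ids)
same = match_key("<dbp:eric_r%c3%bccker_eddison>") == match_key("<dbp:Eric_Rcker_Eddison>")
```

With these inputs the object is rewritten to its id and the two spellings of the Eddison entity produce the same key.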

@dobraczka
Owner Author

Another problem is that the ground truth contains entities that do not show up in the dataset. For example, for D5 the line <http://data.archiveshub.ac.uk/id/person/ncarules/streetarthurgeorge1892-1966farmerauthorandjournalist> <dbp:A._G._Street> contains <dbp:A._G._Street>, which no longer shows up anywhere in the dataset, regardless of case.
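To get a sense of how many ground-truth pairs are affected, a check along these lines could be run. This is a rough sketch: `match_key` and `unmatched_ground_truth` are hypothetical helpers, the inputs are assumed to be already parsed into Python tuples and strings, and the normalisation (percent-decode, drop non-ASCII, lower-case) is a guess at how the ground-truth names were derived.

```python
from urllib.parse import unquote


def match_key(term: str) -> str:
    """Comparison key for a prefixed term such as "<dbp:Some_Name>"."""
    local = term.strip("<>").split(":", 1)[-1]
    decoded = unquote(local)
    return decoded.encode("ascii", errors="ignore").decode("ascii").lower()


def unmatched_ground_truth(pairs, dataset_terms):
    """Return ground-truth pairs whose DBpedia side has no match in the dataset."""
    known = {match_key(t) for t in dataset_terms}
    return [(left, right) for left, right in pairs if match_key(right) not in known]


# Ground-truth lines quoted in this issue (D5).
pairs = [
    ("<http://data.archiveshub.ac.uk/id/person/othersource/eddisonericrucker1882-1945author>",
     "<dbp:Eric_Rcker_Eddison>"),
    ("<http://data.archiveshub.ac.uk/id/person/ncarules/streetarthurgeorge1892-1966farmerauthorandjournalist>",
     "<dbp:A._G._Street>"),
]
# Stand-in for the set of DBpedia terms actually present in the dataset.
dataset_terms = ["<dbp:eric_r%c3%bccker_eddison>"]

missing = unmatched_ground_truth(pairs, dataset_terms)
```

Here only the `<dbp:A._G._Street>` pair ends up in `missing`; on the real data, `dataset_terms` would need to include terms from both subject and object positions.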
