Update README.md
dobraczka authored Mar 13, 2024
1 parent 1c9dee7 commit 5ee966d
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -1,6 +1,6 @@
# !! Update 2024-02-24 (fixed in 1.1.0) !!
We found that `ent_links` in some cases contained intra-dataset links, which is not immediately noticeable to the user.
- Another round of clerical review was performed, transitive links, which were previously missed are added and the `ent_links` files now only contain entity links _between_ the datasets. The `721_5fold` directories have been adapted accordingly.
+ Another round of clerical review was performed, (transitive) links, which were previously missed are added and the `ent_links` files now only contain entity links _between_ the datasets. The `721_5fold` directories have been adapted accordingly.
The intra-dataset links are now in `{dataset_name}_intra_ent_links` for each of the three datasets.
What might also not be immediately obvious is that this dataset can be used as a multi-source entity resolution task.
We therefore provide a `multi_source_cluster` file in which each line consists of a cluster id and the comma-separated cluster members from the three datasets; a cluster can contain multiple entries from a single dataset.
@@ -60,7 +60,7 @@ For the binary cases each dataset has a `cluster` file in the respective folder.
For the multi-source setting, you can use the `multi_source_cluster` file in the `data` folder.
Using [`sylloge`](https://github.com/dobraczka/sylloge) you can also easily load this dataset as a multi-source task:

```
```python
from sylloge import MovieGraphBenchmark
ds = MovieGraphBenchmark(graph_pair='multi')
```
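
If you prefer to read the `multi_source_cluster` file directly, the line format described above (a cluster id followed by comma-separated members) could be parsed roughly as follows. This is only a sketch: the README does not state how the cluster id is separated from the members, so the tab separator (`id_sep`), the helper name `parse_clusters`, and the sample lines are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def parse_clusters(lines: Iterable[str], id_sep: str = "\t") -> Dict[str, List[str]]:
    """Map each cluster id to its list of members.

    ASSUMPTION: the cluster id is separated from the members by `id_sep`
    (a tab by default). The README only says the members themselves are
    comma-separated, so adjust `id_sep` to match the actual file.
    """
    clusters: Dict[str, List[str]] = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split off the cluster id; everything after it is the member list.
        cluster_id, _, members = line.partition(id_sep)
        clusters[cluster_id] = [m.strip() for m in members.split(",") if m.strip()]
    return clusters


# Made-up sample lines, not taken from the real file.
sample = ["0\timdb_a,tmdb_b,tvdb_c", "1\timdb_d,imdb_e,tmdb_f"]
clusters = parse_clusters(sample)
```

To read the real file, pass an open file handle instead of `sample`, for example `parse_clusters(open(path, encoding="utf-8"))`, where `path` points to your local copy.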