BUCC fails #1694
Comments
It would work without
I tried running it with pre-#1674 (
I think the second parameter of golden is maybe supposed to be sentence2, but then it is never used?
I agree. Could we try simply using sentence1 and sentence2 as is, and compare that score to the leaderboard? Separately, there's also BUCC.v2. Maybe this would supersede BUCC?
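To make the "use sentence1 and sentence2 as is" idea concrete, here is a rough sketch of what scoring aligned pairs could look like. It assumes row i of sentence1 is the translation of row i of sentence2, and that embeddings are already computed. The function names (`mine_pairs`, `pair_accuracy`) and the toy vectors are made up for illustration; this is not the mteb evaluator, only a nearest-neighbour check by cosine similarity.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mine_pairs(emb1, emb2):
    """For each vector in emb1, return the index of its best match in emb2."""
    matches = []
    for u in emb1:
        scores = [cosine(u, v) for v in emb2]
        matches.append(max(range(len(scores)), key=scores.__getitem__))
    return matches


def pair_accuracy(matches):
    """Fraction of rows whose best match is the row with the same index.

    Assumption: sentence1[i] and sentence2[i] form the gold pair.
    """
    if not matches:
        return 0.0
    hits = sum(1 for i, j in enumerate(matches) if i == j)
    return hits / len(matches)


# Toy embeddings standing in for encoded sentence1 / sentence2 columns.
sentence1_emb = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.0], [0.2, 0.1, 1.0]]
sentence2_emb = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.9]]

matches = mine_pairs(sentence1_emb, sentence2_emb)
print(matches, pair_accuracy(matches))
```

With real data the two lists would come from encoding the sentence1 and sentence2 columns with the model under test, and the resulting score is what we would compare against the leaderboard number.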
Oh I see, thanks for explaining. After taking a look at the paper, the MTEB dataset seems to contain only "gold" sets. e.g. de-en has 9580 rows. This actually leads me to believe that this is the train split, and not the test split. Nonetheless, as this contains cross-lingual pairs, to leave this as a bitext mining task, we can remove