fix: allow kwargs in init for RerankingWrapper #1676

Merged: 3 commits, Jan 9, 2025

@Samoed (Collaborator) commented Jan 1, 2025

Checklist

  • Run tests locally to make sure nothing is broken using make test.
  • Run the formatter to format the code using make lint.

Closes #1671

  • When running from the CLI, the device is passed to the model, but RerankingWrapper didn’t account for it.
  • In reranking tasks, there are no instructions for each sentence, so None is appended to the end of the query.
  • The predict function initially expected query, passage, and instruction every time, but now it checks the number of parameters passed.
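
The three changes above can be sketched roughly as follows. This is a minimal illustration, not the actual code from this PR: the class name `SketchRerankingWrapper`, its parameters, and the toy word-overlap score are all assumptions made up for the example.

```python
from __future__ import annotations

from typing import Any, Sequence


class SketchRerankingWrapper:
    """Hypothetical stand-in for a reranker wrapper (not the mteb class)."""

    def __init__(self, model_name: str, device: str | None = None, **kwargs: Any) -> None:
        # Accept a device and any extra keyword arguments a caller (such as a
        # CLI) may forward, instead of failing on unexpected names.
        self.model_name = model_name
        self.device = device
        self.extra_kwargs = kwargs

    def predict(self, inputs: Sequence[Sequence[Any]]) -> list[float]:
        scores = []
        for item in inputs:
            # Check how many values were passed instead of always
            # expecting (query, passage, instruction).
            if len(item) == 3:
                query, passage, instruction = item
            elif len(item) == 2:
                query, passage = item
                instruction = None
            else:
                raise ValueError(f"expected 2 or 3 values, got {len(item)}")

            # Toy scoring (assumption): count words shared by the query side
            # and the passage. A real reranker would call a model here.
            text = query if instruction is None else f"{query} {instruction}"
            overlap = set(text.lower().split()) & set(passage.lower().split())
            scores.append(float(len(overlap)))
        return scores


# Usage: extra keyword arguments are kept, and both input shapes work.
wrapper = SketchRerankingWrapper("toy-model", device="cpu", batch_size=8)
two_part = wrapper.predict([("red apple", "an apple is red")])
three_part = wrapper.predict([("fruit", "apple is tasty", "tasty")])
```

Branching on the length of each input keeps one code path for tasks with and without per-query instructions.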

@isaac-chung (Collaborator) left a comment

The command mentioned in the issue was mteb run -m jhu-clsp/FollowIR-7B -t TwitterHjerneRetrieval. Could you please run it and see whether the same error still shows up?

mteb/models/rerankers_custom.py (resolved)

@Samoed (Collaborator, Author) commented Jan 1, 2025

I tried running jinaai/jina-reranker-v2-base-multilingual. It initially failed with the same errors, but now it runs fine. Here's the result: TwitterHjerneRetrieval.json.

TwitterHjerneRetrieval is a retrieval task, so maybe we don't want to run it. I also fixed some additional errors. @orionw, could you take a look at the changes?

@Samoed requested a review from @orionw on January 1, 2025 20:49

@Samoed (Collaborator, Author) commented Jan 8, 2025

@orionw Can you take a look at the changes?

@orionw (Contributor) left a comment

Sorry for the delay, got lost in my inbox. Looks great to me, thanks for cleaning up so many of my previous errors/typos.

@@ -43,7 +43,7 @@ def corpus_to_str(
                else corpus["text"][i].strip()
                for i in range(len(corpus["text"]))
            ]
        elif isinstance(corpus, list) and isinstance(corpus[0], dict):

@orionw (Contributor):

What is the tuple again? I'm sure it's obvious, but I am blanking.

@Samoed (Collaborator, Author):

I think I added this to convert corpus_in_pair

corpus_in_pair = corpus_to_str(corpus_in_pair)

I can check it tomorrow

@Samoed (Collaborator, Author) commented Jan 9, 2025

Yes, this part was causing problems, so I converted it to a list.
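
As a hedged illustration of the point in this thread: a check like `isinstance(corpus, list)` is false for a tuple of documents, so the input has to be converted to a list first or the check has to accept both types. The helper below is hypothetical and is not the `corpus_to_str` implementation from mteb/models/rerankers_custom.py. The name `docs_to_str` and the `"title"` and `"text"` keys are assumptions for the example.

```python
from __future__ import annotations

from typing import Any, Sequence


def docs_to_str(corpus: Sequence[Any]) -> list[str]:
    """Hypothetical helper: turn a list or tuple of documents into strings."""
    # A tuple would fail an isinstance(corpus, list) check, so normalize first.
    if isinstance(corpus, tuple):
        corpus = list(corpus)

    if isinstance(corpus, list) and corpus and isinstance(corpus[0], dict):
        return [
            (doc["title"] + " " + doc["text"]).strip()
            if "title" in doc
            else doc["text"].strip()
            for doc in corpus
        ]

    # Anything else is assumed to already be plain strings.
    return [str(doc) for doc in corpus]


# Usage: a tuple of dicts and a list of dicts give the same result.
docs = ({"title": "A", "text": "b"}, {"text": " c "})
from_tuple = docs_to_str(docs)
from_list = docs_to_str(list(docs))
```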

@Samoed enabled auto-merge (squash) January 9, 2025 11:58
@Samoed merged commit f5962c6 into main Jan 9, 2025
10 checks passed
@Samoed deleted the fix_reranker_models branch January 9, 2025 12:03
Successfully merging this pull request may close these issues.

Reranker models do not work with cli
3 participants