
More information about the specific implementation of relative attention bias. #181

Open
buaaliyi opened this issue Feb 5, 2025 · 1 comment

Comments

buaaliyi commented Feb 5, 2025

Hi @jiaqizhai

There are some previous issues,

#148
#36

which discuss the specific implementation of the relative attention bias.
Regarding the exact rab_{p, t}(i, j) setup in this particular codebase: the relative timespan between tokens i and j is computed from timestamp[i] and timestamp[j + 1] (not timestamp[j]), and the relative positional gap between tokens i and j is computed as N - (j - i) (unlike the Google text-to-text transformers (T5) reference code, which uses j - i). I find this rab implementation puzzling; what is the advantage of these choices?
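For concreteness, here is a minimal sketch (plain PyTorch, not code from this repo) of how I am reading the index construction. The function and variable names are mine, the assumption that `timestamps` has length N + 1 is mine, and the bucketing of these values plus the lookup into the learned bias tables are omitted:

```python
import torch

def rab_indices_as_i_read_them(timestamps: torch.Tensor, N: int):
    """Sketch of the relative-bias index construction as I understand it.

    timestamps: shape (N + 1,), so that timestamps[j + 1] exists for every
    key position j < N (this is an assumption on my part).
    """
    i = torch.arange(N).unsqueeze(1)  # query positions, shape (N, 1)
    j = torch.arange(N).unsqueeze(0)  # key positions, shape (1, N)

    # Relative timespan: taken from timestamp[i] and timestamp[j + 1],
    # not timestamp[j].
    rel_time = timestamps[i] - timestamps[j + 1]  # shape (N, N)

    # Relative positional gap: N - (j - i), rather than the j - i used in
    # the Google text-to-text transformers (T5) reference code.
    rel_pos = N - (j - i)  # shape (N, N)

    return rel_time, rel_pos
```

If my reading of either step is wrong, please point out which one.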

Thank you in advance for any additional information about the implementation details.


buaaliyi commented Feb 5, 2025

In addition, is the rab for the ranking task the same as for the retrieval task above, or does it differ?
