
[QUESTION] Where does the attention_mask come from when the gpt_model is not the first or last pipeline stage? #983

Unanswered
janelu9 asked this question in Q&A
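
For context on what the question is asking: in pipeline-model parallelism, only hidden states are sent between stages, while the token ids (from which a mask is usually built) are read on the first stage; yet every stage's transformer layers still consume an attention_mask. As background only (this is not an answer from the maintainers and not Megatron-LM's actual implementation), below is a minimal PyTorch sketch of one common approach: rebuilding the causal mask locally on each stage, since it depends only on the sequence length. The names build_causal_mask, intermediate_stage_forward, and transformer_block are hypothetical and used purely for illustration.

```python
import torch

def build_causal_mask(seq_len: int, device: torch.device) -> torch.Tensor:
    # True marks positions to mask out (attention to future tokens).
    future = torch.triu(
        torch.ones(seq_len, seq_len, dtype=torch.bool, device=device),
        diagonal=1,
    )
    # Shape [1, 1, seq_len, seq_len] so it broadcasts over (batch, heads).
    return future.unsqueeze(0).unsqueeze(0)

def intermediate_stage_forward(hidden_states: torch.Tensor, transformer_block):
    # hidden_states arrive from the previous pipeline stage:
    # [batch, seq_len, hidden]. The mask needs no data from stage 0,
    # only the sequence length, so it can be rebuilt here.
    seq_len = hidden_states.size(1)
    attention_mask = build_causal_mask(seq_len, hidden_states.device)
    return transformer_block(hidden_states, attention_mask)
```
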

Replies: 0

Category: Q&A
Labels: None yet
1 participant
This discussion was converted from issue #861 on August 07, 2024 18:28.