Assertion Error when Setting pipe_parallel_size or model_parallel_size in GPT-NeoX #1251
Labels
bug
Something isn't working
Comments
What happens if you add
Hi @StellaAthena! Is this error caused by enabling both parallelism and ZeRO Stage 3 at the same time?
ZeRO-3 with PP > 1 should error, but I'm surprised it would error like this. Does it go away if you use ZeRO-1?
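The combination discussed in the comments above can be checked before launching a run. The following is a minimal sketch, not part of GPT-NeoX or DeepSpeed: the function name check_parallel_zero_compat is hypothetical, and it assumes the YAML config has already been loaded into a dict with a top-level pipe_parallel_size key and a zero_optimization mapping containing a stage entry. The thread only states that ZeRO-3 with PP > 1 is expected to error and suggests trying ZeRO-1; treating stage 2 the same way as stage 3 is an assumption, so the threshold is left as a parameter.

```python
from typing import List


def check_parallel_zero_compat(config: dict, max_stage_with_pipeline: int = 1) -> List[str]:
    """Return a list of problems found; an empty list means none were detected.

    Hypothetical helper. The key names and the default threshold are
    assumptions, not taken from the GPT-NeoX or DeepSpeed source.
    """
    problems = []

    # Assumed key names; a missing key is treated as "feature disabled".
    pp = int(config.get("pipe_parallel_size", 0))
    zero_stage = int(config.get("zero_optimization", {}).get("stage", 0))

    # Flag pipeline parallelism (PP > 1) combined with a ZeRO stage above
    # the allowed threshold (ZeRO-1 by default, per the suggestion in the thread).
    if pp > 1 and zero_stage > max_stage_with_pipeline:
        problems.append(
            f"pipe_parallel_size={pp} with ZeRO stage {zero_stage}: "
            f"try ZeRO stage <= {max_stage_with_pipeline} or pipe_parallel_size <= 1"
        )
    return problems


if __name__ == "__main__":
    example = {"pipe_parallel_size": 2, "zero_optimization": {"stage": 3}}
    for problem in check_parallel_zero_compat(example):
        print(problem)
```

Running such a check on the loaded config before starting training would surface the conflict as a readable message rather than an assertion deep in the stack. It does not cover model_parallel_size, which the thread does not identify as conflicting with ZeRO.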
Hello,
I am encountering an issue with the GPT-NeoX library. When I set either pipe_parallel_size or model_parallel_size to 2, I get the following assertion error:
I am trying to enable parallelism, but this error is preventing me from proceeding.
Here are some details about my setup:
Below is the content of my 2-7B.yml configuration file:
I am using the following command to start the training:
I would appreciate any guidance or suggestions on how to resolve this issue.
Thank you!