Replies: 2 comments
-
We have a QLoRA + FSDP config. You can give it a try, but change the model: https://github.com/axolotl-ai-cloud/axolotl/blob/main/examples/deepseek-v2/qlora-fsdp-2_5.yaml . I'm not sure whether we have tested that MoE model yet, or whether it requires any specific changes.
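On the minimum-hardware question, a rough back-of-the-envelope sketch may help with sizing. It assumes 4 bits per parameter for the quantized base weights and an even FSDP shard across GPUs, and it ignores quantization constants, layers kept in higher precision, LoRA adapters, optimizer state, and activations, so real usage will be noticeably higher. The function names are made up for illustration and are not part of axolotl.

```python
def quantized_weight_gb(n_params: float, bits_per_param: float = 4.0) -> float:
    """Approximate size of the quantized base weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


def per_gpu_weight_gb(n_params: float, n_gpus: int, bits_per_param: float = 4.0) -> float:
    """Weights-only share per GPU, assuming FSDP shards the model evenly."""
    return quantized_weight_gb(n_params, bits_per_param) / n_gpus


# 236B total parameters, as stated in the question.
total = quantized_weight_gb(236e9)
print(f"4-bit weights, total: {total:.0f} GB")
for n in (4, 8, 16):
    print(f"{n} GPUs: {per_gpu_weight_gb(236e9, n):.1f} GB of weights each")
```

Treat the result as a weights-only figure for a first sanity check, not as a tested minimum for this model.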
-
I would also love to see an example that works. I've been unsuccessful at this.
-
Thank you for the great repo.
Do you have any suggestions on how to finetune DeepSeek-Coder v2 236B?
Is it possible to do FSDP + QLoRA for this MoE model? What are the minimum requirements for finetuning?