google/gemma-2-27b-it QLORA #1743
Hey, sorry for the late response. Since it's QLoRA, the impact on the weights is not as large as with LoRA or full fine-tuning. Have you tried those other options?
Hi everyone,
I tried QLoRA fine-tuning gemma-2-27b with ChatML and flash-attention yesterday, and the resulting model seems confused, even though the loss went down and everything looked smooth overall. Could someone please share tips and tricks for fine-tuning this model?
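For reference, a QLoRA setup for this model would typically look something like the sketch below (assuming the Hugging Face transformers + peft + bitsandbytes stack; the rank, alpha, and target-module choices are illustrative, not a known-good recipe). Note that early FlashAttention 2 builds did not support Gemma 2's attention logit soft-capping, which is one possible cause of confused outputs; the Hugging Face docs recommended eager attention for this model.

```python
# QLoRA config sketch for gemma-2-27b-it (illustrative values, not a verified recipe)
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization for the frozen base weights
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-27b-it",
    quantization_config=bnb_config,
    # Gemma 2 uses attention logit soft-capping; eager attention was the
    # recommended implementation for training when flash-attn lacked support.
    attn_implementation="eager",
    torch_dtype=torch.bfloat16,
)

# LoRA adapters on the attention and MLP projections (module names per the
# Gemma architecture; r/alpha/dropout are illustrative defaults)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

This is a configuration fragment only; the training loop (e.g. TRL's `SFTTrainer` with a ChatML-formatted dataset) would go on top of it.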