Open
Description
It seems that using Muon on its own (zero0) does not work either. I also tested DeepSpeed zero1 and zero2, and every configuration fails with the same error. Please consider supporting the Muon optimizer alongside DeepSpeed.
swift sft \
--deepspeed zero0 \
--optimizer muon \
...
[rank0]: File "/root/.cache/modelscope/Moonlight/examples/toy_train.py", line 60, in zeropower_via_newtonschulz5
[rank0]: assert len(G.shape) == 2
[rank0]: ^^^^^^^^^^^^^^^^^
[rank0]: AssertionError
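The assertion shows that `zeropower_via_newtonschulz5` accepts only 2-D matrices. One plausible cause, which the traceback alone does not confirm, is that a non-2-D parameter (a bias, a norm weight, or a flattened tensor) reached the Muon update. Below is a minimal sketch of partitioning parameters by shape before building the optimizer. The helper name `split_params_for_muon` and the example parameter names are hypothetical and are not part of swift, DeepSpeed, or Moonlight.

```python
from typing import Iterable, List, Tuple

Shape = Tuple[int, ...]


def split_params_for_muon(
    named_shapes: Iterable[Tuple[str, Shape]],
) -> Tuple[List[str], List[str]]:
    """Partition parameter names by shape (hypothetical helper).

    Names with exactly 2-D shapes go to the Muon group, because the
    assertion in the traceback requires len(G.shape) == 2. Everything
    else goes to a fallback group (for example, one handled by AdamW).
    """
    muon_names: List[str] = []
    fallback_names: List[str] = []
    for name, shape in named_shapes:
        if len(shape) == 2:
            muon_names.append(name)
        else:
            fallback_names.append(name)
    return muon_names, fallback_names


# Illustrative shapes only; these names do not come from the issue.
example = [
    ("layers.0.attn.q_proj.weight", (64, 64)),
    ("layers.0.attn.q_proj.bias", (64,)),
    ("layers.0.norm.weight", (64,)),
    ("conv.weight", (8, 3, 3, 3)),
]
muon_names, fallback_names = split_params_for_muon(example)
print("muon:", muon_names)
print("fallback:", fallback_names)
```

With a PyTorch model, the input could be built as `((n, tuple(p.shape)) for n, p in model.named_parameters())`. Logging the names in the fallback group would show which tensors would trip the assertion if they were handed to Muon.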
alhong1-lab