Yeah, the proposed fix would unfortunately require a change to transformers. How mllama was implemented differs very slightly from the conventions in other transformers modeling files.
🐛 Describe the bug
Instead of only patching the transformers mllama module (`transformers.models.mllama.modeling_mllama`), `apply_liger_kernel_to_mllama` modifies `torch.nn.LayerNorm` globally. The issue is here.
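For context, the leak is plain Python module aliasing: because modeling_mllama does `from torch import nn`, the attribute `modeling_mllama.nn` is the torch.nn module object itself, so assigning to `modeling_mllama.nn.LayerNorm` rebinds the global symbol. A minimal sketch of this, assuming transformers >= 4.45 (where mllama exists); `SentinelLayerNorm` is just a stand-in class for illustration, not Liger's replacement:

```python
# Sketch of the aliasing problem (illustrative; not Liger's actual code).
# Assumes transformers >= 4.45 so that the mllama modeling file exists.
import torch
from transformers.models.mllama import modeling_mllama

# `modeling_mllama.nn` is not a private copy; it is the torch.nn module itself.
print(modeling_mllama.nn is torch.nn)  # True

# So a "module-scoped" assignment like the one in the mllama patch...
SentinelLayerNorm = type("SentinelLayerNorm", (torch.nn.LayerNorm,), {})
original = torch.nn.LayerNorm
modeling_mllama.nn.LayerNorm = SentinelLayerNorm

# ...is visible to every user of torch.nn.LayerNorm in the process.
print(torch.nn.LayerNorm is SentinelLayerNorm)  # True: the patch leaked globally

torch.nn.LayerNorm = original  # restore so the demo has no side effects
```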
The fix would be to:
(1) not patch LayerNorm in Liger by assigning to `modeling_mllama.nn.LayerNorm`;
(2) change `transformers.models.mllama.modeling_mllama` to not use `from torch import nn` for this and to instead import LayerNorm directly, like `from torch.nn import LayerNorm`;
(3) instead patch LayerNorm in Liger by assigning to `modeling_mllama.LayerNorm` (both sides of this change are sketched below).
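A rough sketch of what the two-sided change could look like. This is not the actual patch in either repository; in particular, the import path `liger_kernel.transformers.layer_norm.LigerLayerNorm` is assumed here and may differ in the real Liger API.

```python
# Sketch only; names and import paths are assumptions, not a tested patch.

# (a) transformers/models/mllama/modeling_mllama.py: expose LayerNorm as a
#     module-level name instead of reaching it through the shared torch.nn alias,
#     and construct layers with `LayerNorm(...)` rather than `nn.LayerNorm(...)`.
from torch.nn import LayerNorm

# (b) Liger's apply_liger_kernel_to_mllama: rebind only the module-local name.
from liger_kernel.transformers.layer_norm import LigerLayerNorm  # assumed path
from transformers.models.mllama import modeling_mllama

modeling_mllama.LayerNorm = LigerLayerNorm  # scoped to mllama; torch.nn untouched
```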
Reproduce
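The report leaves this section empty. A minimal reproduction consistent with the description might look like the following (hypothetical; it assumes the default arguments of `apply_liger_kernel_to_mllama` enable the LayerNorm patch):

```python
# Hypothetical reproduction reconstructed from the description; not from the report.
import torch
from liger_kernel.transformers import apply_liger_kernel_to_mllama

original_layer_norm = torch.nn.LayerNorm
apply_liger_kernel_to_mllama()  # defaults assumed to include the LayerNorm patch

# Expected: torch.nn.LayerNorm is untouched outside of mllama.
# Observed per this report: the global symbol has been replaced.
print(torch.nn.LayerNorm is original_layer_norm)
```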
Versions
Environment Report:
Operating System: Linux-6.1.85+-x86_64-with-glibc2.35
Python version: 3.10.12
PyTorch version: 2.4.1+cu121
CUDA version: Not available
Triton version: 3.1.0
Transformers version: 4.45.0