Version:
transformers: 4.43.2
Megatron-LM: latest as of 2024-12-30

Input:
I run:

    sh run_mcore_qwen.sh dsw 7B 1 32 1e-5 1e-6 2048 2048 bf16 2 1 1 true false true false 100000 llava-datasets/LLaVA-Pretrain/wds llava-datasets/LLaVA-Pretrain/wds models/qwen2-vl-ckpts/Qwen__Qwen2-VL-7B-Instruct-tp2pp1 20000 200 output_mcore_qwen2vl_pretrain

Output:
[rank0]: TypeError: Attention.forward() got an unexpected keyword argument 'attention_bias'

The error is raised from Megatron-LM-main/megatron/core/transformer/transformer_layer.py. (Traceback screenshot omitted.)
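For context, this TypeError is the classic symptom of an API-version mismatch: the calling code passes a newer keyword (attention_bias) into an Attention.forward() that predates it. Below is a minimal, self-contained sketch of the failure mode and a defensive kwarg-filtering workaround; the class and helper names are hypothetical illustrations, not Megatron-LM's actual code.

    import inspect

    # Hypothetical stand-in for an older attention module whose forward()
    # does not yet accept the newer 'attention_bias' keyword.
    class OldAttention:
        def forward(self, hidden_states, attention_mask=None):
            return hidden_states  # placeholder computation

    def call_filtering_unknown_kwargs(module, *args, **kwargs):
        """Drop any kwargs that module.forward() does not declare.

        This only illustrates the mismatch; the real fix is to use
        matching versions of Megatron-LM and the calling framework.
        """
        accepted = inspect.signature(module.forward).parameters
        filtered = {k: v for k, v in kwargs.items() if k in accepted}
        return module.forward(*args, **filtered)

    attn = OldAttention()

    # A direct call reproduces the reported error:
    # attn.forward("hidden", attention_mask=None, attention_bias=None)
    #   -> TypeError: forward() got an unexpected keyword argument 'attention_bias'

    # The filtering wrapper drops the unknown keyword instead:
    out = call_filtering_unknown_kwargs(
        attn, "hidden", attention_mask=None, attention_bias=None
    )
    print(out)  # -> "hidden"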
Please use Megatron-LM-241113
Now I get a different error:

    raise Exception("No dot product attention support for the provided inputs!")

It is OK when I use FlashAttention.
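That exception is raised by TransformerEngine's DotProductAttention when none of its backends (flash, fused, unfused) supports the given input combination (dtype, head dim, mask type, etc.). Since FlashAttention works here, one way to debug is to control backend selection explicitly. A hedged sketch, assuming the NVTE_* environment toggles present in the TransformerEngine releases I have seen; exact behavior varies by TE version, so verify against your install:

    import os

    # Set these BEFORE transformer_engine is imported; TE reads them at
    # import time to decide which attention backends are candidates.
    os.environ.setdefault("NVTE_FLASH_ATTN", "1")    # allow FlashAttention backend
    os.environ.setdefault("NVTE_FUSED_ATTN", "0")    # disable cuDNN fused backend
    os.environ.setdefault("NVTE_UNFUSED_ATTN", "1")  # keep the unfused fallback

    # Newer TE releases can log which backend is picked and why:
    os.environ.setdefault("NVTE_DEBUG", "1")
    os.environ.setdefault("NVTE_DEBUG_LEVEL", "2")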
Please provide your full environment, especially the version of TransformerEngine
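For gathering that report, something like the following prints the relevant package versions (importlib.metadata is in the standard library; the package names are the usual PyPI distribution names and may differ in custom builds):

    from importlib.metadata import version, PackageNotFoundError

    # Usual PyPI distribution names; custom containers may differ.
    for pkg in ("torch", "transformers", "transformer-engine", "flash-attn"):
        try:
            print(f"{pkg}: {version(pkg)}")
        except PackageNotFoundError:
            print(f"{pkg}: not installed")

    # Megatron-LM is typically run from a source checkout rather than pip,
    # so report its commit instead, e.g.: git -C Megatron-LM rev-parse HEAD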