
Does any MLLM support neat packing? #6343

Closed
1 task done
xiaosu-zhu opened this issue Dec 16, 2024 · 0 comments · Fixed by #6362
Labels
solved This problem has been already solved

Comments

@xiaosu-zhu

Reminder

  • I have read the README and searched the existing issues.

System Info

llamafactory version: 0.9.2.dev0
Platform: Linux-5.4.0-202-generic-x86_64-with-glibc2.31
Python version: 3.10.15
PyTorch version: 2.1.0+cu121 (GPU)
Transformers version: 4.46.1
Datasets version: 3.1.0
Accelerate version: 1.0.1
PEFT version: 0.12.0
TRL version: 0.9.6
GPU type: NVIDIA A800-SXM4-80GB
DeepSpeed version: 0.15.4

Reproduction

I have noticed that, if neat_packing is turned on, the script checks whether the model is supported in model/model_utils/packing.py:

def configure_packing(
    config: "PretrainedConfig", model_args: "ModelArguments", is_trainable: bool
) -> None:
    if not is_trainable or not model_args.block_diag_attn:
        return

    model_type = getattr(config, "model_type", None)
    ############# HERE ################
    if model_type in SUPPORTED_CLASS_FOR_BLOCK_DIAG_ATTN:
        _patch_for_block_diag_attn(model_type)
        logger.info_rank0(
            "Using block diagonal attention for sequence packing without cross-attention."
        )
    else:
        raise ValueError("Current model does not support block diagonal attention.")

Here, SUPPORTED_CLASS_FOR_BLOCK_DIAG_ATTN contains only LLMs and no MLLMs:

SUPPORTED_CLASS_FOR_BLOCK_DIAG_ATTN = {
    "cohere",
    "falcon",
    "gemma",
    "gemma2",
    "llama",
    "mistral",
    "phi",
    "phi3",
    "qwen2",
    "starcoder2",
}
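
For example, if I understand correctly, a multimodal checkpoint such as Qwen2-VL reports a model_type that is not in this set, so configure_packing falls into the else branch and raises (a minimal sketch of the check above; the set is abbreviated):

SUPPORTED_CLASS_FOR_BLOCK_DIAG_ATTN = {"llama", "mistral", "qwen2"}  # abbreviated from above

model_type = "qwen2_vl"  # the model_type that Qwen2-VL's config reports in Transformers
if model_type not in SUPPORTED_CLASS_FOR_BLOCK_DIAG_ATTN:
    raise ValueError("Current model does not support block diagonal attention.")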

I wonder whether neat_packing already supports multi-modal inputs. If not, is it possible to add support?
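
If it helps, one possible direction (just a sketch, not how LLaMA-Factory actually implements it): many multimodal configs in Transformers wrap their language backbone in a text_config attribute (e.g. LlavaConfig), so the check could resolve the backbone's model_type first. The helper below is hypothetical:

from typing import Optional

from transformers import PretrainedConfig


def resolve_text_model_type(config: PretrainedConfig) -> Optional[str]:
    """Hypothetical helper, not part of LLaMA-Factory: for MLLM configs that
    expose a language backbone via `text_config`, return the backbone's
    model_type so the existing SUPPORTED_CLASS_FOR_BLOCK_DIAG_ATTN check
    can apply; otherwise fall back to the top-level model_type."""
    text_config = getattr(config, "text_config", None)
    if text_config is not None:
        return getattr(text_config, "model_type", None)
    return getattr(config, "model_type", None)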

Expected behavior

No response

Others

No response

@github-actions github-actions bot added the pending This problem is yet to be addressed label Dec 16, 2024
@xiaosu-zhu xiaosu-zhu changed the title from "Does MLLMs support neat packing?" to "Does any MLLM support neat packing?" Dec 16, 2024
hiyouga added a commit that referenced this issue Dec 17, 2024
@hiyouga hiyouga added solved This problem has been already solved and removed pending This problem is yet to be addressed labels Dec 17, 2024