
[Multimodal WG] - Mamba VLM model training #81

Open
daichi1207 opened this issue Nov 17, 2024 · 0 comments
daichi1207 commented Nov 17, 2024

Overview

We distill an existing MLLM (LLaVA-phi), used as the teacher model, into an SSM (Mamba)-based VLM.
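The issue does not spell out the distillation objective, so the following is only a minimal sketch of a common logit-level distillation loss (temperature-scaled KL divergence between teacher and student next-token distributions); the function and variable names are illustrative, not taken from the experiment code.

```python
# Minimal logit-distillation sketch (hypothetical names, PyTorch).
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    """Temperature-scaled KL divergence between teacher and student distributions."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # "batchmean" matches the mathematical definition of KL divergence;
    # the t*t factor keeps gradients comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)
```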

Details

  • Teacher model

    | Component      | LLaVA-phi (3B)                    |
    | -------------- | --------------------------------- |
    | Vision Encoder | openai/clip-vit-large-patch14-336 |
    | Projector      | mlp2x_gelu                        |
    | Language Model | Phi-3-mini-128k-instruct          |

  • Student model (see the model-construction sketch after this list)
    • The Vision Encoder is taken unchanged from the teacher model
    • The Projector uses the same architecture and is initialized with the teacher model's parameters
    • The Language Model replaces the Transformer blocks of the parent model with Mamba2
    • The training data is the same data used by the parent model
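The list above fixes the student architecture only at a high level; the sketch below shows one way the components could be assembled in PyTorch, assuming the `transformers` CLIP implementation and the `Mamba2` block from the `mamba-ssm` package. Hidden sizes, layer count, and vocabulary size are taken from the public Phi-3-mini and CLIP ViT-L/14-336 configurations; the actual experiment code may differ.

```python
import torch.nn as nn
from transformers import CLIPVisionModel
from mamba_ssm import Mamba2  # requires the mamba-ssm package (CUDA kernels for forward)

class StudentVLM(nn.Module):
    """Sketch of the student: teacher's vision encoder and projector, Mamba2 LM."""

    def __init__(self, vision_hidden=1024, lm_hidden=3072,
                 num_layers=32, vocab_size=32064):
        super().__init__()
        # Vision encoder: reused unchanged from the teacher (frozen here).
        self.vision_encoder = CLIPVisionModel.from_pretrained(
            "openai/clip-vit-large-patch14-336")
        self.vision_encoder.requires_grad_(False)
        # Projector: same mlp2x_gelu architecture as the teacher; in the actual
        # run its weights would be copied from the teacher checkpoint.
        self.projector = nn.Sequential(
            nn.Linear(vision_hidden, lm_hidden),
            nn.GELU(),
            nn.Linear(lm_hidden, lm_hidden),
        )
        # Language model: Mamba2 blocks in place of the Phi-3 Transformer layers.
        self.embed_tokens = nn.Embedding(vocab_size, lm_hidden)
        self.layers = nn.ModuleList(
            Mamba2(d_model=lm_hidden) for _ in range(num_layers)
        )
        self.norm = nn.LayerNorm(lm_hidden)
        self.lm_head = nn.Linear(lm_hidden, vocab_size, bias=False)
```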

Resources

  • Compute
    • Cluster: mdx (llm-jp-nvlink)
    • Node type: gpu (A100x8)
    • Number of nodes: 2
  • Code
  • Input data (see the inspection sketch after this list):
    • llm-jp-nvlink:/model/dyashima/phimamba/playground/data/llava_v1_5_mix665k.json
  • Output data:
    • Destination: llm-jp-nvlink:/model/experiments/0081_phimamba
    • Breakdown:
      • checkpoints: 0.15 TB (including buffer capacity)
  • W&B logs:
  • Start date: 2024-11-18
  • Planned end date: 2024-12-31 (including a buffer period)
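As a quick sanity check on the input data, the snippet below assumes llava_v1_5_mix665k.json follows the standard LLaVA conversation format (a list of records with an "id", an optional "image" path, and a "conversations" list of {"from", "value"} turns); the local path is illustrative.

```python
import json

# Path shown relative to the playground directory listed above (assumption).
with open("playground/data/llava_v1_5_mix665k.json") as f:
    records = json.load(f)

print(len(records), "records")
print(sorted(records[0].keys()))
```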
daichi1207 added and then removed the pretrain (Experiment of model pretrain) label on Nov 17, 2024
daichi1207 self-assigned this on Nov 20, 2024