[Pre-training] - Instruction pre-training validation experiment #95

Open
odashi opened this issue Dec 9, 2024 · 0 comments

Assignees
odashi
Labels
pretrain (Experiment of model pretrain)

Comments

@odashi (Member) commented Dec 9, 2024

Overview

Investigate how the performance of LLM-jp base models changes when instruction pre-training is applied.

Details

Run experiments at 1.8B, 3.7B, 7.2B, and 13B to validate promising settings for the 172B model.

Candidate datasets:

  • Instruct2 dataset

How the data will be introduced

  • Mix instruction data into the LLM-jp corpus at a fixed ratio (a rough sketch of one possible mixing scheme follows this list).
    • Candidate mixing ratios: 10%, 1%, 0.1%
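
The issue does not specify the injection mechanism, so the following is only a minimal sketch of document-level interleaving at a target ratio; the function `mix_corpora` and all of its argument names are hypothetical and not part of the issue or of any LLM-jp tooling.

```python
import random
from typing import Iterable, Iterator, List


def mix_corpora(
    corpus_docs: Iterable[str],
    instruction_docs: List[str],
    ratio: float,
    seed: int = 0,
) -> Iterator[str]:
    """Interleave instruction examples into a pretraining document stream so
    that roughly `ratio` of the emitted documents are instruction data
    (ratio = 0.1, 0.01, 0.001 for the 10% / 1% / 0.1% settings).

    `instruction_docs` must be non-empty; it is recycled if exhausted.
    """
    rng = random.Random(seed)
    instr = iter(instruction_docs)
    for doc in corpus_docs:
        yield doc
        # Emitting one instruction example with probability r / (1 - r)
        # after each corpus document makes the expected instruction share
        # of the mixed stream equal to r: (r/(1-r)) / (1 + r/(1-r)) = r.
        if rng.random() < ratio / (1.0 - ratio):
            try:
                yield next(instr)
            except StopIteration:
                instr = iter(instruction_docs)  # recycle instruction data
                yield next(instr)


# Example: the 1% candidate setting.
mixed = list(mix_corpora(["corpus doc"] * 1000, ["instruct doc"] * 20, ratio=0.01))
```

Interleaving after each corpus document keeps the original corpus order intact; an alternative under the same ratio budget would be to shuffle the union of both sets before sharding.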

Resources

@odashi added the pretrain label Dec 9, 2024
@odashi changed the title from “[Pre-training] -” to “[Pre-training] - Instruction pre-training validation experiment” Dec 9, 2024
@odashi self-assigned this Dec 9, 2024