Learning rate does not decay dynamically during fine-tuning #6347

QuanhuiGuan · 2024-12-16T08:19:05Z

Reminder

I have read the README and searched the existing issues.

System Info

python 3.9
llamafactory 0.9.1

Reproduction

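My YAML file is below. During LoRA fine-tuning, the learning rate is never updated dynamically. Could anyone offer some pointers? lr_scheduler_type: cosine is set, but the learning rate still does not adjust during training. I am also using bf16. Any guidance would be appreciated.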

model_name_or_path: /mnt/SSD_12TB/model_gallery/Qwen2.5-32B-Instruct

stage: sft
do_train: true
finetuning_type: lora
lora_target: all
deepspeed: /mnt/SSD_12TB/ethan/data/LLaMA-Factory/examples/deepspeed/ds_z3_config_Ethan.json

dataset: guanqi_train
template: qwen
overwrite_cache: true
preprocessing_num_workers: 16

output_dir: /mnt/SSD_12TB/ethan/data/LLaMA-Factory/model_save/Qwen2_32B_guangqi_1216_dova_1200
logging_steps: 1
plot_loss: true
overwrite_output_dir: true

use_dora: true
lora_rank: 8
per_device_train_batch_size: 4
gradient_accumulation_steps: 4
learning_rate: 4.0e-4
num_train_epochs: 5.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
max_length: 4096
save_strategy: epoch

val_size: 0.05
per_device_eval_batch_size: 4
eval_strategy: steps
eval_steps: 5
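
For comparison with the logged values, here is a minimal sketch of the curve that learning_rate: 4.0e-4, lr_scheduler_type: cosine and warmup_ratio: 0.1 would be expected to produce: linear warmup followed by cosine decay to zero. The function name and the total_steps value are assumptions for illustration only; the real step count depends on the dataset size, batch size, and number of GPUs, none of which are stated in this issue.

```python
import math


def cosine_lr_with_warmup(step, total_steps, base_lr=4.0e-4, warmup_ratio=0.1):
    """Expected learning rate at `step`: linear warmup, then cosine decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# total_steps=1000 is a made-up value, not taken from the issue.
for step in (0, 50, 100, 550, 1000):
    print(step, cosine_lr_with_warmup(step, total_steps=1000))
```

With logging_steps: 1, the logged learning rate should rise during the first 10% of steps and fall afterwards. A value that stays flat after warmup suggests that a different scheduler is in effect.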

Expected behavior

No response

Others

No response

hiyouga · 2024-12-17T03:32:52Z

Check the DeepSpeed config.
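
The reply does not say what to look for. One possible cause, offered here as an assumption: when a DeepSpeed config file defines its own "scheduler" section, the Hugging Face Trainer integration uses that scheduler instead of the one selected by lr_scheduler_type. DeepSpeed's WarmupLR, for example, holds the learning rate constant once warmup ends. The sketch below lists the top-level sections of a config that can take over learning-rate control. The helper names and the example dictionary are hypothetical; the contents of ds_z3_config_Ethan.json are not shown in this issue.

```python
import json


def find_lr_overrides(ds_config):
    """Return the top-level DeepSpeed sections that can control the learning rate."""
    return {key: ds_config[key] for key in ("optimizer", "scheduler") if key in ds_config}


def load_and_check(path):
    """Load a DeepSpeed JSON config from `path` and report its LR-related sections."""
    with open(path, encoding="utf-8") as f:
        return find_lr_overrides(json.load(f))


# Hypothetical config contents, for illustration only.
example = {
    "zero_optimization": {"stage": 3},
    "scheduler": {"type": "WarmupLR", "params": {"warmup_num_steps": 100}},
}
print(find_lr_overrides(example))
```

If the result contains a "scheduler" entry, removing that section from the JSON file should let the cosine schedule from the YAML file take effect, assuming this is indeed the cause.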

github-actions bot added the pending This problem is yet to be addressed label Dec 16, 2024

hiyouga closed this as completed Dec 17, 2024

hiyouga added solved This problem has been already solved and removed pending This problem is yet to be addressed labels Dec 17, 2024
