Description
Reminder
- I have read the above rules and searched the existing issues.
System Info
- `llamafactory` version: 0.9.4.dev0
- Platform: Linux-5.4.0-139-generic-x86_64-with-glibc2.31
- Python version: 3.10.18
- PyTorch version: 2.6.0+cu124 (GPU)
- Transformers version: 4.51.3
- Datasets version: 3.6.0
- Accelerate version: 1.7.0
- PEFT version: 0.17.0
- TRL version: 0.9.6
- GPU type: NVIDIA A800-SXM4-80GB
- GPU number: 8
- GPU memory: 79.32GB
- DeepSpeed version: 0.16.9
- vLLM version: 0.8.2
- Default data directory: detected
Reproduction
Training script:
### model
model_name_or_path: /raid/zhanghang02/weights/pretrained/Qwen2.5-VL-7B-Instruct
image_max_pixels: 262144
video_max_pixels: 262144
video_fps: 2
video_maxlen: 128
trust_remote_code: true
deepspeed: examples/deepspeed/ds_z2_config.json
### method
stage: sft
do_train: true
finetuning_type: oft
freeze_vision_tower: true
freeze_multi_modal_projector: true
freeze_language_model: false
flash_attn: fa2
oft_block_size: 128
oft_target: all
### dataset
dataset: caption_image16_train_250825
template: qwen2_vl
cutoff_len: 40960
max_samples: 1000000
# max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 32
dataloader_num_workers: 8
dataloader_pin_memory: true
### output
output_dir: /raid/zhanghang02/weights/checkpiont/caption/caption_qwen25vl_p512_image_250825_oft
logging_steps: 1
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
### train
per_device_train_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 1.0e-4
num_train_epochs: 8.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
resume_from_checkpoint: null
## eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500
Export script:
### model
model_name_or_path: /raid/zhanghang02/weights/pretrained/Qwen2.5-VL-7B-Instruct
adapter_name_or_path: /raid/zhanghang02/weights/checkpiont/caption/caption_qwen25vl_p512_image_250825_oft/epoch1
trust_remote_code: true
### method
finetuning_type: oft
### dataset
template: qwen2_vl
### output
export_dir: /raid/zhanghang02/weights/export/caption/caption_qwen25vl_p512_image_250825_oft/caption_qwen25vl_p512_image_250825_oft_epoch1
export_size: 5
export_device: cpu
export_legacy_format: False
The error is as follows:
(qwen25vl_oft) [~/factory/llamafactory]: DISABLE_VERSION_CHECK=1 CUDA_VISIBLE_DEVICES=0 llamafactory-cli export \
> task/export/caption/caption_qwen25vl_p512_image_250825_oft/caption_qwen25vl_p512_image_250825_oft_epoch7.yaml
[2025-09-04 16:13:37,390] [INFO] [real_accelerator.py:254:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Warning: The cache directory for DeepSpeed Triton autotune, /home/zhanghang02/.triton/autotune, appears to be on an NFS system. While this is generally acceptable, if you experience slowdowns or hanging when DeepSpeed exits, it is recommended to set the TRITON_CACHE_DIR environment variable to a non-NFS path.
[WARNING|2025-09-04 16:13:39] llamafactory.extras.misc:154 >> Version checking has been disabled, may lead to unexpected behaviors.
/home/zhanghang02/anaconda3/envs/qwen25vl_oft/lib/python3.10/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
INFO 09-04 16:13:40 [__init__.py:239] Automatically detected platform cuda.
Traceback (most recent call last):
File "/home/zhanghang02/anaconda3/envs/qwen25vl_oft/bin/llamafactory-cli", line 7, in <module>
sys.exit(main())
File "/home/zhanghang02/anaconda3/envs/qwen25vl_oft/lib/python3.10/site-packages/llamafactory/cli.py", line 151, in main
COMMAND_MAP[command]()
File "/home/zhanghang02/anaconda3/envs/qwen25vl_oft/lib/python3.10/site-packages/llamafactory/train/tuner.py", line 114, in export_mode
l
model_args, data_args, finetuning_args, _ = get_infer_args(args)
File "/home/zhanghang02/anaconda3/envs/qwen25vl_oft/lib/python3.10/site-packages/llamafactory/hparams/parser.py", line 446, in get_infe
r_args
_verify_model_args(model_args, data_args, finetuning_args)
File "/home/zhanghang02/anaconda3/envs/qwen25vl_oft/lib/python3.10/site-packages/llamafactory/hparams/parser.py", line 112, in _verify_
model_args
raise ValueError("Adapter is only valid for the LoRA method.")
ValueError: Adapter is only valid for the LoRA method.
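As a possible workaround on my side (untested assumption, not the official export path), I considered merging the OFT adapter directly with PEFT instead of going through `llamafactory-cli export`. The sketch below uses the paths from the configs above; whether PEFT's `merge_and_unload()` fully supports these OFT layers on Qwen2.5-VL is an assumption:

```python
# Sketch of a manual merge that bypasses `llamafactory-cli export`.
# Assumption: PEFT (0.17.0) can load and merge the OFT adapter saved by LLaMA-Factory.
import torch
from peft import PeftModel
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

BASE = "/raid/zhanghang02/weights/pretrained/Qwen2.5-VL-7B-Instruct"
ADAPTER = "/raid/zhanghang02/weights/checkpiont/caption/caption_qwen25vl_p512_image_250825_oft/epoch1"
EXPORT_DIR = "/raid/zhanghang02/weights/export/caption/caption_qwen25vl_p512_image_250825_oft/caption_qwen25vl_p512_image_250825_oft_epoch1"

# Load the base model on CPU in bf16, attach the OFT adapter, merge it into the
# base weights, then save the standalone model together with the processor.
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="cpu", trust_remote_code=True
)
model = PeftModel.from_pretrained(model, ADAPTER)
model = model.merge_and_unload()
model.save_pretrained(EXPORT_DIR, max_shard_size="5GB")

processor = AutoProcessor.from_pretrained(BASE, trust_remote_code=True)
processor.save_pretrained(EXPORT_DIR)
```

Here `max_shard_size="5GB"` and `device_map="cpu"` are meant to mirror `export_size: 5` and `export_device: cpu` from the export config; the open question is still whether `llamafactory-cli export` should accept `adapter_name_or_path` when `finetuning_type: oft`.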
Others
No response