
During training, InternVL3.5 encountered an error, and the error message mentions Qwen. #9070

@2655324127

Description


Reminder

  • I have read the above rules and searched the existing issues.

System Info

  • llamafactory version: 0.9.4.dev0
  • Platform: Linux-6.6.87.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
  • Python version: 3.12.9
  • PyTorch version: 2.8.0+cu128 (GPU)
  • Transformers version: 4.52.4
  • Datasets version: 3.4.1
  • Accelerate version: 1.7.0
  • PEFT version: 0.15.2
  • TRL version: 0.9.6
  • GPU type: NVIDIA GeForce RTX 5060 Ti
  • GPU number: 1
  • GPU memory: 15.93GB
  • DeepSpeed version: 0.17.2
  • Bitsandbytes version: 0.46.0
  • Git commit: 4ba7de0
  • Default data directory: detected

Reproduction


[INFO|processing_utils.py:928] 2025-09-03 15:36:34,353 >> loading configuration file InternVL3_5-4B/processor_config.json
Traceback (most recent call last):
File "/home/zysoft/Leo/LLaMA-Factory-main/src/llamafactory/model/loader.py", line 102, in load_tokenizer
processor = AutoProcessor.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/envs/llama-factory/lib/python3.12/site-packages/transformers/models/auto/processing_auto.py", line 376, in from_pretrained
return processor_class.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/envs/llama-factory/lib/python3.12/site-packages/transformers/processing_utils.py", line 1187, in from_pretrained
return cls.from_args_and_dict(args, processor_dict, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/envs/llama-factory/lib/python3.12/site-packages/transformers/processing_utils.py", line 982, in from_args_and_dict
processor = cls(*args, **processor_dict)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/envs/llama-factory/lib/python3.12/site-packages/transformers/models/internvl/processing_internvl.py", line 95, in __init__
self.start_image_token = tokenizer.start_image_token
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/anaconda3/envs/llama-factory/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 1111, in __getattr__
raise AttributeError(f"{self.__class__.__name__} has no attribute {key}")
AttributeError: Qwen2TokenizerFast has no attribute start_image_token

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/root/anaconda3/envs/llama-factory/bin/llamafactory-cli", line 8, in <module>
sys.exit(main())
^^^^^^
File "/home/zysoft/Leo/LLaMA-Factory-main/src/llamafactory/cli.py", line 151, in main
COMMAND_MAP[command]()
File "/home/zysoft/Leo/LLaMA-Factory-main/src/llamafactory/train/tuner.py", line 110, in run_exp
_training_function(config={"args": args, "callbacks": callbacks})
File "/home/zysoft/Leo/LLaMA-Factory-main/src/llamafactory/train/tuner.py", line 72, in _training_function
run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/home/zysoft/Leo/LLaMA-Factory-main/src/llamafactory/train/sft/workflow.py", line 48, in run_sft
tokenizer_module = load_tokenizer(model_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zysoft/Leo/LLaMA-Factory-main/src/llamafactory/model/loader.py", line 114, in load_tokenizer
raise OSError("Failed to load processor.") from e
OSError: Failed to load processor.
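
For context on the traceback: the `AttributeError` is raised inside the tokenizer's `__getattr__`, which in recent transformers versions resolves attributes like `start_image_token` from the `extra_special_tokens` entry of the model's `tokenizer_config.json`. Below is a minimal standalone sketch of that lookup as a config check; the exact set of expected keys and the token strings (`<img>`, `</img>`) are illustrative assumptions, not copied from the InternVL3.5 repo.

```python
# Sketch of why the processor fails: the InternVL processor reads image-token
# attributes off the tokenizer, and those attributes are populated from the
# "extra_special_tokens" block of tokenizer_config.json. Key names beyond
# start_image_token and the token strings below are assumptions.

def check_image_tokens(tokenizer_config: dict) -> list[str]:
    """Return the image-token keys the processor would need but which are
    missing from the given tokenizer config dict."""
    expected = ["start_image_token", "end_image_token"]
    extra = tokenizer_config.get("extra_special_tokens", {})
    return [key for key in expected if key not in extra]

# A config without the block -- the situation that produces
# "Qwen2TokenizerFast has no attribute start_image_token":
broken = {"tokenizer_class": "Qwen2Tokenizer"}
print(check_image_tokens(broken))  # ['start_image_token', 'end_image_token']

# A config carrying the block (token strings are illustrative):
fixed = {
    "tokenizer_class": "Qwen2Tokenizer",
    "extra_special_tokens": {
        "start_image_token": "<img>",
        "end_image_token": "</img>",
    },
}
print(check_image_tokens(fixed))  # []
```

If the downloaded checkpoint's `tokenizer_config.json` lacks this block, re-downloading the model files or upgrading transformers (so its InternVL processor matches the checkpoint's configs) are the usual remedies.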

Others

  1. I have checked: the model I downloaded is OpenGVLab/InternVL3_5-4B from Hugging Face, not from ModelScope, and I matched the template against the one in the README; it is intern_vl.
  2. The config file is as follows:

model_name_or_path: InternVL3_5-4B
image_max_pixels: 1053696
image_min_pixels: 200704
# min_pixels = 256 * 28 * 28   # 200,704
# max_pixels = 1280 * 28 * 28  # 1,003,520
# 2116800
trust_remote_code: true
flash_attn: fa2
enable_liger_kernel: true
use_unsloth_gc: false
use_unsloth: false
seed: 64
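
The pixel budgets in the config above are all exact multiples of a 28x28 patch area; the patch size is an assumption inferred from the `256 * 28 * 28` style notes in the config, and the labels are just the config keys being checked:

```python
# Verify that the pixel limits quoted in the config are whole multiples of
# a 28x28 patch area (784 pixels per patch) -- patch size is an assumption.
PATCH_AREA = 28 * 28

for label, pixels in [
    ("image_min_pixels", 200_704),    # 256 patches
    ("image_max_pixels", 1_053_696),  # 1344 patches
    ("max_pixels note", 1_003_520),   # 1280 patches
]:
    patches, remainder = divmod(pixels, PATCH_AREA)
    print(f"{label}: {pixels} = {patches} x {PATCH_AREA} (remainder {remainder})")
```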

method

stage: sft
do_train: true
finetuning_type: lora
lora_rank: 64
lora_target: all
freeze_vision_tower: false

dataset

dataset: 0902_planner_dataset_1,0902_planner_dataset_2
# template: qwen2_vl
template: intern_vl
cutoff_len: 12288
max_samples: 1000000
overwrite_cache: true
preprocessing_num_workers: 4
dataloader_num_workers: 4
media_dir: data/0902_planner_dataset

output

output_dir: saves/0902_planner_dataset_1
logging_steps: 1
save_steps: 500
save_total_limit: 0
save_strategy: steps
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: swanlab # choices: [none, wandb, tensorboard, swanlab, mlflow]

train

per_device_train_batch_size: 2
gradient_accumulation_steps: 2
learning_rate: 5.0e-5
num_train_epochs: 6.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
resume_from_checkpoint: null

    Labels

    duplicate: This issue or pull request already exists
