ValueError: The following model_kwargs are not used by the model: ['skip_special_tokens'] #6407

Closed
paolovic opened this issue Dec 20, 2024 · 3 comments
Labels: solved (This problem has been already solved)

Reminder

  • I have read the README and searched the existing issues.

System Info

This is my checked out commit:

LLaMA-Factory]$ git log --pretty=format:'%H' -n 1
ffbb4dbdb09ba799af1800c78b2e9d669bccd24b
  • llamafactory version: 0.9.2.dev0
  • Platform: Linux-4.18.0-553.27.1.el8_10.x86_64-x86_64-with-glibc2.28
  • Python version: 3.11.10
  • PyTorch version: 2.5.1+cu124 (GPU)
  • Transformers version: 4.46.1
  • Datasets version: 3.1.0
  • Accelerate version: 1.0.1
  • PEFT version: 0.12.0
  • TRL version: 0.9.6
  • GPU type: NVIDIA L40S-48C
  • Bitsandbytes version: 0.45.0

Reproduction

I have trained a LoRA adapter with the following examples/train_lora/llama3_lora_sft.yaml:

### model
model_name_or_path: /c/models/Llama-3.3-70B-Instruct
quantization_bit: 4
trust_remote_code: true

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all

### dataset
dataset: JB_translate
template: llama3
cutoff_len: 4096
#max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/llama3.3-70b/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

It's the original https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct.

This worked out fine using llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml.

Then I wanted to evaluate the base model with llamafactory-cli train examples/extras/nlg_eval/llama3_lora_predict.yaml, using the following examples/extras/nlg_eval/llama3_lora_predict.yaml:

### model
model_name_or_path: /c/models/Llama-3.3-70B-Instruct
quantization_bit: 4
#adapter_name_or_path: saves/llama3.3-70b/lora/sft
trust_remote_code: true

### method
#stage: sft
do_predict: true
#finetuning_type: lora

### dataset
eval_dataset: JB_translate_test
template: llama3
cutoff_len: 4096
max_samples: 50
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/llama3.3-70b/lora/predict
overwrite_output_dir: true

### eval
per_device_eval_batch_size: 1
predict_with_generate: true
ddp_timeout: 180000000

Unfortunately, this fails with the following message:

Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████| 30/30 [00:22<00:00,  1.34it/s]
[INFO|2024-12-20 13:36:40] llamafactory.model.model_utils.attention:157 >> Using torch SDPA for faster training and inference.
[INFO|2024-12-20 13:36:40] llamafactory.model.loader:157 >> all params: 70,553,706,496
Detected kernel version 4.18.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
[WARNING|2024-12-20 13:36:40] llamafactory.train.sft.workflow:168 >> Batch generation can be very slow. Consider using `scripts/vllm_infer.py` instead.
[INFO|trainer.py:4117] 2024-12-20 13:36:40,670 >>
***** Running Prediction *****
[INFO|trainer.py:4119] 2024-12-20 13:36:40,670 >>   Num examples = 50
[INFO|trainer.py:4122] 2024-12-20 13:36:40,670 >>   Batch size = 1
[rank0]: Traceback (most recent call last):
[rank0]:   File "/c/packages/LLaMA-Factory/src/llamafactory/launcher.py", line 23, in <module>
[rank0]:     launch()
[rank0]:   File "/c/packages/LLaMA-Factory/src/llamafactory/launcher.py", line 19, in launch
[rank0]:     run_exp()
[rank0]:   File "/c/packages/LLaMA-Factory/src/llamafactory/train/tuner.py", line 50, in run_exp
[rank0]:     run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
[rank0]:   File "/c/packages/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 128, in run_sft
[rank0]:     predict_results = trainer.predict(dataset_module["eval_dataset"], metric_key_prefix="predict", **gen_kwargs)
[rank0]:                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/transformers/trainer_seq2seq.py", line 259, in predict
[rank0]:     return super().predict(test_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/transformers/trainer.py", line 4042, in predict
[rank0]:     output = eval_loop(
[rank0]:              ^^^^^^^^^^
[rank0]:   File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/transformers/trainer.py", line 4158, in evaluation_loop
[rank0]:     losses, logits, labels = self.prediction_step(model, inputs, prediction_loss_only, ignore_keys=ignore_keys)
[rank0]:                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/c/packages/LLaMA-Factory/src/llamafactory/train/sft/trainer.py", line 135, in prediction_step
[rank0]:     loss, generated_tokens, _ = super().prediction_step(  # ignore the returned labels (may be truncated)
[rank0]:                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/transformers/trainer_seq2seq.py", line 331, in prediction_step
[rank0]:     generated_tokens = self.model.generate(**generation_inputs, **gen_kwargs)
[rank0]:                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/transformers/generation/utils.py", line 1972, in generate
[rank0]:     self._validate_model_kwargs(model_kwargs.copy())
[rank0]:   File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/transformers/generation/utils.py", line 1360, in _validate_model_kwargs
[rank0]:     raise ValueError(
[rank0]: ValueError: The following `model_kwargs` are not used by the model: ['skip_special_tokens'] (note: typos in the generate arguments will also show up in this list)
[rank1]: (rank 1 prints the same traceback and ValueError)
[rank0]:[W1220 13:36:40.155273393 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator())
W1220 13:36:42.107000 2890998 torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2891016 closing signal SIGTERM
E1220 13:36:42.139000 2890998 torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 0 (pid: 2891015) of binary: /c/environments/llama_factory_uat/bin/python3.11
Traceback (most recent call last):
  File "/c/environments/llama_factory_uat/bin/torchrun", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.$
y", line 355, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/torch/distributed/launcher/api.py", line 269, in launch_ag$
nt
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
/c/packages/LLaMA-Factory/src/llamafactory/launcher.py FAILED
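
For context, a minimal sketch of the mismatch (not LLaMA-Factory's actual code, and using a tiny placeholder model purely for illustration): skip_special_tokens is a tokenizer decoding option, so transformers' generate() rejects it in _validate_model_kwargs; popping it from the generation kwargs and applying it at decode time avoids the ValueError.

# Sketch only: the model id below is a small placeholder, not the 70B model from this issue.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sshleifer/tiny-gpt2"  # placeholder model for the sketch
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

gen_kwargs = {"max_new_tokens": 16, "skip_special_tokens": True}
# Leaving skip_special_tokens in gen_kwargs reproduces the ValueError above,
# so remove it before calling generate() and use it when decoding instead.
skip_special_tokens = gen_kwargs.pop("skip_special_tokens", True)

inputs = tokenizer("Hello", return_tensors="pt")
output_ids = model.generate(**inputs, **gen_kwargs)
print(tokenizer.decode(output_ids[0], skip_special_tokens=skip_special_tokens))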

Expected behavior

No response

Others

No response

github-actions bot added the pending (This problem is yet to be addressed) label on Dec 20, 2024
paolovic (Author) commented:

The same happens when merging the base model with the LoRA adapter via llamafactory-cli train examples/merge_lora/llama3_lora_sft.yaml, using the following examples/merge_lora/llama3_lora_sft.yaml:

### Note: DO NOT use quantized model or quantization_bit when merging lora adapters

### model
model_name_or_path: /c/models/Llama-3.3-70B-Instruct
adapter_name_or_path: saves/llama3.3-70b/lora/sft
template: llama3
finetuning_type: lora
trust_remote_code: true

### export
export_dir: models/llama3_lora_sft
export_size: 2
export_device: cpu
export_legacy_format: false

(llama_factory_uat) [x_mlo-app-uat@srp24245lx LLaMA-Factory]$ llamafactory-cli train examples/merge_lora/llama3_lora_sft.yaml
[INFO|2024-12-20 13:52:01] llamafactory.cli:157 >> Initializing distributed tasks at: 127.0.0.1:25334
W1220 13:52:02.151000 2894600 torch/distributed/run.py:793]
W1220 13:52:02.151000 2894600 torch/distributed/run.py:793] *****************************************
W1220 13:52:02.151000 2894600 torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W1220 13:52:02.151000 2894600 torch/distributed/run.py:793] *****************************************
Traceback (most recent call last):
  File "/c/packages/LLaMA-Factory/src/llamafactory/launcher.py", line 23, in <module>
    launch()
  File "/c/packages/LLaMA-Factory/src/llamafactory/launcher.py", line 19, in launch
    run_exp()
  File "/c/packages/LLaMA-Factory/src/llamafactory/train/tuner.py", line 45, in run_exp
    model_args, data_args, training_args, finetuning_args, generating_args = get_train_args(args)
                                                                             ^^^^^^^^^^^^^^^^^^^^
  File "/c/packages/LLaMA-Factory/src/llamafactory/hparams/parser.py", line 161, in get_train_args
    model_args, data_args, training_args, finetuning_args, generating_args = _parse_train_args(args)
                                                                             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/c/packages/LLaMA-Factory/src/llamafactory/hparams/parser.py", line 147, in _parse_train_args
    return _parse_args(parser, args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/c/packages/LLaMA-Factory/src/llamafactory/hparams/parser.py", line 60, in _parse_args
    return parser.parse_yaml_file(os.path.abspath(sys.argv[1]))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/transformers/hf_argparser.py", line 436, in parse_yaml_fil$
    outputs = self.parse_dict(yaml.safe_load(Path(yaml_file).read_text()), allow_extra_keys=allow_extra_keys)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/transformers/hf_argparser.py", line 387, in parse_dict
    obj = dtype(**inputs)
          ^^^^^^^^^^^^^^^
TypeError: Seq2SeqTrainingArguments.__init__() missing 1 required positional argument: 'output_dir'
(the second process prints the same traceback and TypeError)
E1220 13:52:05.958000 2894600 torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 0 (pid: 2894617) of binary: /c/environments/llama_factory_uat/bin/python3.11
Traceback (most recent call last):
  File "/c/environments/llama_factory_uat/bin/torchrun", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.p
y", line 355, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/c/environments/llama_factory_uat/lib64/python3.11/site-packages/torch/distributed/launcher/api.py", line 269, in launch_age
nt
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
/c/packages/LLaMA-Factory/src/llamafactory/launcher.py FAILED
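
For reference, the LLaMA-Factory examples run this merge config through the export subcommand rather than train, which would also explain the TypeError above (the training parser expects an output_dir for Seq2SeqTrainingArguments). A possible invocation, assuming the same config file:

llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml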

paolovic (Author) commented Dec 20, 2024

FYI: the same happened with an already quantized model, https://huggingface.co/unsloth/Llama-3.3-70B-Instruct-bnb-4bit, which I described in issue #6391.

hiyouga (Owner) commented Dec 21, 2024

Use c6e3c14, not ffbb4db.
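
A possible way to pick up that commit, assuming it is already on the main branch:

cd LLaMA-Factory
git fetch origin
git checkout c6e3c14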

hiyouga closed this as completed on Dec 21, 2024
hiyouga added the solved label and removed the pending label on Dec 21, 2024