微调后的模型无法输出特殊token #6345

THUchenzhou · 2024-12-16T06:47:42Z

任务：使用大模型做简单的文本二分类。
使用如下命令添加special_tokens，微调qwen
CUDA_VISIBLE_DEVICES=0,1 llamafactory-cli train examples/train_lora/qwen2_special_token.yaml

model_name_or_path: /data/Qwen/Qwen2.5-0.5B-Instruct

stage: sft
do_train: true
finetuning_type: lora
lora_target: all
deepspeed: examples/deepspeed/ds_z0_config.json
lora_rank: 16
lora_alpha: 32

dataset: special_token
template: qwen
cutoff_len: 4096
overwrite_cache: true
preprocessing_num_workers: 16

output_dir: saves/lora/sft/12_general/qwen25-0.5b/tag_15761_wo_lora_16_lr_500_special_token
logging_steps: 10
save_steps: 6000
plot_loss: true
overwrite_output_dir: true

per_device_train_batch_size: 4
gradient_accumulation_steps: 1
learning_rate: 5.00e-4
num_train_epochs: 1
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
new_special_tokens: "<|LABEL_0|>,<|LABEL_1|>"
resize_vocab: true
additional_target: embed_tokens,lm_head

启动服务CUDA_VISIBLE_DEVICES=0 API_PORT=8000 llamafactory-cli api examples/inference/qwen2.yaml，文件内容如下：

model_name_or_path:/data/Qwen/Qwen2.5-0.5B-Instruct
adapter_name_or_path: saves/lora/sft/12_general/qwen25-0.5b/tag_15761_wo_lora_16_lr_500_special_token
template: qwen
finetuning_type: lora

问题：
启动服务后，请求微调后的模型，无法输出新添加的特殊token。请问是训练参数设置错误，还是启动服务的方式错误？

The text was updated successfully, but these errors were encountered:

hiyouga · 2024-12-17T11:36:07Z

添加参数 skip_special_tokens: true

github-actions bot added the pending This problem is yet to be addressed label Dec 16, 2024

hiyouga added a commit that referenced this issue Dec 17, 2024

support control eos, fix #6345

eda76de

hiyouga mentioned this issue Dec 17, 2024

[infer] support control eos #6363

Merged

2 tasks

hiyouga closed this as completed in #6363 Dec 17, 2024

hiyouga added solved This problem has been already solved and removed pending This problem is yet to be addressed labels Dec 17, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

微调后的模型无法输出特殊token #6345

微调后的模型无法输出特殊token #6345

THUchenzhou commented Dec 16, 2024

hiyouga commented Dec 17, 2024

微调后的模型无法输出特殊token #6345

微调后的模型无法输出特殊token #6345

Comments

THUchenzhou commented Dec 16, 2024

hiyouga commented Dec 17, 2024