When I use the LongAlpaca-12k dataset to supervised fine-tune the LongAlpaca-7B model, the loss is very unstable. My command is:
Miniconda/envs/longlora/bin/python -u supervised-fine-tune.py \
    --model_name_or_path models/LongAlpaca-7B \
    --bf16 True \
    --output_dir LongLoRA/save/LongAlpaca-7B-origdata \
    --model_max_length 32768 \
    --use_flash_attn True \
    --data_path data/LongAlpaca-12k.json \
    --low_rank_training True \
    --num_train_epochs 3 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 2 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy no \
    --save_strategy steps \
    --save_steps 1000 \
    --save_total_limit 2 \
    --learning_rate 2e-5 \
    --weight_decay 0.0 \
    --warmup_steps 20 \
    --lr_scheduler_type constant_with_warmup \
    --logging_steps 1 \
    --deepspeed ds_configs/stage2.json \
    --tf32 True
The loss values look like this:
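Worth noting, though it is not part of the original report: with `--logging_steps 1`, `--per_device_train_batch_size 1`, and `--gradient_accumulation_steps 1`, each logged loss comes from a single 32k-token sequence, so large step-to-step swings are expected. A minimal sketch for smoothing the logged losses to judge the underlying trend, assuming the default Hugging Face Trainer checkpoint layout (the `trainer_state.json` path below is hypothetical):

```python
import json

def smoothed_loss(trainer_state_path, window=50):
    """Moving average over the per-step losses logged by the HF Trainer."""
    with open(trainer_state_path) as f:
        history = json.load(f)["log_history"]
    losses = [entry["loss"] for entry in history if "loss" in entry]
    smoothed = []
    for i in range(len(losses)):
        chunk = losses[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

if __name__ == "__main__":
    # Hypothetical path; point this at an actual saved checkpoint.
    path = "LongLoRA/save/LongAlpaca-7B-origdata/checkpoint-1000/trainer_state.json"
    for step, value in enumerate(smoothed_loss(path), start=1):
        print(step, round(value, 4))
```

If the moving average trends downward while individual steps jump around, the run is likely fine and only the per-step readout is noisy.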
I tried to train Llama-2-7b-longlora-100k-ft with my own dataset, which is sampled from your LongAlpaca-12k.json data, but the loss looks the same.
python supervised-fine-tune.py \
    --model_name_or_path /models/Llama-2-7b-longlora-100k-ft \
    --bf16 True \
    --output_dir LongLoRA/save/7b-100k-ft-origdata-mydata \
    --model_max_length 100000 \
    --use_flash_attn True \
    --data_path LongLoRA/pdf2txt/output/manual_data.json \
    --low_rank_training True \
    --num_train_epochs 5 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 98 \
    --save_total_limit 2 \
    --learning_rate 2e-5 \
    --weight_decay 0.0 \
    --warmup_steps 20 \
    --lr_scheduler_type "constant_with_warmup" \
    --logging_steps 1 \
    --deepspeed "ds_configs/stage2.json" \
    --tf32 True
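A side note, not from the thread: the number of sequences averaged into each optimizer update is `per_device_train_batch_size * gradient_accumulation_steps * world_size`, so this command already averages 8 sequences per step versus 1 in the first command. A sketch of the arithmetic (single-GPU world size assumed):

```python
def effective_batch_size(per_device_batch, grad_accum_steps, world_size=1):
    """Sequences averaged into each optimizer update."""
    return per_device_batch * grad_accum_steps * world_size

# First command: 1 * 1 * 1 = 1 sequence per update -> very noisy per-step loss.
print(effective_batch_size(1, 1))
# Second command: 1 * 8 * 1 = 8 sequences per update -> smoother, but still small.
print(effective_batch_size(1, 8))
```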