After training, when saving the model, I got the following error. It seems that the relative path is not constructed correctly.
100%███████████████████████████████████████████| 1/1 [00:05<00:00, 5.78s/it]
[INFO|trainer.py:3801] 2024-12-19 20:23:29,329 >> Saving model checkpoint to ../../models/wildfeedback-december/phi-wildfeedback-gpt4o-sft/checkpoint-1
Traceback (most recent call last):
File "/app/src/llamafactory/launcher.py", line 23, in <module>
launch()
File "/app/src/llamafactory/launcher.py", line 19, in launch
run_exp()
File "/app/src/llamafactory/train/tuner.py", line 50, in run_exp
run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/app/src/llamafactory/train/sft/workflow.py", line 101, in run_sft
train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
File "/opt/conda/envs/ptca/lib/python3.10/site-packages/transformers/trainer.py", line 2122, in train
return inner_training_loop(
File "/opt/conda/envs/ptca/lib/python3.10/site-packages/transformers/trainer.py", line 2541, in _inner_training_loop
self._maybe_log_save_evaluate(tr_loss, grad_norm, model, trial, epoch, ignore_keys_for_eval)
File "/opt/conda/envs/ptca/lib/python3.10/site-packages/transformers/trainer.py", line 3000, in _maybe_log_save_evaluate
self._save_checkpoint(model, trial, metrics=metrics)
File "/opt/conda/envs/ptca/lib/python3.10/site-packages/transformers/trainer.py", line 3090, in _save_checkpoint
self.save_model(output_dir, _internal_call=True)
File "/opt/conda/envs/ptca/lib/python3.10/site-packages/transformers/trainer.py", line 3706, in save_model
self._save(output_dir, state_dict=state_dict)
File "/opt/conda/envs/ptca/lib/python3.10/site-packages/transformers/trainer.py", line 3823, in _save
self.model.save_pretrained(
File "/opt/conda/envs/ptca/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2809, in save_pretrained
custom_object_save(self, save_directory, config=self.config)
File "/opt/conda/envs/ptca/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 623, in custom_object_save
for needed_file in get_relative_import_files(object_file):
File "/opt/conda/envs/ptca/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 128, in get_relative_import_files
new_imports.extend(get_relative_imports(f))
File "/opt/conda/envs/ptca/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 97, in get_relative_imports
with open(module_file, "r", encoding="utf-8") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/opt/conda/envs/ptca/lib/python3.10/site-packages/transformers/models/phi3/..generation.py'
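The `..generation.py` component in the path above looks like what a regex-based scan for relative imports would produce if it met an import with more than one leading dot. The following is a minimal sketch of that idea, not the library's actual code: the two regexes are an approximation of what `get_relative_imports` might do, and the sample source line `from ...generation import GenerationMixin`, the file name `modeling_phi3.py`, and both helper names are assumptions made for illustration.

```python
import re
from pathlib import PurePosixPath

# Hypothetical source text: assumed to resemble imports in the Phi-3 modeling file.
SOURCE = (
    "from ...generation import GenerationMixin\n"
    "from .configuration_phi3 import Phi3Config\n"
)


def find_relative_imports(content: str) -> list[str]:
    """Collect module names from `import .x` and `from .x import y` lines.

    Only the first leading dot is consumed by the pattern, so any extra
    dots end up inside the captured module name.
    """
    found = re.findall(r"^\s*import\s+\.(\S+)\s*$", content, flags=re.MULTILINE)
    found += re.findall(r"^\s*from\s+\.(\S+)\s+import", content, flags=re.MULTILINE)
    return sorted(set(found))


def to_candidate_files(module_file: str, names: list[str]) -> list[str]:
    """Turn captured names into sibling `.py` paths next to `module_file`."""
    parent = PurePosixPath(module_file).parent
    return [f"{parent / name}.py" for name in names]


if __name__ == "__main__":
    names = find_relative_imports(SOURCE)
    # Hypothetical install location, shortened for readability.
    for path in to_candidate_files("/site-packages/transformers/models/phi3/modeling_phi3.py", names):
        print(path)
```

If this reading is right, the problem would not be the relative `output_dir` but a scan meant for single-dot imports in custom model files being applied to a file that imports from a parent package.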
Expected behavior
No response
Others
I used the same yaml config to train llama and qwen but received no error. I can also save the model using the following code, which supposedly uses the same save_pretrained method.
    from transformers import AutoModel, AutoTokenizer

    # Path to your model
    model_path = "../../models/Phi-3-mini-4k-instruct"  # Update with your path
    output_path = "../../models/wildfeedback-december/phi-wildfeedback-gpt4o-sft"  # Directory to save the model

    try:
        # Load the model and tokenizer
        model = AutoModel.from_pretrained(model_path)
        tokenizer = AutoTokenizer.from_pretrained(model_path)
        print("Model and tokenizer loaded successfully.")

        # Save the model and tokenizer
        model.save_pretrained(output_path)
        tokenizer.save_pretrained(output_path)
        print(f"Model and tokenizer saved successfully to {output_path}.")
    except Exception as e:
        print(f"An error occurred: {e}")
Could anyone please help? Thank you!
I tried using absolute paths for both model_name_or_path and output_dir, but it did not help. The same script trains the LLaMA and Qwen models (also with relative paths) without any problems. The issue seems to be that Phi-3 ships a custom Python file for its model configuration (e.g., configuration_phi3.py), which triggers the get_relative_imports function in the Transformers library and results in this error.
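To compare checkpoints on this point, one option is to check whether each model directory's `config.json` declares custom code through an `auto_map` entry. This is only a diagnostic sketch: it assumes `auto_map` is what distinguishes the Phi-3 checkpoint from the LLaMA and Qwen ones, and the helper name `custom_code_entries` is made up for illustration.

```python
import json
from pathlib import Path


def custom_code_entries(model_dir: str) -> dict:
    """Return the `auto_map` section of a checkpoint's config.json, or {} if absent."""
    config_file = Path(model_dir) / "config.json"
    with open(config_file, "r", encoding="utf-8") as f:
        config = json.load(f)
    # A missing or null `auto_map` is treated as "no custom code declared".
    return config.get("auto_map") or {}


if __name__ == "__main__":
    # Path taken from the snippet earlier in this issue; adjust as needed.
    entries = custom_code_entries("../../models/Phi-3-mini-4k-instruct")
    print(entries if entries else "no auto_map entry found")
```

Running it against the Phi-3 directory and one of the working checkpoints would show whether only the former maps auto classes to local Python files.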
Do you have any other suggestions for this issue? Thanks!
Reminder
System Info
llamafactory version: 0.9.2.dev0
Reproduction
command:
phi3.yaml: