Hugging Face from_pretrained() using merged weights KeyError: 'base_model_name_or_path' #2224

Open · chg0901 opened this issue Jan 2, 2025 · 5 comments
Labels: bug (Something isn't working), triaged (This issue has been assigned an owner and appropriate label)

chg0901 commented Jan 2, 2025

Test code from https://pytorch.org/torchtune/stable/tutorials/e2e_flow.html#use-with-hugging-face-from-pretrained:


from transformers import AutoModelForCausalLM, AutoTokenizer
import transformers

print(transformers.__version__)

#TODO: update it to your chosen epoch
trained_model_path = "models/torchtune/llama3_2_3B/lora_single_device/epoch_1"
# trained_model_path = "/home/cine/Documents/tune/models/Llama-3.2-3B-Instruct"

model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=trained_model_path,
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(trained_model_path, safetensors=True)


# Function to generate text
def generate_text(model, tokenizer, prompt, max_length=50):
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

prompt = "tell me a joke"
print("Base model output:", generate_text(model, tokenizer, prompt))

prompt = "Complete the sentence: 'Once upon a time..."
print("Base model output:", generate_text(model, tokenizer, prompt))

Error:

(base) cine@20211029-a04:~/Documents/tune$ /home/cine/miniconda3/envs/tune/bin/python /home/cine/Documents/tune/gen_from_merged_sft.py
Traceback (most recent call last):
  File "/home/cine/Documents/tune/gen_from_merged_sft.py", line 7, in <module>
    model = AutoModelForCausalLM.from_pretrained(
  File "/home/cine/miniconda3/envs/tune/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 514, in from_pretrained
    pretrained_model_name_or_path = adapter_config["base_model_name_or_path"]
KeyError: 'base_model_name_or_path'
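
The KeyError is raised because transformers finds an adapter_config.json inside the checkpoint directory and then expects it to contain a base_model_name_or_path entry, which the torchtune-written config apparently lacks. A minimal sketch of one possible workaround (not an official fix; it assumes the epoch_1 directory holds the torchtune adapter files, and base_model_path is an assumed local path) is to add the missing key before calling from_pretrained:

import json
import os

trained_model_path = "models/torchtune/llama3_2_3B/lora_single_device/epoch_1"
base_model_path = "/home/cine/Documents/tune/models/Llama-3.2-3B-Instruct"  # assumed local base model

# Add the key transformers expects; it then resolves the base model first
# and applies the adapter on top of it.
cfg_path = os.path.join(trained_model_path, "adapter_config.json")
with open(cfg_path) as f:
    adapter_config = json.load(f)

if "base_model_name_or_path" not in adapter_config:
    adapter_config["base_model_name_or_path"] = base_model_path
    with open(cfg_path, "w") as f:
        json.dump(adapter_config, f, indent=2)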

But I can use PEFT to load the fine-tuned (SFT) model with:

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

#TODO: update it to your chosen epoch
trained_model_path = "models/torchtune/llama3_2_3B/lora_single_device/epoch_1"

# Define the model and adapter paths.
# To avoid the error above, point directly to the local base model.
original_model_name = '/home/cine/Documents/tune/models/Llama-3.2-3B-Instruct'
model = AutoModelForCausalLM.from_pretrained(original_model_name)

# Hugging Face will look for adapter_model.safetensors and adapter_config.json
peft_model = PeftModel.from_pretrained(model, trained_model_path)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(original_model_name)

# Function to generate text
def generate_text(model, tokenizer, prompt, max_length=50):
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

prompt = "tell me a joke: '"
print("Base model output:", generate_text(peft_model, tokenizer, prompt))
joecummings added the triaged and bug labels on Jan 6, 2025
felipemello1 (Contributor) commented Jan 6, 2025

Hugging Face may be prioritizing reading from adapter_config.json instead of reading the model config. Maybe when I tested this, I tried it with full finetuning instead of LoRA.

One sanity check is to remove or move the adapter_model.safetensors and adapter_config.json files and see whether loading then defaults to the full merged model. I am on PTO this week, but I can look into it next week.
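
A rough sketch of that sanity check (the file names come from the adapter-loading path in the traceback above; the backup directory name is arbitrary):

# Move the adapter files out of the checkpoint directory so transformers
# falls back to loading the merged full weights directly.
import os
import shutil

trained_model_path = "models/torchtune/llama3_2_3B/lora_single_device/epoch_1"
backup_dir = os.path.join(trained_model_path, "adapter_backup")  # arbitrary name
os.makedirs(backup_dir, exist_ok=True)

for fname in ("adapter_config.json", "adapter_model.safetensors"):
    src = os.path.join(trained_model_path, fname)
    if os.path.exists(src):
        shutil.move(src, os.path.join(backup_dir, fname))

# Retry the original load:
# model = AutoModelForCausalLM.from_pretrained(trained_model_path)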

Ankur-singh (Contributor)

@chg0901 I'm not able to reproduce the error; for me it seems to work just fine. I might be missing something, so can you please help me reproduce it?
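
One way to provide the details needed for reproduction (a suggestion only; the snippet just reports the environment and the checkpoint directory layout):

# Collect the information that usually matters for this kind of loading issue:
# library versions and which files actually exist in the checkpoint directory.
import os
import transformers
import peft

print("transformers:", transformers.__version__)
print("peft:", peft.__version__)

trained_model_path = "models/torchtune/llama3_2_3B/lora_single_device/epoch_1"
print(sorted(os.listdir(trained_model_path)))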

chg0901 (Author) commented Jan 13, 2025 via email

Ankur-singh (Contributor)

Would it be possible to share a Colab notebook with all the code needed to reproduce the error?

chg0901 (Author) commented Jan 13, 2025 via email
