load LoRA Adapter #1492
Try using AutoModelForCausalLM.from_pretrained instead.
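For reference, a minimal sketch of one way to read that suggestion: load the base model with AutoModelForCausalLM and attach the adapter with peft's PeftModel.from_pretrained. The model and adapter paths below are placeholders, not taken from this thread.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "path-or-repo-of-the-base-model"       # placeholder
adapter_id = "path-or-repo-of-the-lora-adapter"  # placeholder

# Load the (unquantized) base model, then wrap it with the trained LoRA adapter.
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, adapter_id)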
thank you for your reply @KareemMusleh. I got the following error:
Can you share your adapter_config.json? You must have trained after quantization, right? Open the config and check the "base_model_name_or_path" key, and set it to the base model you want to merge the adapter into. The Unsloth Hugging Face repos also host the unquantized models, so you can take the base model either from there or from the original repo. Can you also try the following:

from peft import AutoPeftModelForCausalLM
import torch

peft_model = AutoPeftModelForCausalLM.from_pretrained(
    "this is the path to the adapter",
    low_cpu_mem_usage=True,
    torch_dtype=torch.bfloat16,
)
merged_model = peft_model.merge_and_unload()

Now merged_model will be your final merged model that you can save and also use.
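As a hedged, self-contained sketch of both suggestions above (checking base_model_name_or_path and saving the merged model), with placeholder paths:

import json
import torch
from peft import AutoPeftModelForCausalLM

adapter_path = "this is the path to the adapter"  # placeholder

# Inspect which base model the adapter was saved against.
with open(f"{adapter_path}/adapter_config.json") as f:
    print(json.load(f)["base_model_name_or_path"])

# Load the adapter on top of that base model, merge the LoRA weights in,
# and write out a standalone model directory.
peft_model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_path,
    low_cpu_mem_usage=True,
    torch_dtype=torch.bfloat16,
)
merged_model = peft_model.merge_and_unload()
merged_model.save_pretrained("merged-model")  # placeholder output directory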
Thank you very much @mosama1994
You will have to merge and do that, I believe. Right now, when you create a LoRA adapter, it is created with random weights. So if you don't merge, you should actually be able to use that for finetuning. I have never tried it, but the peft_model should be further trainable, I believe. Test it out and let me know as well.
However, if you create the PEFT model like this for finetuning, it will not be optimized by Unsloth, because FastLanguageModel.get_peft_model from Unsloth also patches some layers to make them faster.
Hi Hessa, I found out how you can load your LoRA adapter for further tuning. The following is the code:

from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "mosama/Qwen2.5-0.5B-Pretraining-ar-eng-urd-LoRA-Adapters",  # your PEFT model path goes here
    max_seq_length = 2048,
    dtype = torch.bfloat16,
    load_in_4bit = True,
)

# When you run this next step, it will say that you already have a PEFT model and you can go on
# with the rest of your code. The problem is a small bug in the Unsloth code, for which I have
# opened a pull request (linked below). Use the same config that the model was loaded with
# previously, when you trained it.
model = FastLanguageModel.get_peft_model(
    model,
    r = 32,  # Choose any number > 0. Suggested: 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",
                      "embed_tokens", "lm_head",  # you can remove these two; I added them for pretraining a model
                      ],
    lora_alpha = 16,
    lora_dropout = 0,  # Supports any, but = 0 is optimized
    bias = "none",     # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth",  # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = True,    # We support rank stabilized LoRA
    loftq_config = None,  # And LoftQ
)

PULL REQUEST:
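As a small follow-up sketch, continuing from the snippet above with a placeholder output name: once the further training run finishes, the updated adapter can be written out with the usual save_pretrained calls.

# After the continued training run, save the updated LoRA adapter and tokenizer.
# "further-tuned-adapter" is a placeholder directory name.
model.save_pretrained("further-tuned-adapter")
tokenizer.save_pretrained("further-tuned-adapter")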
Thank you @mosama1994. So I can use model (the LoRA adapter) for further training without this step?
You have to do this step, as it properly loads the model with the trained LoRA adapter you gave the path to. Just change the settings to match the LoRA adapter you used in the first training. In the pull request I created, there is a llama.py file in the Unsloth models folder which needs changes. But I believe you are training a vision model, so check after these two steps whether you get any error and let me know. The first part actually loads the PEFT model, while the second step basically checks the config and makes the model ready for training. So you need to run both, the same as the last time you trained, just changing the path in the first part to your LoRA checkpoint. Also, after these two steps, run this:

model = model.to("cuda:0")
This is the PEFT configuration I used:

model = FastVisionModel.get_peft_model(
)
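(The arguments of that call were cut off above. Purely as a hypothetical illustration of what a FastVisionModel.get_peft_model call usually looks like in the Llama 3.2 Vision finetuning tutorial, not the poster's actual settings:)

model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers     = True,   # illustrative values only
    finetune_language_layers   = True,
    finetune_attention_modules = True,
    finetune_mlp_modules       = True,
    r = 16,
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    random_state = 3407,
)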
Yes, then in this case just keep the first step. Then run:

model.print_trainable_parameters()  # just to check that we are indeed training fewer parameters; it prints the count and percentage
model = model.to("cuda:0")
# Then the rest of your script, skipping the get_peft_model step
Hello,
I have finetuned a model using the Unsloth tutorial (Llama 3.2 Vision finetuning - Radiography use case) and then uploaded the LoRA adapter to HF.
Now I want to further train this adapter by loading it:
This gives me the error: