
load LoRA Adapter #1492

Open
hessaAlawwad opened this issue Jan 1, 2025 · 12 comments

Comments

@hessaAlawwad

Hello,

I have fine-tuned a model using the Unsloth tutorial (Llama 3.2 Vision finetuning - Radiography use case) and uploaded the LoRA adapter to the Hugging Face Hub.
Now I want to further train this adapter by loading it:

from transformers import AutoModel
from peft import PeftModel

base_model = AutoModel.from_pretrained("meta-llama/Llama-3.2-11B-Vision-Instruct")
lora_model = PeftModel.from_pretrained(base_model, "Hessa/MMTQA_lora3")
lora_model.eval()

This gives me the error:

ValueError: Unrecognized configuration class <class 'transformers.models.mllama.configuration_mllama.MllamaConfig'> for this kind of AutoModel: AutoModel.
Model type should be one of AlbertConfig, AlignConfig, AltCLIPConfig, ASTConfig, AutoformerConfig, BarkConfig, BartConfig, BeitConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BitConfig, BlenderbotConfig, BlenderbotSmallConfig, BlipConfig, Blip2Config, BloomConfig, BridgeTowerConfig, BrosConfig, CamembertConfig, CanineConfig, ChameleonConfig, ChineseCLIPConfig, ChineseCLIPVisionConfig, ClapConfig, CLIPConfig, CLIPTextConfig, CLIPVisionConfig, CLIPSegConfig, ClvpConfig, LlamaConfig, CodeGenConfig, CohereConfig, ConditionalDetrConfig, ConvBertConfig, ConvNextConfig, ConvNextV2Config, CpmAntConfig, CTRLConfig, CvtConfig, DacConfig, Data2VecAudioConfig, Data2VecTextConfig, Data2VecVisionConfig, DbrxConfig, DebertaConfig, DebertaV2Config, DecisionTransformerConfig, DeformableDetrConfig, DeiTConfig, DetaConfig, DetrConfig, DinatConfig, Dinov2Config, DistilBertConfig, DonutSwinConfig, DPRConfig, DPTConfig, EfficientFormerConfig, EfficientNetConfig, ElectraConfig, EncodecConfig, ErnieConfig, ErnieMConfig, EsmConfig, FalconConfig, FalconMambaConfig, FastSpeech2ConformerConfig, FlaubertConfig, FlavaConfig, FNetConfig, FocalNetConfig, FSMTConfig, FunnelConfig, GemmaConfig, Gemma2Config, GitConfig, GlmConfig, GLPNConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, GPTSanJapaneseConfig, GraniteConfig, GraniteMoeConfig, GraphormerConfig, GroundingDinoConfig, GroupViTConfig, HieraCon...
@KareemMusleh

KareemMusleh commented Jan 1, 2025

Try using AutoModelForCausalLM.from_pretrained instead
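A minimal sketch of that suggestion applied to the original snippet (not verified here; since the checkpoint is a vision model with MllamaConfig, the model-specific class MllamaForConditionalGeneration may be needed if AutoModelForCausalLM also rejects the config):

from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model with a task-specific auto class, then attach the trained adapter.
# If AutoModelForCausalLM also rejects MllamaConfig, try MllamaForConditionalGeneration.
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-11B-Vision-Instruct")
lora_model = PeftModel.from_pretrained(base_model, "Hessa/MMTQA_lora3")
lora_model.eval()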

@hessaAlawwad

Thank you for your reply @KareemMusleh. I got the following error:

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_auth.py:94: UserWarning: 
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
  warnings.warn(
Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs are not used in <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>.
`low_cpu_mem_usage` was None, now default to True since model is quantized.
Loading checkpoint shards:   0%
 0/2 [00:00<?, ?it/s]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-9b45f1afa1ae> in <cell line: 4>()
      2 from transformers import AutoModelForCausalLM
      3 
----> 4 lora_model = AutoModelForCausalLM.from_pretrained("Hessa/MMTQA_lora3", torch_dtype=torch.float16)

4 frames
/usr/local/lib/python3.10/dist-packages/transformers/quantizers/quantizer_bnb_4bit.py in create_quantized_param(self, model, param_value, param_name, target_device, state_dict, unexpected_keys)
    205                 param_name + ".quant_state.bitsandbytes__nf4" not in state_dict
    206             ):
--> 207                 raise ValueError(
    208                     f"Supplied state dict for {param_name} does not contain `bitsandbytes__*` and possibly other `quantized_stats` components."
    209                 )

ValueError: Supplied state dict for model.layers.0.mlp.down_proj.weight does not contain `bitsandbytes__*` and possibly other `quantized_stats` components.

@mosama1994

mosama1994 commented Jan 1, 2025

Can you share your adapter_config.json? You must have trained after quantization, right? Open the config and check the "base_model_name_or_path" key, and set it to the base model you want to attach the adapter to. The Unsloth Hugging Face repos also host the unquantized models, so you can use either those or the original repo for the base model.
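A small sketch of how to check this programmatically (assuming the adapter repo is public or a Hub token is configured):

from huggingface_hub import hf_hub_download
import json

# Download the adapter config and inspect which base model it references.
config_path = hf_hub_download("Hessa/MMTQA_lora3", "adapter_config.json")
with open(config_path) as f:
    adapter_config = json.load(f)
print(adapter_config["base_model_name_or_path"])
# If this points to a quantized repo, change it to the unquantized base model you want.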

Can you also try the following:

from peft import AutoPeftModelForCausalLM
import torch

peft_model = AutoPeftModelForCausalLM.from_pretrained(
    "this is the path to the adapter",
    low_cpu_mem_usage=True,
    torch_dtype=torch.bfloat16
)
merged_model = peft_model.merge_and_unload()

Now merged_model is your final merged model, which you can save and use.
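For completeness, saving the merged model afterwards could look like this (the output path is a placeholder):

# Hypothetical local output directory; adjust to your own path or push to the Hub instead.
merged_model.save_pretrained("merged-llama-3.2-11b-vision")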

@hessaAlawwad

Thank you very much @mosama1994.
I wanted to load the lora_model to train it further, so I did not want to merge it with the base model.
Is that possible?

@mosama1994

You will have to merge to do that, I believe. Actually, if you create a new LoRA adapter now, it will be initialized with random weights, so if you don't merge you should instead be able to use the loaded adapter directly for fine-tuning. I have never tried that, but the peft_model should be further trainable, I believe. Test it out and let me know as well.
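As a side note (not something tried in this thread), plain PEFT can reload a trained adapter in a trainable state without merging via is_trainable=True; a minimal sketch, without Unsloth's optimizations, reusing the paths from the earlier snippets:

from transformers import MllamaForConditionalGeneration
from peft import PeftModel
import torch

# Reload the trained adapter on top of the unquantized base in trainable form.
base_model = MllamaForConditionalGeneration.from_pretrained(
    "meta-llama/Llama-3.2-11B-Vision-Instruct", torch_dtype=torch.bfloat16
)
lora_model = PeftModel.from_pretrained(base_model, "Hessa/MMTQA_lora3", is_trainable=True)
lora_model.print_trainable_parameters()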

@mosama1994

However, if you create the PEFT model like this for fine-tuning, it will not be optimized by Unsloth, because FastLanguageModel.get_peft_model from Unsloth also patches some layers to make them faster.

@mosama1994

mosama1994 commented Jan 6, 2025

Hi Hessa,

I found out how you can load your LoRA adapter for further tuning. The following is the code:

from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "mosama/Qwen2.5-0.5B-Pretraining-ar-eng-urd-LoRA-Adapters", # YOUR PEFT MODEL PATH WILL COME HERE
    max_seq_length = 2048,
    dtype = torch.bfloat16,
    load_in_4bit = True,
)

# When you run this step, it will report that you already have a PEFT model,
# and you can go on with the rest of your code. There is a small bug in the
# Unsloth code for which I have opened a pull request (linked below). You can
# set the same config that the model was loaded with previously when you trained.
model = FastLanguageModel.get_peft_model(
    model,
    r = 32, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj", 
                      "embed_tokens","lm_head" # YOU CAN REMOVE THESE, THESE ARE WHAT I ADDED FOR PRETRAINING A MODEL
                      ],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = True,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

PULL REQUEST:
#1509

@hessaAlawwad

Thank you @mosama1994.

So can I use model (the LoRA adapter) for further training without this step?

model = FastLanguageModel.get_peft_model(
    model,
    r = 32, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj", 
                      "embed_tokens","lm_head" # YOU CAN REMOVE THESE, THESE ARE WHAT I ADDED FOR PRETRAINING A MODEL
                      ],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = True,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

@mosama1994

mosama1994 commented Jan 6, 2025

You have to do this step, as it properly loads the model with the trained LoRA adapter you gave the path to. Just change the settings to match the LoRA adapter from your first training. In the pull request I created, there is a llama.py file in the Unsloth models folder that needs changes, but I believe you are training a vision model, so run both steps, check if you get any error, and let me know.

The first part actually loads the PEFT model, and the second step checks the config and makes the model ready for training. So you need to run both, the same as the last time you trained, just changing the path in the first part to the LoRA checkpoint. Also, after these two steps run this:

model = model.to("cuda:0")

@hessaAlawwad

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-16-f0f542f37d53> in <cell line: 1>()
----> 1 model = FastVisionModel.get_peft_model(
      2     model,
      3     finetune_vision_layers     = True, # False if not finetuning vision layers
      4     finetune_language_layers   = True, # False if not finetuning language layers
      5     finetune_attention_modules = True, # False if not finetuning attention layers

/usr/local/lib/python3.10/dist-packages/unsloth/models/vision.py in get_peft_model(model, r, target_modules, lora_alpha, lora_dropout, bias, finetune_vision_layers, finetune_language_layers, finetune_attention_modules, finetune_mlp_modules, layers_to_transform, layers_pattern, use_gradient_checkpointing, random_state, max_seq_length, use_rslora, modules_to_save, init_lora_weights, loftq_config, temporary_location, **kwargs)
    237 
    238         if isinstance(model, PeftModelForCausalLM):
--> 239             raise RuntimeError("Unsloth: You already added LoRA adapters to your model!")
    240 
    241         if target_modules == "all-linear":

RuntimeError: Unsloth: You already added LoRA adapters to your model!

@hessaAlawwad

This is the PEFT configuration:

model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers     = True, # False if not finetuning vision layers
    finetune_language_layers   = True, # False if not finetuning language layers
    finetune_attention_modules = True, # False if not finetuning attention layers
    finetune_mlp_modules       = True, # False if not finetuning MLP layers

    r = 16,           # The larger, the higher the accuracy, but might overfit
    lora_alpha = 16,  # Recommended alpha == r at least
    lora_dropout = 0,
    bias = "none",
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
    # target_modules = "all-linear", # Optional now! Can specify a list if needed
)

@mosama1994

Yes, in that case just keep the first step. Then run:

model.print_trainable_parameters() # Just to check that we are indeed training fewer parameters; prints the count and percentage.
model = model.to("cuda:0")
# Then run the rest of your script, skipping get_peft_model
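To recap the workflow suggested above (an untested sketch; the path and settings mirror the earlier snippets): load the trained adapter checkpoint directly with FastVisionModel.from_pretrained, skip get_peft_model, and continue with the same training script:

from unsloth import FastVisionModel

# Load the previously trained LoRA checkpoint; Unsloth restores the PEFT model.
model, tokenizer = FastVisionModel.from_pretrained(
    model_name = "Hessa/MMTQA_lora3",  # path to the trained LoRA adapter
    load_in_4bit = True,
)

model.print_trainable_parameters()  # only the adapter weights should be trainable
model = model.to("cuda:0")
# ...then reuse the same trainer setup as in the first run, without calling get_peft_model again.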
