RuntimeError: element 0 of tensors.. OpenCLIP model #2200
Comments
Indeed, for …

Thanks for your reply and your work on this library. Below are the results I am seeing for different inputs to the `target_modules` arg:

- main branch, `"attn"`: …
- your branch, `"attn"`: Trains! (Thank you)

Q1) So I understand the consequences for my purpose, could I please clarify a few things? Q2) What's the main difference between …

To your questions: …
System Info
peft = 0.13.2
python = 3.12.7
transformers = 4.45.2
Who can help?
@sayakpaul
I am using `inject_adapter_model(...)` to finetune a model from OpenCLIP with LoRA layers. I am able to finetune the model by modifying `Linear()` layers and other supported types as expected. However, the model I am currently training has an attention module with an `out_proj` layer of type `NonDynamicallyQuantizableLinear(Linear)`. I may be mistaken, but from my understanding of the source code for `NonDynamicallyQuantizableLinear` (https://github.com/pytorch/pytorch/blob/main/torch/nn/modules/linear.py#L136), I should be able to treat it as just a typical `torch.nn.Linear` layer for my purposes. However, I always get the following error: `RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn`.

The LoRA layers are added as expected, both when I target the layer via `target_modules` and when I use `register_custom_modules` with the mapping `torch.nn.modules.linear.NonDynamicallyQuantizableLinear` -> `peft.tuners.lora.layer.Linear`. However, neither case trains. Furthermore, the model does train when I include any other layer, e.g. a fully-connected one of type `torch.nn.Linear`.

target_modules =

Any idea why this may be the case? Your help would be truly appreciated.
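For reference, a minimal sketch of the two injection routes described above. The registration entry point is an assumption on my part: recent PEFT versions expose an experimental `LoraConfig._register_custom_module`, which may or may not match the `register_custom_modules` helper referred to here, and the rank/alpha values are placeholders.

```python
import torch.nn as nn
from peft import LoraConfig
from peft.tuners.lora.layer import Linear as LoraLinear

# Route 1: match the layer by name, as with any nn.Linear.
config = LoraConfig(r=8, lora_alpha=16, target_modules=["out_proj"])

# Route 2 (experimental; exact API assumed): dispatch the custom layer
# class to PEFT's LoRA Linear wrapper.
config._register_custom_module(
    {nn.modules.linear.NonDynamicallyQuantizableLinear: LoraLinear}
)
```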
Information

Tasks

An officially supported task in the examples folder

Reproduction
Train step:
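The train-step code itself did not survive extraction. Below is a minimal, self-contained sketch of a generic step (the tiny `nn.Linear` stands in for the LoRA-injected CLIP model; all names and shapes are placeholders), showing where the reported RuntimeError surfaces:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)            # stand-in for the LoRA-injected model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()
batch = torch.randn(8, 4)
labels = torch.randint(0, 2, (8,))

optimizer.zero_grad()
output = model(batch)              # forward pass
loss = loss_fn(output, labels)
loss.backward()                    # "element 0 of tensors does not require grad"
                                   # is raised here when no parameter in the
                                   # graph requires grad
optimizer.step()
```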
Model structure near a layer of interest:
('transformer.resblocks.11.attn', <class 'torch.nn.modules.activation.MultiheadAttention'>)
('transformer.resblocks.11.attn.out_proj', <class 'torch.nn.modules.linear.NonDynamicallyQuantizableLinear'>)
('transformer.resblocks.11.ls_1', <class 'torch.nn.modules.linear.Identity'>)
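For reference, a listing like the above can be produced with a loop over `named_modules`. The model choice is an assumption; the report does not say which OpenCLIP config was used.

```python
import open_clip

# Assumed model config and weights, for illustration only.
model, _, _ = open_clip.create_model_and_transforms("ViT-B-32", pretrained="openai")

# Print each submodule's qualified name and concrete class to locate
# candidate LoRA targets such as attn.out_proj.
for name, module in model.named_modules():
    print((name, type(module)))
```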
Injection code:
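The injection snippet also did not survive. A minimal sketch of what it plausibly looked like, reusing `model` from above and PEFT's low-level `inject_adapter_in_model` API (the report refers to it as `inject_adapter_model`); the rank and alpha values are placeholders, not the OP's settings:

```python
from peft import LoraConfig, inject_adapter_in_model

lora_config = LoraConfig(
    r=8,                           # placeholder rank
    lora_alpha=16,                 # placeholder scaling
    target_modules=["out_proj"],   # the NonDynamicallyQuantizableLinear projection
)
model = inject_adapter_in_model(lora_config, model)

# Inspect which parameters remain trainable after injection.
print([n for n, p in model.named_parameters() if p.requires_grad][:5])
```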
Expected behavior
I would expect it to begin training. Here are the first few printouts of a typical run: