
Onboarding Qwen3VL Dense #780

Draft
qcdipankar wants to merge 11 commits into quic:main from qcdipankar:qwen3_vl

Conversation

@qcdipankar (Contributor)

Adding Qwen3VL Support to QEff
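
A hedged usage sketch of what the new support might look like from the user side (the checkpoint name and arguments below are illustrative assumptions, not taken from this PR):

```python
# Hypothetical example: loading a Qwen3VL checkpoint through QEff's
# image-text-to-text auto class. The model ID and options are assumptions.
from QEfficient import QEFFAutoModelForImageTextToText

model = QEFFAutoModelForImageTextToText.from_pretrained(
    "Qwen/Qwen3-VL-8B-Instruct",  # hypothetical model ID
    kv_offload=True,              # KV offload is a common QEff option for VLMs
)
model.compile(num_cores=16)  # compile for the target device
```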

 requires-python = ">=3.8,<3.11"
 dependencies = [
-    "transformers==4.55.0",
+    "transformers==4.57.0",
Contributor

@quic-rishinr / @quic-hemagnih : can we trigger TA?

Contributor

Yes, we should raise it and start the run of all the models with 4.57 in parallel; it typically takes one week.

attn_weights = torch.where(
    attention_mask, torch.tensor(MIN_MASKED_ATTENTION_VALUE, dtype=torch.float32), attn_weights
)

attn_weights = nn.functional.softmax(attn_weights, dim=-1, dtype=torch.float32).to(query.dtype)
Contributor

Can you set this to the dtype passed from from_pretrained()?
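
A minimal sketch of one way to honor that request, assuming the dtype requested via from_pretrained(..., torch_dtype=...) is available on the module's config (the helper name and the config attribute are assumptions, not this PR's code):

```python
import torch
import torch.nn as nn

def masked_softmax(attn_weights, attention_mask, query, config, min_masked_value):
    # Assumption: config.torch_dtype carries the dtype requested at load time;
    # fall back to float32 when none was given.
    softmax_dtype = getattr(config, "torch_dtype", None) or torch.float32
    # Mask out positions exactly as in the quoted hunk; min_masked_value stands
    # in for MIN_MASKED_ATTENTION_VALUE.
    attn_weights = torch.where(
        attention_mask, torch.tensor(min_masked_value, dtype=torch.float32), attn_weights
    )
    return nn.functional.softmax(attn_weights, dim=-1, dtype=softmax_dtype).to(query.dtype)
```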

@quic-hemagnih (Contributor) left a comment

I am still reviewing the modelling file.


messages = [messages] * batch_size

inputs = processor.apply_chat_template(
Contributor

I think we can combine the code from lines 62 to 77 and lines 122 to 140 in one place. The idea is to avoid code repetition; one possible shape is sketched below.
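
A hedged sketch of the suggested factoring, assuming both call sites replicate one conversation across the batch and tokenize it the same way (the helper name and keyword arguments are illustrative, not taken from the PR):

```python
def build_batched_inputs(processor, messages, batch_size):
    """Replicate a single conversation across the batch and tokenize it."""
    batched_messages = [messages] * batch_size
    return processor.apply_chat_template(
        batched_messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    )
```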

Contributor Author

This we can discuss.


qcdipankar and others added 7 commits on February 16, 2026 at 13:12
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
@qcdipankar marked this pull request as draft on February 19, 2026 at 09:16
QEffMptForCausalLM,
QEffPhi3ForCausalLM,
QEffQwen2ForCausalLM,
QEffQwen_2_5_vl_DecoderWrapper,
Contributor

Could you add QEffQwen3VLDecoderWrapper here under SamplerTransform? The on-device sampling is generic, so it can support new VLMs. Thank you.

If not, we can also raise it in a new patch. @quic-sanising
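
A hedged sketch of the requested registration, assuming SamplerTransform tracks its supported decoder classes in a class-level set as the quoted import list suggests (the attribute name and shape are assumptions; the class names come from the quoted hunk):

```python
class SamplerTransform:
    # Assumption: supported decoders are tracked in a class-level set.
    _module_mapping = {
        QEffMptForCausalLM,
        QEffPhi3ForCausalLM,
        QEffQwen2ForCausalLM,
        QEffQwen_2_5_vl_DecoderWrapper,
        QEffQwen3VLDecoderWrapper,  # new: enables on-device sampling for Qwen3VL
    }
```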

