
removed duplication of mdp_json_path in compilation command (#706) #779

Open

ochougul wants to merge 2 commits into main from 706_cherry

Conversation

@ochougul (Contributor) commented Feb 5, 2026

Needed for passing a custom config via vLLM.


ochougul and others added 2 commits February 5, 2026 11:01
Needed for passing custom config via vllm.

---------

Signed-off-by: Onkar Chougule <ochougul@qti.qualcomm.com>
Signed-off-by: Mamta Singh <mamtsing@qti.qualcomm.com>
Co-authored-by: Mamta Singh <mamtsing@qti.qualcomm.com>
Signed-off-by: Onkar Chougule <ochougul@qti.qualcomm.com>
)

-if prefill_only is None or not prefill_only:
+if (prefill_only is None or not prefill_only) and prefill_seq_len != 1:
A reviewer (Contributor) commented:

Can we set prefill_only to False as the default value?
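
A minimal sketch of what that suggestion might look like, assuming the check guards whether a decode specialization is added; the function name and parameter values below are invented for illustration and do not match the real QEfficient compile signature.

def build_specializations(prefill_only: bool = False, prefill_seq_len: int = 32):
    # The prefill specialization is always present.
    specializations = [{"seq_len": prefill_seq_len}]
    # With False as the default, the "prefill_only is None" branch disappears.
    if not prefill_only and prefill_seq_len != 1:
        specializations.append({"seq_len": 1})  # decode specialization
    return specializations

print(build_specializations())                   # prefill + decode specializations
print(build_specializations(prefill_only=True))  # prefill specialization only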

if decode_spec:
    specializations.append(decode_spec)

if kw_spec := compiler_options.pop("specializations", None):
A reviewer (Contributor) commented:

nit: This can be simplified to

if "specializations" in compiler_options:
    specializations = compiler_options.pop("specializations")
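
For comparison, a small self-contained sketch of the two forms being discussed; the dictionary contents are illustrative and not taken from QEfficient internals. Note that the walrus form skips falsy values (e.g. an empty list), while the membership check keeps them.

def extract_with_walrus(compiler_options, specializations):
    # Current form: pop with a default and test truthiness in one expression.
    if kw_spec := compiler_options.pop("specializations", None):
        specializations = kw_spec
    return specializations

def extract_with_membership(compiler_options, specializations):
    # Suggested form: explicit membership check, then pop.
    if "specializations" in compiler_options:
        specializations = compiler_options.pop("specializations")
    return specializations

opts = {"specializations": [{"batch_size": 1, "seq_len": 128}]}
print(extract_with_walrus(dict(opts), []))      # [{'batch_size': 1, 'seq_len': 128}]
print(extract_with_membership(dict(opts), []))  # [{'batch_size': 1, 'seq_len': 128}]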

from QEfficient.transformers.quantizers import replace_transformers_quantizers, undo_transformers_quantizers

-model_id = "openai/gpt-oss-120b"  # weights are not required to convert to fp32
+model_id = "openai/gpt-oss-20b"  # weights are not required to convert to fp32
A reviewer (Contributor) commented:

Use a dummy config instead of the full model.
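
A minimal sketch of that suggestion, assuming the test only needs the model architecture rather than pretrained weights; the reduced sizes below are illustrative and not taken from the repository's tiny-config file.

from transformers import AutoConfig, AutoModelForCausalLM

# Download only the (small) config, then shrink it so the test stays fast.
config = AutoConfig.from_pretrained("openai/gpt-oss-20b")
config.num_hidden_layers = 2
config.hidden_size = 64
config.intermediate_size = 128
config.num_attention_heads = 4
config.num_key_value_heads = 2

# from_config builds randomly initialised weights, so the full checkpoint is never fetched.
model = AutoModelForCausalLM.from_config(config)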


def test_disagg_mode_prefill_only_and_decode_only(model_id, prompt):
    # Run prefill for original pytorch model
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    PREFILL_SEQ_LEN = 256
A reviewer (Contributor) commented:

You can define these constants in the dummy config: tests/transformers/models/custom_tiny_model_configs.json
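
A small sketch of how the test could pull its constants from that JSON file instead of hard-coding them; the key names and fallback values below are assumptions, since the actual schema of custom_tiny_model_configs.json may differ.

import json
from pathlib import Path

CONFIG_PATH = Path("tests/transformers/models/custom_tiny_model_configs.json")

def load_test_constants(model_id: str) -> dict:
    with CONFIG_PATH.open() as f:
        tiny_configs = json.load(f)
    # Fall back to defaults if the model has no dedicated entry.
    return tiny_configs.get(model_id, {"prefill_seq_len": 256, "ctx_len": 512})

constants = load_test_constants("openai/gpt-oss-20b")
PREFILL_SEQ_LEN = constants["prefill_seq_len"]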
