
removed duplication of mdp_json_path in compilation command (#706) #779

Open

ochougul wants to merge 2 commits into main from 706_cherry

Conversation

@ochougul (Contributor) commented Feb 5, 2026

Needed for passing a custom config via vLLM.


ochougul and others added 2 commits February 5, 2026 11:01
Needed for passing custom config via vllm.

---------

Signed-off-by: Onkar Chougule <ochougul@qti.qualcomm.com>
Signed-off-by: Mamta Singh <mamtsing@qti.qualcomm.com>
Co-authored-by: Mamta Singh <mamtsing@qti.qualcomm.com>
Signed-off-by: Onkar Chougule <ochougul@qti.qualcomm.com>
)

-if prefill_only is None or not prefill_only:
+if (prefill_only is None or not prefill_only) and prefill_seq_len != 1:
A reviewer (Contributor) commented:

Can we set prefill_only to False as the default value?
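
A minimal sketch of what that suggestion might look like, assuming the check guards whether a decode specialization is added; the function name and parameter values below are invented for illustration and do not match the real QEfficient compile signature.

def build_specializations(prefill_only: bool = False, prefill_seq_len: int = 32):
    # The prefill specialization is always present.
    specializations = [{"seq_len": prefill_seq_len}]
    # With False as the default, the "prefill_only is None" branch disappears.
    if not prefill_only and prefill_seq_len != 1:
        specializations.append({"seq_len": 1})  # decode specialization
    return specializations

print(build_specializations())                   # prefill + decode specializations
print(build_specializations(prefill_only=True))  # prefill specialization only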

if decode_spec:
    specializations.append(decode_spec)

if kw_spec := compiler_options.pop("specializations", None):
A reviewer (Contributor) commented:

nit: This can be simplified to

if "specializations" in compiler_options:
    specializations = compiler_options.pop("specializations")
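
For comparison, a small self-contained sketch of the two forms being discussed; the dictionary contents are illustrative and not taken from QEfficient internals. Note that the walrus form skips falsy values (e.g. an empty list), while the membership check keeps them.

def extract_with_walrus(compiler_options, specializations):
    # Current form: pop with a default and test truthiness in one expression.
    if kw_spec := compiler_options.pop("specializations", None):
        specializations = kw_spec
    return specializations

def extract_with_membership(compiler_options, specializations):
    # Suggested form: explicit membership check, then pop.
    if "specializations" in compiler_options:
        specializations = compiler_options.pop("specializations")
    return specializations

opts = {"specializations": [{"batch_size": 1, "seq_len": 128}]}
print(extract_with_walrus(dict(opts), []))      # [{'batch_size': 1, 'seq_len': 128}]
print(extract_with_membership(dict(opts), []))  # [{'batch_size': 1, 'seq_len': 128}]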

from QEfficient.transformers.quantizers import replace_transformers_quantizers, undo_transformers_quantizers

-model_id = "openai/gpt-oss-120b"  # weights are not required to convert to fp32
+model_id = "openai/gpt-oss-20b"  # weights are not required to convert to fp32
A reviewer (Contributor) commented:

Use a dummy config instead of the full model.
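
A minimal sketch of that suggestion, assuming the test only needs the model architecture rather than pretrained weights; the reduced sizes below are illustrative and not taken from the repository's tiny-config file.

from transformers import AutoConfig, AutoModelForCausalLM

# Download only the (small) config, then shrink it so the test stays fast.
config = AutoConfig.from_pretrained("openai/gpt-oss-20b")
config.num_hidden_layers = 2
config.hidden_size = 64
config.intermediate_size = 128
config.num_attention_heads = 4
config.num_key_value_heads = 2

# from_config builds randomly initialised weights, so the full checkpoint is never fetched.
model = AutoModelForCausalLM.from_config(config)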


def test_disagg_mode_prefill_only_and_decode_only(model_id, prompt):
    # Run prefill for original pytorch model
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    PREFILL_SEQ_LEN = 256
A reviewer (Contributor) commented:

You can define these constants in the dummy config: tests/transformers/models/custom_tiny_model_configs.json
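
A small sketch of how the test could pull its constants from that JSON file instead of hard-coding them; the key names and fallback values below are assumptions, since the actual schema of custom_tiny_model_configs.json may differ.

import json
from pathlib import Path

CONFIG_PATH = Path("tests/transformers/models/custom_tiny_model_configs.json")

def load_test_constants(model_id: str) -> dict:
    with CONFIG_PATH.open() as f:
        tiny_configs = json.load(f)
    # Fall back to defaults if the model has no dedicated entry.
    return tiny_configs.get(model_id, {"prefill_seq_len": 256, "ctx_len": 512})

constants = load_test_constants("openai/gpt-oss-20b")
PREFILL_SEQ_LEN = constants["prefill_seq_len"]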
