Fixing the issue of CCL support during the decoding phase of Disaggregated Serving #776

Open
vjanfaza wants to merge 4 commits into quic:main from vjanfaza:CCL-main-v1.21


vjanfaza (Contributor) commented Feb 5, 2026

This PR addresses a compilation error that occurs when CCL is enabled while generating the decode QPC for the gpt-oss model in Disaggregated Serving. For example, with the following command:
python3 -m qaic_disagg \
    --prefill-port 9802 \
    --decode-port 9902 \
    --port 8002 \
    --decode-device-group 16,17,18,19 \
    --prefill-device-group 20,21,22,23 \
    --model openai/gpt-oss-20b \
    --prefill-max-num-seqs 1 \
    --decode-max-num-seqs 1 \
    --prefill-max-seq-len-to-capture 128 \
    --max-model-len 4096 \
    --prefill-override-qaic-config "split_retained_state_io:True mxfp6_matmul:True enable_chunking:True" \
    --decode-override-qaic-config "mxfp6_matmul:True retain_full_kv:True ccl_enabled=True comp_ctx_lengths_decode=1024,2048,4096" \
    -vvv \
    --dtype bfloat16 \
    --kv-cache-dtype mxint8 \
    --kv-handOff-port 5068 \
    --tool-call-parser openai \
    --enable-auto-tool-choice \
    --enable-log-outputs

This command enables CCL during decoding, but compilation fails with "Error message: No input that uniquely identifies specialization". The error stems from recent changes to modeling_gpt_oss.py that added disaggregated serving support for gpt-oss; those changes conflict with the CCL feature.
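One possible reading of the error above, offered as an assumption rather than a confirmed diagnosis: the decode QPC is compiled with one specialization per CCL bucket (1024, 2048, 4096), and the compiler needs at least one graph input whose shape differs between those specializations. If no input carries the CCL dimension, the buckets look identical from the outside. The sketch below is a toy model of that situation. Every name in it (`ambiguous_specializations`, the input names, the symbolic dimension names) is hypothetical and is not taken from the compiler or from modeling_gpt_oss.py.

```python
from typing import Dict, List, Tuple

# Hypothetical types for illustration only; this is a toy model of the
# failure mode, not the compiler's actual uniqueness check.
Specialization = Dict[str, int]            # symbolic dim name -> concrete value
InputShapes = Dict[str, Tuple[str, ...]]   # input name -> symbolic dim names


def ambiguous_specializations(
    specs: List[Specialization], inputs: InputShapes
) -> List[Tuple[int, int]]:
    """Return index pairs of specializations that no input shape can tell apart."""
    # Only symbols that appear in some input shape are observable at runtime.
    visible = sorted({dim for shape in inputs.values() for dim in shape})
    seen: Dict[Tuple[int, ...], int] = {}
    clashes: List[Tuple[int, int]] = []
    for idx, spec in enumerate(specs):
        key = tuple(spec.get(dim, -1) for dim in visible)
        if key in seen:
            clashes.append((seen[key], idx))
        else:
            seen[key] = idx
    return clashes


# One decode specialization per CCL bucket from the command above.
decode_specs = [
    {"batch_size": 1, "seq_len": 1, "ctx_len": 4096, "comp_ctx_lengths": ccl}
    for ccl in (1024, 2048, 4096)
]

# Assumed graph whose inputs do not carry the CCL dimension.
inputs_without_ccl = {
    "input_ids": ("batch_size", "seq_len"),
    "past_key": ("batch_size", "ctx_len"),
}
# Same graph with an extra (hypothetical) input shaped by the CCL dimension.
inputs_with_ccl = {**inputs_without_ccl, "comp_ctx_lengths": ("comp_ctx_lengths",)}

print(ambiguous_specializations(decode_specs, inputs_without_ccl))
print(ambiguous_specializations(decode_specs, inputs_with_ccl))
```

In this toy model the first call reports clashes between all three buckets, while the second reports none, because the extra input makes each bucket distinguishable. Whether the actual fix in this PR works this way is not stated in the description.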

quic-rishinr (Contributor) commented

@ochougul Can you please review this?

quic-rishinr force-pushed the CCL-main-v1.21 branch 3 times, most recently from 819910f to 431c4c5, on February 13, 2026 04:04
…gated Serving and also adding the CCL support during Prefilling process

Signed-off-by: Vahid Janfaza <vjanfaza@qti.qualcomm.com>
…gated Serving and also adding the CCL support during Prefilling process

Signed-off-by: Vahid Janfaza <vjanfaza@qti.qualcomm.com>
…gated Serving and also adding the CCL support during Prefilling process

Signed-off-by: Vahid Janfaza <vjanfaza@qti.qualcomm.com>
…gated Serving

Signed-off-by: Vahid Janfaza <vjanfaza@qti.qualcomm.com>