
feat: Add HybridEP support for MoE expert parallelism #1942

Open

seonjinn wants to merge 4 commits into main from sj/hybridep-support

Conversation

@seonjinn
Contributor

@seonjinn seonjinn commented Feb 13, 2026

  • Update the DeepEP dependency to the hybrid-ep branch for HybridEP support
    • The automodel, vllm, and mcore dependency groups are updated
  • Add HybridEP configuration options in _apply_moe_config():
    • moe_flex_dispatcher_backend: backend for the flex token dispatcher (e.g., 'hybridep')
    • moe_hybridep_num_sms: number of SMs used for HybridEP operations

Usage in config:
policy.megatron_cfg.moe_token_dispatcher_type=flex
policy.megatron_cfg.moe_flex_dispatcher_backend=hybridep
policy.megatron_cfg.moe_hybridep_num_sms=32

See: https://github.com/deepseek-ai/DeepEP/tree/hybrid-ep
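
For reference, a minimal, self-contained sketch of the conditional pass-through pattern described above. ModelCfg, apply_moe_overrides, and the sample dict are illustrative stand-ins, not the actual Megatron objects or the real _apply_moe_config signature:

from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ModelCfg:
    # Illustrative stand-in for the Megatron model config fields touched here.
    moe_token_dispatcher_type: str = "alltoall"
    moe_flex_dispatcher_backend: Optional[str] = None
    moe_hybridep_num_sms: Optional[int] = None


def apply_moe_overrides(model_cfg: ModelCfg, megatron_cfg: dict[str, Any]) -> None:
    # Assign only the keys that are present, so configs that omit the new
    # HybridEP options leave the model defaults untouched.
    for key in (
        "moe_token_dispatcher_type",
        "moe_flex_dispatcher_backend",
        "moe_hybridep_num_sms",
    ):
        if key in megatron_cfg:
            setattr(model_cfg, key, megatron_cfg[key])


cfg = ModelCfg()
apply_moe_overrides(
    cfg,
    {
        "moe_token_dispatcher_type": "flex",
        "moe_flex_dispatcher_backend": "hybridep",
        "moe_hybridep_num_sms": 32,
    },
)
assert cfg.moe_token_dispatcher_type == "flex"
assert cfg.moe_hybridep_num_sms == 32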

What does this PR do?

Add a one line overview of what this PR aims to accomplish.

Issues

List issues that this PR closes (syntax):

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Summary by CodeRabbit

  • New Features

    • Added configuration support for additional Mixture of Experts (MoE) model parameters, including dispatcher backend and HybridEP settings.
  • Dependencies

    • Updated DeepEP dependency reference to use the hybrid-ep branch across multiple dependency groups.

@seonjinn seonjinn requested review from a team as code owners February 13, 2026 19:37
@coderabbitai
Contributor

coderabbitai bot commented Feb 13, 2026

📝 Walkthrough

The changes add two conditional configuration hooks to the MoE setup function for optional dispatcher and HybridEP parameters, and update the DeepEP dependency reference from a specific commit to the hybrid-ep branch across multiple dependency groups in the project configuration.

Changes

Cohort / File(s) and Summary:

MoE Configuration Hooks (nemo_rl/models/megatron/setup.py)
  Adds conditional assignments for the moe_flex_dispatcher_backend and moe_hybridep_num_sms configuration parameters in the _apply_moe_config function.

Dependency Updates (pyproject.toml)
  Updates the DeepEP git dependency reference from commit bfded34800dfec415b71503f8205181de90b2480 to the hybrid-ep branch across the automodel, vllm, and mcore dependency groups, with explanatory comments.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

🚥 Pre-merge checks | ✅ 3 passed | ❌ 2 failed

❌ Failed checks (2 warnings)

Merge Conflict Detection: ⚠️ Warning
  ❌ Merge conflicts detected (238 files):

⚔️ .gitmodules (content)
⚔️ 3rdparty/Megatron-Bridge-workspace/Megatron-Bridge (content)
⚔️ 3rdparty/Megatron-Bridge-workspace/setup.py (content)
⚔️ 3rdparty/Megatron-LM-workspace/Megatron-LM (content)
⚔️ 3rdparty/Megatron-LM-workspace/setup.py (content)
⚔️ CODING_GUIDELINES.md (content)
⚔️ README.md (content)
⚔️ docker/Dockerfile (content)
⚔️ docs/about/algorithms/dapo.md (content)
⚔️ docs/about/algorithms/grpo.md (content)
⚔️ docs/about/algorithms/on-policy-distillation.md (content)
⚔️ docs/about/installation.md (content)
⚔️ docs/about/performance-summary.md (content)
⚔️ docs/about/quick-start.md (content)
⚔️ docs/cluster.md (content)
⚔️ docs/conf.py (content)
⚔️ docs/design-docs/dependency-management.md (content)
⚔️ docs/design-docs/fsdp2-parallel-plan.md (content)
⚔️ docs/docker.md (content)
⚔️ docs/guides/dapo.md (content)
⚔️ docs/guides/dpo.md (content)
⚔️ docs/guides/dtensor-tp-accuracy.md (content)
⚔️ docs/guides/environments.md (content)
⚔️ docs/guides/ft-launcher-guide.md (content)
⚔️ docs/guides/grpo-deepscaler.md (content)
⚔️ docs/guides/grpo.md (content)
⚔️ docs/guides/rm.md (content)
⚔️ docs/guides/sft.md (content)
⚔️ docs/guides/use-custom-vllm.md (content)
⚔️ docs/index.md (content)
⚔️ docs/local-workstation.md (content)
⚔️ docs/nsys-profiling.md (content)
⚔️ docs/testing.md (content)
⚔️ docs/versions1.json (content)
⚔️ examples/configs/distillation_math.yaml (content)
⚔️ examples/configs/distillation_math_megatron.yaml (content)
⚔️ examples/configs/dpo.yaml (content)
⚔️ examples/configs/grpo_math_1B.yaml (content)
⚔️ examples/configs/grpo_math_1B_megatron.yaml (content)
⚔️ examples/configs/grpo_math_70B_megatron.yaml (content)
⚔️ examples/configs/recipes/llm/dpo-llama3.1-8b-tulu3-1n8g-fsdp2tp1.yaml (content)
⚔️ examples/configs/recipes/llm/grpo-gemma3-27b-it-8n8g-fsdp2tp8-actckpt-long.yaml (content)
⚔️ examples/configs/recipes/llm/grpo-llama3.1-8b-instruct-2n8g-megatron-fp8-e2e.yaml (content)
⚔️ examples/configs/recipes/llm/grpo-moonlight-16ba3b-4n8g-megatron-fp8-e2e.yaml (content)
⚔️ examples/configs/recipes/llm/grpo-qwen2.5-32b-32n8g-fsdp2tp8-actckpt-long.v3.yaml (content)
⚔️ examples/configs/recipes/llm/grpo-qwen2.5-32b-32n8g-fsdp2tp8-actckpt.v3.yaml (content)
⚔️ examples/configs/recipes/llm/grpo-qwen2.5-7b-instruct-4n8g-fsdp2tp4.v3.yaml (content)
⚔️ examples/configs/recipes/llm/performance/grpo-llama3.1-8b-instruct-2n8g-fp8-async-1off.yaml (content)
⚔️ examples/configs/recipes/llm/sft-nanov3-30BA3B-2n8g-fsdp2-lora.yaml (content)
⚔️ examples/configs/rm.yaml (content)
⚔️ examples/configs/sft.yaml (content)
⚔️ examples/configs/sft_openmathinstruct2.yaml (content)
⚔️ examples/configs/sft_openmathinstruct2_megatron.yaml (content)
⚔️ examples/configs/vlm_grpo_3B.yaml (content)
⚔️ examples/configs/vlm_grpo_3B_megatron.yaml (content)
⚔️ examples/nemo_gym/grpo_workplace_assistant_nemotron_nano_v2_9b.yaml (content)
⚔️ examples/nemo_gym/run_grpo_nemo_gym.py (content)
⚔️ examples/run_dpo.py (content)
⚔️ examples/run_grpo.py (content)
⚔️ examples/run_grpo_sliding_puzzle.py (content)
⚔️ examples/run_rm.py (content)
⚔️ examples/run_sft.py (content)
⚔️ examples/run_vlm_grpo.py (content)
⚔️ nemo_rl/algorithms/distillation.py (content)
⚔️ nemo_rl/algorithms/dpo.py (content)
⚔️ nemo_rl/algorithms/grpo.py (content)
⚔️ nemo_rl/algorithms/loss_functions.py (content)
⚔️ nemo_rl/algorithms/reward_functions.py (content)
⚔️ nemo_rl/algorithms/rm.py (content)
⚔️ nemo_rl/algorithms/sft.py (content)
⚔️ nemo_rl/data/__init__.py (content)
⚔️ nemo_rl/data/collate_fn.py (content)
⚔️ nemo_rl/data/datasets/preference_datasets/__init__.py (content)
⚔️ nemo_rl/data/datasets/preference_datasets/binary_preference_dataset.py (content)
⚔️ nemo_rl/data/datasets/preference_datasets/helpsteer3.py (content)
⚔️ nemo_rl/data/datasets/preference_datasets/preference_dataset.py (content)
⚔️ nemo_rl/data/datasets/preference_datasets/tulu3.py (content)
⚔️ nemo_rl/data/datasets/processed_dataset.py (content)
⚔️ nemo_rl/data/datasets/raw_dataset.py (content)
⚔️ nemo_rl/data/datasets/response_datasets/__init__.py (content)
⚔️ nemo_rl/data/datasets/response_datasets/oai_format_dataset.py (content)
⚔️ nemo_rl/data/datasets/response_datasets/response_dataset.py (content)
⚔️ nemo_rl/data/datasets/utils.py (content)
⚔️ nemo_rl/data/interfaces.py (content)
⚔️ nemo_rl/data/processors.py (content)
⚔️ nemo_rl/data/utils.py (content)
⚔️ nemo_rl/environments/nemo_gym.py (content)
⚔️ nemo_rl/environments/utils.py (content)
⚔️ nemo_rl/experience/rollouts.py (content)
⚔️ nemo_rl/models/automodel/data.py (content)
⚔️ nemo_rl/models/generation/__init__.py (content)
⚔️ nemo_rl/models/generation/interfaces.py (content)
⚔️ nemo_rl/models/generation/vllm/utils.py (content)
⚔️ nemo_rl/models/generation/vllm/vllm_generation.py (content)
⚔️ nemo_rl/models/generation/vllm/vllm_worker.py (content)
⚔️ nemo_rl/models/generation/vllm/vllm_worker_async.py (content)
⚔️ nemo_rl/models/megatron/config.py (content)
⚔️ nemo_rl/models/megatron/data.py (content)
⚔️ nemo_rl/models/megatron/setup.py (content)
⚔️ nemo_rl/models/policy/lm_policy.py (content)
⚔️ nemo_rl/models/policy/workers/dtensor_policy_worker.py (content)
⚔️ nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py (content)
⚔️ nemo_rl/models/policy/workers/megatron_policy_worker.py (content)
⚔️ nemo_rl/utils/checkpoint.py (content)
⚔️ nemo_rl/utils/config.py (content)
⚔️ nemo_rl/utils/logger.py (content)
⚔️ nemo_rl/utils/native_checkpoint.py (content)
⚔️ nemo_rl/utils/prefetch_venvs.py (content)
⚔️ pyproject.toml (content)
⚔️ pyrefly.toml (content)
⚔️ research/template_project/configs/grpo_math_1B.yaml (content)
⚔️ research/template_project/single_update.py (content)
⚔️ tests/check_metrics.py (content)
⚔️ tests/functional/L1_Functional_Tests_GPU.sh (content)
⚔️ tests/functional/distillation.sh (content)
⚔️ tests/functional/distillation_megatron.sh (content)
⚔️ tests/functional/dpo_megatron.sh (content)
⚔️ tests/functional/grpo.sh (content)
⚔️ tests/functional/grpo_async.sh (content)
⚔️ tests/functional/grpo_automodel_lora.sh (content)
⚔️ tests/functional/grpo_automodel_lora_async.sh (content)
⚔️ tests/functional/grpo_automodel_lora_non_colocated.sh (content)
⚔️ tests/functional/grpo_frozen_env.sh (content)
⚔️ tests/functional/grpo_megatron.sh (content)
⚔️ tests/functional/grpo_megatron_generation.sh (content)
⚔️ tests/functional/grpo_non_colocated.sh (content)
⚔️ tests/functional/grpo_rm_env.sh (content)
⚔️ tests/functional/grpo_sglang.sh (content)
⚔️ tests/functional/test_converter_roundtrip.py (content)
⚔️ tests/test_suites/llm/dapo-qwen2.5-7b-16n4g-fsdp2cp2.sh (content)
⚔️ tests/test_suites/llm/dapo-qwen2.5-7b.sh (content)
⚔️ tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n4g-fsdp2tp1.v1.sh (content)
⚔️ tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n4g-megatron-tp1pp2cp2-pack.sh (content)
⚔️ tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-fsdp2tp1.v1.sh (content)
⚔️ tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-megatron-tp2pp2cp2-pack.sh (content)
⚔️ tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh (content)
⚔️ tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n4g-fsdp2tp1-long.v1.sh (content)
⚔️ tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh (content)
⚔️ tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh (content)
⚔️ tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh (content)
⚔️ tests/test_suites/llm/grpo-dapomath17k-dsv3-32n4g-megatron.sh (content)
⚔️ tests/test_suites/llm/grpo-dapomath17k-dsv3-megatron.sh (content)
⚔️ tests/test_suites/llm/grpo-deepscaler-1.5b-16K.sh (content)
⚔️ tests/test_suites/llm/grpo-deepscaler-1.5b-1n4g-8K.sh (content)
⚔️ tests/test_suites/llm/grpo-deepscaler-1.5b-24K.sh (content)
⚔️ tests/test_suites/llm/grpo-deepscaler-1.5b-8K.sh (content)
⚔️ tests/test_suites/llm/grpo-gemma3-1b-it-1n4g-fsdp2tp1.sh (content)
⚔️ tests/test_suites/llm/grpo-gemma3-1b-it-1n8g-fsdp2tp1.sh (content)
⚔️ tests/test_suites/llm/grpo-gemma3-27b-it-8n4g-fsdp2tp4-actckpt-long.sh (content)
⚔️ tests/test_suites/llm/grpo-gemma3-27b-it-8n8g-fsdp2tp8-actckpt-long.sh (content)
⚔️ tests/test_suites/llm/grpo-gptoss-20b-8n4g-megatron.sh (content)
⚔️ tests/test_suites/llm/grpo-gptoss-20b-8n8g-megatron.sh (content)
⚔️ tests/test_suites/llm/grpo-gspo-deepscaler-1.5b-8K.sh (content)
⚔️ tests/test_suites/llm/grpo-llama3.1-8b-instruct-1n8g-megatron-fp8-rollouts.v3.sh (content)
⚔️ tests/test_suites/llm/grpo-llama3.1-8b-instruct-2n4g-fsdp2tp1-noncolocated.sh (content)
⚔️ tests/test_suites/llm/grpo-llama3.1-8b-instruct-2n8g-fsdp2tp1-noncolocated.sh (content)
⚔️ tests/test_suites/llm/grpo-llama3.1-8b-instruct-2n8g-megatron-fp8-e2e.sh (content)
⚔️ tests/test_suites/llm/grpo-llama3.1-8b-instruct-4n4g-fsdp2tp1-long.v3.sh (content)
⚔️ tests/test_suites/llm/grpo-llama3.1-8b-instruct-4n8g-fsdp2tp1-long.v3.sh (content)
⚔️ tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n4g-fsdp2tp1.v3.sh (content)
⚔️ tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n4g-megatron.sh (content)
⚔️ tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n4g-megatron_generation.sh (content)
⚔️ tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n8g-fsdp2tp1.v3.sh (content)
⚔️ tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n8g-megatron.sh (content)
⚔️ tests/test_suites/llm/grpo-llama3.2-1b-instruct-1n8g-megatron_generation.sh (content)
⚔️ tests/test_suites/llm/grpo-math-llama-nemotron-super-49b-v.5-4n8g-fsdp2tp8.sh.disabled (content)
⚔️ tests/test_suites/llm/grpo-math-qwen3-30ba3b-megatron-tp4-32k.sh (content)
⚔️ tests/test_suites/llm/grpo-moonlight-16b-automodel-1n8g-ep8.sh (content)
⚔️ tests/test_suites/llm/grpo-moonlight-16ba3b-4n4g-megatron.sh (content)
⚔️ tests/test_suites/llm/grpo-moonlight-16ba3b-4n8g-megatron-fp8-e2e.sh (content)
⚔️ tests/test_suites/llm/grpo-moonlight-16ba3b-4n8g-megatron.sh (content)
⚔️ tests/test_suites/llm/grpo-nano-v2-12b-1n4g-megatron.sh (content)
⚔️ tests/test_suites/llm/grpo-nano-v2-12b-1n8g-megatron.sh (content)
⚔️ tests/test_suites/llm/grpo-nano-v2-12b-2n4g-fsdp2tp1.sh (content)
⚔️ tests/test_suites/llm/grpo-nano-v2-12b-2n8g-fsdp2tp1.sh (content)
⚔️ tests/test_suites/llm/grpo-qwen2.5-32b-32n4g-fsdp2tp4-actckpt-long.v3.sh (content)
⚔️ tests/test_suites/llm/grpo-qwen2.5-32b-32n8g-fsdp2tp8-actckpt-long.v3.sh (content)
⚔️ tests/test_suites/llm/grpo-qwen2.5-32b-32n8g-fsdp2tp8-actckpt.v3.sh (content)
⚔️ tests/test_suites/llm/grpo-qwen2.5-7b-instruct-4n4g-fsdp2tp2.v3.sh (content)
⚔️ tests/test_suites/llm/grpo-qwen2.5-7b-instruct-4n4g-megatron.sh (content)
⚔️ tests/test_suites/llm/grpo-qwen2.5-7b-instruct-4n8g-fsdp2tp4.v3.sh (content)
⚔️ tests/test_suites/llm/grpo-qwen2.5-7b-instruct-4n8g-megatron.sh (content)
⚔️ tests/test_suites/llm/grpo-qwen2.5-math-1.5b-instruct-1n4g-fsdp2tp1.v3.sh (content)
⚔️ tests/test_suites/llm/grpo-qwen2.5-math-1.5b-instruct-1n8g-fsdp2tp1-sglang.sh (content)
⚔️ tests/test_suites/llm/grpo-qwen2.5-math-1.5b-instruct-1n8g-fsdp2tp1.v3.sh (content)
⚔️ tests/test_suites/llm/grpo-qwen3-0.6b-1n8g-sglang.sh (content)
⚔️ tests/test_suites/llm/grpo-qwen3-30ba3b-8n4g-megatron.sh (content)
⚔️ tests/test_suites/llm/grpo-qwen3-30ba3b-8n8g-megatron.sh (content)
⚔️ tests/test_suites/llm/grpo-qwen3-8B-base-1n8g-fsdp2-lora.sh (content)
⚔️ tests/test_suites/llm/grpo-qwen3-8b-base-1n8g-fp8-kvcache-megatron.sh (content)
⚔️ tests/test_suites/llm/performance/dapo-deepseek-v3-64n8g.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-deepseek-v3-32n4g.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-deepseek-v3-32n8g.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-deepseek-v3-64n4g-async-1off.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-deepseek-v3-64n8g-async-1off.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-deepseek-v3-64n8g-fp8-async-1off.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-llama3.1-8b-instruct-2n4g-async-1off.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-llama3.1-8b-instruct-2n4g.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-llama3.1-8b-instruct-2n8g-async-1off.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-llama3.1-8b-instruct-2n8g-fp8-async-1off.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-llama3.1-8b-instruct-2n8g.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-qwen3-235b-16n4g.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-qwen3-235b-16n8g.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-qwen3-235b-32n4g-async-1off.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-qwen3-235b-32n8g-async-1off.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-qwen3-30ba3b-24n8g-async-8off.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-qwen3-30ba3b-4n4g.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-qwen3-30ba3b-4n8g-40K.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-qwen3-30ba3b-4n8g-async-1off.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-qwen3-30ba3b-4n8g.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-qwen3-30ba3b-8n4g-async-1off.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-qwen3-32b-4n4g.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-qwen3-32b-4n8g.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-qwen3-32b-8n4g-async-1off.sh (content)
⚔️ tests/test_suites/llm/performance/grpo-qwen3-32b-8n8g-async-1off.sh (content)
⚔️ tests/test_suites/nightly.txt (content)
⚔️ tests/unit/algorithms/test_distillation.py (content)
⚔️ tests/unit/algorithms/test_dpo.py (content)
⚔️ tests/unit/algorithms/test_grpo.py (content)
⚔️ tests/unit/algorithms/test_loss_functions.py (content)
⚔️ tests/unit/algorithms/test_reward_functions.py (content)
⚔️ tests/unit/algorithms/test_rm.py (content)
⚔️ tests/unit/algorithms/test_sft.py (content)
⚔️ tests/unit/data/datasets/test_preference_dataset.py (content)
⚔️ tests/unit/data/datasets/test_response_dataset.py (content)
⚔️ tests/unit/experience/test_rollouts.py (content)
⚔️ tests/unit/models/generation/test_vllm_utils.py (content)
⚔️ tests/unit/models/megatron/test_megatron_data.py (content)
⚔️ tests/unit/models/megatron/test_megatron_setup.py (content)
⚔️ tests/unit/models/policy/test_megatron_worker.py (content)
⚔️ tests/unit/test_config_validation.py (content)
⚔️ tests/unit/test_recipes_and_test_suites.py (content)
⚔️ tests/unit/utils/test_checkpoint.py (content)
⚔️ tests/unit/utils/test_logger.py (content)
⚔️ tests/unit/utils/test_native_checkpoint.py (content)
⚔️ tools/config_cli.py (content)
⚔️ tools/launch (content)
⚔️ uv.lock (content)

These conflicts must be resolved before merging into main.
Resolve conflicts locally and push changes to this branch.
Test Results For Major Changes: ⚠️ Warning
  The PR adds a major HybridEP feature for MoE expert parallelism with dependency updates across multiple groups, but the PR description lacks test results, validation evidence, or regression-testing information. Add test results demonstrating that the HybridEP configurations work correctly, that existing MoE functionality remains unaffected, and that training convergence is not impacted; include performance metrics or links to test runs.
✅ Passed checks (3 passed)

Description Check: ✅ Passed
  Check skipped: CodeRabbit’s high-level summary is enabled.
Title Check: ✅ Passed
  The title accurately describes the main change: adding HybridEP support for MoE expert parallelism. It directly reflects the core objective of updating the DeepEP dependency and adding new MoE configuration options.
Docstring Coverage: ✅ Passed
  No functions found in the changed files to evaluate docstring coverage; skipping the docstring coverage check.



@coderabbitai coderabbitai bot left a comment (Contributor)

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pyproject.toml (1)

324-327: ⚠️ Potential issue | 🟠 Major

Stale dependency-metadata version for deep_ep.

The dependency is pinned to the hybrid-ep branch (which dynamically generates its version from the current commit hash via git rev-parse --short HEAD), but the dependency-metadata version is statically set to v1.2.1+bfded34. This means the metadata version will become stale whenever the branch advances, potentially causing uv resolver failures.

Either:

  1. Update to pin to a specific commit hash instead of a branch, or
  2. Update the metadata version to match the current HEAD of hybrid-ep and regenerate it whenever the dependency updates
🧹 Nitpick comments (1)
pyproject.toml (1)

70-72: Branch ref instead of pinned commit reduces build reproducibility.

All three dependency groups now point to @hybrid-ep (a moving branch) instead of a fixed commit hash. This means builds are not reproducible — a force-push or new commit on that branch silently changes what gets installed. Consider pinning to a specific commit on the hybrid-ep branch once it stabilizes.

Comment on lines +405 to +412

# HybridEP settings for MoE expert parallelism
# See: https://github.com/deepseek-ai/DeepEP/tree/hybrid-ep
if "moe_flex_dispatcher_backend" in config["megatron_cfg"]:
    model_cfg.moe_flex_dispatcher_backend = config["megatron_cfg"][
        "moe_flex_dispatcher_backend"
    ]
if "moe_hybridep_num_sms" in config["megatron_cfg"]:
    model_cfg.moe_hybridep_num_sms = config["megatron_cfg"]["moe_hybridep_num_sms"]

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's understand the file structure and find the TypedDict definition
find . -type f -name "*.py" -path "*/nemo_rl/*" | head -20

Repository: NVIDIA-NeMo/RL

Length of output: 805


🏁 Script executed:

# Search for megatron_cfg TypedDict definition and the new keys
rg -n "moe_flex_dispatcher_backend|moe_hybridep_num_sms" --type=py

Repository: NVIDIA-NeMo/RL

Length of output: 565


🏁 Script executed:

# Check for exemplar YAMLs in examples/configs
fd -e yaml -p "examples/configs" | head -20

Repository: NVIDIA-NeMo/RL

Length of output: 856


🏁 Script executed:

# Search for megatron_cfg TypedDict definition
rg -n "class.*megatron.*cfg|MegatronCfg" --type=py -i | head -30

Repository: NVIDIA-NeMo/RL

Length of output: 40


🏁 Script executed:

# Search for TypedDict definitions in megatron module or related config files
rg -n "TypedDict|class.*Cfg" --type=py nemo_rl/models/megatron/ | head -40

Repository: NVIDIA-NeMo/RL

Length of output: 240


🏁 Script executed:

# Check the setup.py file around the lines in question for context
sed -n '380,420p' nemo_rl/models/megatron/setup.py

Repository: NVIDIA-NeMo/RL

Length of output: 1825


🏁 Script executed:

# Check one of the megatron exemplar YAMLs to see if megatron_cfg is documented
cat examples/configs/grpo_math_70B_megatron.yaml | head -100

Repository: NVIDIA-NeMo/RL

Length of output: 1826


🏁 Script executed:

# Search for where megatron_cfg is defined as a TypedDict
rg -n "megatron_cfg.*:" --type=py | head -20

Repository: NVIDIA-NeMo/RL

Length of output: 1855


🏁 Script executed:

# Look at the config.py file to find MegatronConfig definition
cat nemo_rl/models/megatron/config.py

Repository: NVIDIA-NeMo/RL

Length of output: 3082


🏁 Script executed:

# Also check the policy/__init__.py to see the full PolicyConfig TypedDict
sed -n '1,300p' nemo_rl/models/policy/__init__.py | grep -A 20 "megatron_cfg"

Repository: NVIDIA-NeMo/RL

Length of output: 828


🏁 Script executed:

# Search for MegatronConfig class/TypedDict definition
rg -n "class MegatronConfig|MegatronConfig.*=.*TypedDict" --type=py

Repository: NVIDIA-NeMo/RL

Length of output: 207


🏁 Script executed:

# Check the entire policy/__init__.py file
wc -l nemo_rl/models/policy/__init__.py

Repository: NVIDIA-NeMo/RL

Length of output: 95


🏁 Script executed:

# Get the full PolicyConfig definition including imports
head -300 nemo_rl/models/policy/__init__.py

Repository: NVIDIA-NeMo/RL

Length of output: 10006


Add moe_flex_dispatcher_backend and moe_hybridep_num_sms to MegatronConfig TypedDict and exemplar YAMLs.

The two new optional keys are correctly handled at runtime with `in` checks, but per the coding guidelines, new config keys must be documented in the TypedDict subclass and reflected in exemplar YAMLs.

Add both keys to MegatronConfig in nemo_rl/models/policy/__init__.py using NotRequired, with docstrings explaining their purpose and valid values (e.g., "Backend type for MoE flex dispatcher (HybridEP)" and "Number of SMs for HybridEP"). Update at least one exemplar YAML under examples/configs/ (e.g., a megatron MoE config) to include these keys with their recommended defaults.

🤖 Prompt for AI Agents
In `@nemo_rl/models/megatron/setup.py` around lines 405-412: The new runtime
keys moe_flex_dispatcher_backend and moe_hybridep_num_sms are missing from the
MegatronConfig TypedDict and from example configs; add both to the
MegatronConfig definition in nemo_rl/models/policy/__init__.py as NotRequired
entries (use the exact symbol name MegatronConfig) with short docstrings:
"Backend type for MoE flex dispatcher (HybridEP)" for
moe_flex_dispatcher_backend and "Number of SMs for HybridEP" for
moe_hybridep_num_sms, and then update at least one exemplar YAML in
examples/configs (e.g., a megatron MoE config) to include these keys with
sensible defaults (recommended defaults) so they are documented and visible to
users.
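
A minimal sketch of what this suggestion could look like in nemo_rl/models/policy/__init__.py. The surrounding keys and the exact placement inside the real MegatronConfig are assumptions; only the two new entries and their docstrings come from the review comment above:

from typing_extensions import NotRequired, TypedDict  # typing on Python 3.11+


class MegatronConfig(TypedDict):
    # ... existing megatron_cfg keys elided ...

    moe_flex_dispatcher_backend: NotRequired[str]
    """Backend type for MoE flex dispatcher (HybridEP), e.g. 'hybridep'."""

    moe_hybridep_num_sms: NotRequired[int]
    """Number of SMs for HybridEP, e.g. 32."""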

@seonjinn seonjinn requested a review from guyueh1 February 13, 2026 19:42
seonjinn and others added 3 commits February 13, 2026 15:25
…odels

Add performance recipes and test scripts for HybridEP and CUDA Graph optimizations:
- grpo-qwen3-30ba3b-4n4g-hybridep: HybridEP with flex dispatcher for Qwen3-30B-A3B
- grpo-qwen3-30ba3b-4n4g-hybridep-cudagraph: HybridEP + CUDA Graph (attn, moe_router)
- grpo-qwen3-235b-16n4g-hybridep: HybridEP for Qwen3-235B-A22B
- grpo-qwen3-235b-16n4g-hybridep-cudagraph: HybridEP + CUDA Graph for Qwen3-235B-A22B

Key configurations:
- moe_token_dispatcher_type: flex
- moe_flex_dispatcher_backend: hybridep
- moe_hybridep_num_sms: 32
- cuda_graph_impl: transformer_engine
- cuda_graph_scope: [attn, moe_router]
…on handling

- grpo-qwen3-30ba3b-4n4g-hybridep.yaml: sequence_packing.enabled=false (compat note)
- ray.sub: CUDA_HOME/PATH for nvcc, attach shell single-quote fix for uv
- common.env: exit_if_max_steps_reached handles missing metrics.json
- test scripts: metrics.json existence check before jq

Co-authored-by: Cursor <cursoragent@cursor.com>
@seonjinn seonjinn requested review from a team as code owners February 13, 2026 23:26
"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@bfded34800dfec415b71503f8205181de90b2480",
# HybridEP branch for MoE expert parallelism
# See: https://github.com/deepseek-ai/DeepEP/tree/hybrid-ep
"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@hybrid-ep",

Suggested change
"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@hybrid-ep",
"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@bfded34800dfec415b71503f8205181de90b2480 ; platform_machine == 'x86_64'",
"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@hybrid-ep ; platform_machine == 'aarch64'",

"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@bfded34800dfec415b71503f8205181de90b2480",
# HybridEP branch for MoE expert parallelism
# See: https://github.com/deepseek-ai/DeepEP/tree/hybrid-ep
"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@hybrid-ep",

Suggested change
"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@hybrid-ep",
"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@bfded34800dfec415b71503f8205181de90b2480 ; platform_machine == 'x86_64'",
"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@hybrid-ep ; platform_machine == 'aarch64'",

"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@bfded34800dfec415b71503f8205181de90b2480",
# HybridEP branch for MoE expert parallelism
# See: https://github.com/deepseek-ai/DeepEP/tree/hybrid-ep
"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@hybrid-ep",

Suggested change
"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@hybrid-ep",
"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@bfded34800dfec415b71503f8205181de90b2480 ; platform_machine == 'x86_64'",
"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@hybrid-ep ; platform_machine == 'aarch64'",

