test: Add script for nemotron test #1901

Open
guyueh1 wants to merge 4 commits into main from guyueh/nemotron_test

Conversation

guyueh1 (Contributor) commented Feb 9, 2026

What does this PR do?

As title

Issues

List issues that this PR closes (syntax):

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Summary by CodeRabbit

  • New Features
    • Added configuration templates for GRPO-based LLM experimentation with multiple hardware topologies and precision modes (standard and mixed-precision FP8).
    • Included test suites to run GRPO experiments with logging, monitoring, and checkpointing capabilities.
    • Expanded testing coverage for large-scale distributed training scenarios.

Signed-off-by: Guyue Huang <guyueh@nvidia.com>
Signed-off-by: Guyue Huang <guyueh@nvidia.com>
guyueh1 requested review from a team as code owners on February 9, 2026, 16:52
coderabbitai bot (Contributor) commented Feb 9, 2026

📝 Walkthrough

This PR adds comprehensive GRPO (Group Relative Policy Optimization) configuration files for multiple Nemotron model hardware configurations (super-8n8g, super-8n4g, super-16n4g) with FP8 quantization and gym environment variants, along with corresponding test execution scripts and test suite manifest updates.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| GRPO Base Configuration (8n8g): examples/configs/recipes/llm/grpo-nemotron-super-8n8g.yaml | Comprehensive GRPO configuration with policy (Megatron parallelism, FP8 settings), generation (vLLM backend), checkpointing, and cluster specs (8 nodes, 8 GPUs/node). Defines hyperparameters for async GRPO, loss functions, optimization, and logging. |
| GRPO FP8 Variants: examples/configs/recipes/llm/grpo-nemotron-super-8n8g-mxfp8.yaml, examples/configs/recipes/llm/grpo-nemotron-super-16n4g-mxfp8.sh | Configuration overlays enabling FP8 quantization (MX format) for policies and vLLM generation precision, with WandB run naming. |
| GRPO Gym Configuration: examples/configs/recipes/llm/grpo-nemotron-super-8n8g-gym.yaml | Extended GRPO setup for gym environments with NL2Bash judge model, vLLM router settings, and comprehensive training/evaluation infrastructure (411 lines). |
| GRPO Scaled Configurations: examples/configs/recipes/llm/grpo-nemotron-super-8n4g-gym.yaml, examples/configs/recipes/llm/grpo-nemotron-super-16n4g.yaml | Configuration variants referencing base configs with adjusted cluster topology (8n4g with 16 nodes/4 GPUs, 16n4g with 16 nodes/4 GPUs) and colocated resource policies. |
| Test Execution Scripts: tests/test_suites/llm/grpo-nemotron-super-*.sh | Shell scripts for executing GRPO experiments with specific configurations, including environment setup, step/node configuration, WandB/TensorBoard logging, checkpointing, and TensorBoard-to-JSON metrics conversion. |
| Test Suite Manifests: tests/test_suites/super_b200.txt, tests/test_suites/super_gb200.txt | Updates to test suite registries adding references to new GRPO test scripts for B200 (3 scripts) and GB200 (2 scripts) hardware configurations. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Possibly related PRs

Suggested labels

CI:L1, Run CICD

Suggested reviewers

  • chtruong814
  • terrykong
  • parthchadha
🚥 Pre-merge checks | ✅ 2 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Test Results For Major Changes | ⚠️ Warning | PR adds 620+ lines of experimental FP8 and distributed training configurations without documented test results, numerical validation, or performance metrics in the description. | Add test execution results, convergence validation (loss/reward curves), performance metrics (throughput/iteration time), and technical rationale for chosen Nemotron configurations. |
| Title check | ❓ Inconclusive | The title is overly vague and generic. It uses non-descriptive terms like 'Add script' and 'nemotron test' without clarifying what specific functionality, configuration, or test scenario is being added. | Provide a more specific title that describes the main change, such as 'test: Add GRPO Nemotron configuration and test scripts for various hardware setups' or 'test: Add GRPO Nemotron super test scripts for B200/GB200 clusters'. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
tests/test_suites/llm/grpo-nemotron-super-8n8g-mxfp8.sh (1)

1-35: Script body is identical to grpo-nemotron-super-8n8g.sh.

The only differentiator is the filename (which common.env presumably uses to resolve CONFIG_PATH). If the repo accumulates many such identical scripts, consider a parameterized wrapper that accepts the config variant as an argument. That said, this matches the existing convention.
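A minimal sketch of what such a parameterized wrapper could look like, assuming the config path is derived purely from the variant name. The recipe directory is taken from the paths in this PR; the function name `resolve_config_path` and its argument convention are hypothetical and do not reflect what common.env actually does.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: one helper instead of many identical per-variant scripts.
# RECIPE_DIR matches the recipe paths listed in this PR; everything else here
# is an illustrative assumption, not the repo's real common.env logic.

RECIPE_DIR="examples/configs/recipes/llm"

# Map a variant name (or a test script path) to its recipe config path.
resolve_config_path() {
    local variant
    # Strip any leading directories and a trailing ".sh", so both
    # "foo" and "path/to/foo.sh" resolve to the same variant name.
    variant="$(basename "$1" .sh)"
    printf '%s/%s.yaml\n' "$RECIPE_DIR" "$variant"
}
```

Under these assumptions, a single wrapper could call `resolve_config_path "$1"` and pass the result to the experiment launcher, so the per-variant scripts would no longer need to differ only by filename.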

examples/configs/recipes/llm/grpo-nemotron-super-8n8g-gym.yaml (2)

1-411: Standalone config with no defaults: inheritance — intentional?

Unlike the other YAML configs in this PR (e.g., grpo-nemotron-super-8n8g-mxfp8.yaml which extends grpo-nemotron-super-8n8g.yaml), this ~410-line gym config is fully self-contained. Shared settings like policy.megatron_cfg, loss_fn, and grpo hyperparameters are duplicated rather than inherited, which means updates to the base config won't propagate here.

If this divergence is intentional (different parallelism, 20 vs 8 nodes, gym-specific env/data sections), a brief comment at the top of the file explaining why it doesn't layer on top of the base config would help future maintainers.
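To see at a glance which recipe files are standalone, a small helper like the following could be used. It is only a sketch: it assumes inheritance is declared with a top-level `defaults:` key at the start of a line, and the function name `list_standalone_configs` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the YAML files in a directory that do NOT
# declare a top-level `defaults:` key (i.e. that do not layer on a base config).
# Assumption: inheritance is expressed as `defaults:` in column 0.

list_standalone_configs() {
    local dir="$1" f
    for f in "$dir"/*.yaml; do
        # Skip the literal pattern when the directory has no .yaml files.
        [ -e "$f" ] || continue
        if ! grep -q -E '^defaults:' "$f"; then
            basename "$f"
        fi
    done
}
```

For example, `list_standalone_configs examples/configs/recipes/llm` would list every recipe that duplicates its settings instead of inheriting them, which makes divergences like this one easier to spot during review.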


244-245: Remove commented-out code.

# compilation_config:
# mode: 0

Leftover commented-out configuration should either be removed or converted into a proper disabled block (like compilation_config: null) to avoid confusion.

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


guyueh1 added the "CI:L0 Run doctests and unit tests" label on Feb 12, 2026
Signed-off-by: Guyue Huang <140554423+guyueh1@users.noreply.github.com>

Labels

  • CI:L0 Run doctests and unit tests
  • super-v3

2 participants