test: Add script for nemotron test by guyueh1 · Pull Request #1901 · NVIDIA-NeMo/RL

guyueh1 · 2026-02-09T16:52:50Z

What does this PR do?

As the title says: add scripts for the Nemotron tests.


Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

...

Summary by CodeRabbit

New Features
- Added configuration templates for GRPO-based LLM experimentation with multiple hardware topologies and precision modes (standard and mixed-precision FP8).
- Included test suites to run GRPO experiments with logging, monitoring, and checkpointing capabilities.
- Expanded testing coverage for large-scale distributed training scenarios.

Signed-off-by: Guyue Huang <guyueh@nvidia.com>

coderabbitai · 2026-02-09T17:14:16Z

Walkthrough

This PR adds GRPO (Group Relative Policy Optimization) configuration files for several Nemotron hardware topologies (super-8n8g, super-8n4g, super-16n4g), including FP8 quantization and gym environment variants. It also adds the corresponding test execution scripts and updates the test suite manifests.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| GRPO Base Configuration (8n8g) `examples/configs/recipes/llm/grpo-nemotron-super-8n8g.yaml` | Comprehensive GRPO configuration with policy (Megatron parallelism, FP8 settings), generation (vLLM backend), checkpointing, and cluster specs (8 nodes, 8 GPUs/node). Defines hyperparameters for async GRPO, loss functions, optimization, and logging. |
| GRPO FP8 Variants `examples/configs/recipes/llm/grpo-nemotron-super-8n8g-mxfp8.yaml`, `examples/configs/recipes/llm/grpo-nemotron-super-16n4g-mxfp8.sh` | Configuration overlays enabling FP8 quantization (MX format) for policies and vLLM generation precision, with WandB run naming. |
| GRPO Gym Configuration `examples/configs/recipes/llm/grpo-nemotron-super-8n8g-gym.yaml` | Extended GRPO setup for gym environments with NL2Bash judge model, vLLM router settings, and comprehensive training/evaluation infrastructure (411 lines). |
| GRPO Scaled Configurations `examples/configs/recipes/llm/grpo-nemotron-super-8n4g-gym.yaml`, `examples/configs/recipes/llm/grpo-nemotron-super-16n4g.yaml` | Configuration variants referencing base configs with adjusted cluster topology (8n4g with 16 nodes/4 GPUs, 16n4g with 16 nodes/4 GPUs) and colocated resource policies. |
| Test Execution Scripts `tests/test_suites/llm/grpo-nemotron-super-*.sh` | Shell scripts for executing GRPO experiments with specific configurations, including environment setup, step/node configuration, WandB/TensorBoard logging, checkpointing, and TensorBoard-to-JSON metrics conversion. |
| Test Suite Manifests `tests/test_suites/super_b200.txt`, `tests/test_suites/super_gb200.txt` | Updates to test suite registries adding references to new GRPO test scripts for B200 (3 scripts) and GB200 (2 scripts) hardware configurations. |
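
A manifest update like the one in the last row can be sanity-checked before CI picks it up. The following is a minimal sketch, assuming each manifest lists one script path per line relative to the repository root, with blank lines and `#` comments allowed. That file layout and both function names are assumptions made for illustration; they are not taken from this PR.

```shell
# Hypothetical helpers; the manifest layout (one path per line) is an assumption.

# Print the script paths in a manifest, skipping blank lines and '#' comments.
list_manifest_scripts() {
  grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" || true
}

# Print every manifest entry that does not exist under the given repo root.
missing_manifest_scripts() {
  manifest="$1"
  root="$2"
  list_manifest_scripts "$manifest" | while IFS= read -r script; do
    [ -f "$root/$script" ] || printf '%s\n' "$script"
  done
}
```

Running `missing_manifest_scripts tests/test_suites/super_b200.txt .` from the repository root would then print nothing if every referenced script is present.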

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Possibly related PRs

feat: FP8 Training in Megatron Path #971: Adds/modifies FP8 quantization configuration (`fp8_cfg` and vLLM precision settings) in similar GRPO config files.
ci: Add nightly and release tests for gb200 #1788: Adds GB200-oriented LLM recipe configs and corresponding test-suite scripts with cluster topology adjustments and manifest updates.

Suggested labels

CI:L1, Run CICD

Suggested reviewers

chtruong814
terrykong
parthchadha

🚥 Pre-merge checks | ✅ 2 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Test Results For Major Changes | ⚠️ Warning | PR adds 620+ lines of experimental FP8 and distributed training configurations without documented test results, numerical validation, or performance metrics in the description. | Add test execution results, convergence validation (loss/reward curves), performance metrics (throughput/iteration time), and technical rationale for chosen Nemotron configurations. |
| Title check | ❓ Inconclusive | The title is overly vague and generic. It uses non-descriptive terms like 'Add script' and 'nemotron test' without clarifying what specific functionality, configuration, or test scenario is being added. | Provide a more specific title that describes the main change, such as 'test: Add GRPO Nemotron configuration and test scripts for various hardware setups' or 'test: Add GRPO Nemotron super test scripts for B200/GB200 clusters'. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments

tests/test_suites/llm/grpo-nemotron-super-8n8g-mxfp8.sh (1)

1-35: Script body is identical to grpo-nemotron-super-8n8g.sh.

The only differentiator is the filename (which common.env presumably uses to resolve CONFIG_PATH). If the repo accumulates many such identical scripts, consider a parameterized wrapper that accepts the config variant as an argument. That said, this matches the existing convention.
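
A minimal sketch of the parameterized wrapper suggested here, assuming every config is named `grpo-nemotron-super-<variant>.yaml` under `examples/configs/recipes/llm/`. Both function names are hypothetical and not part of the repository, and the sketch does not reproduce how `common.env` actually resolves `CONFIG_PATH`, which this PR does not show.

```shell
# Hypothetical helpers for a single shared launcher; names are illustrative.
CONFIG_DIR="examples/configs/recipes/llm"

# Derive the variant (e.g. "8n8g-mxfp8") from a test script's filename.
variant_from_script() {
  name="$(basename "$1" .sh)"
  printf '%s\n' "${name#grpo-nemotron-super-}"
}

# Map a variant name to the config path it is assumed to correspond to.
resolve_config_path() {
  printf '%s/grpo-nemotron-super-%s.yaml\n' "$CONFIG_DIR" "$1"
}
```

A shared launcher could then call `resolve_config_path "$1"` with the variant passed on the command line, so that only one script body needs maintaining.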

examples/configs/recipes/llm/grpo-nemotron-super-8n8g-gym.yaml (2)

1-411: Standalone config with no `defaults:` inheritance: intentional?

Unlike the other YAML configs in this PR (e.g., grpo-nemotron-super-8n8g-mxfp8.yaml which extends grpo-nemotron-super-8n8g.yaml), this ~410-line gym config is fully self-contained. Shared settings like policy.megatron_cfg, loss_fn, and grpo hyperparameters are duplicated rather than inherited, which means updates to the base config won't propagate here.

If this divergence is intentional (different parallelism, 20 vs 8 nodes, gym-specific env/data sections), a brief comment at the top of the file explaining why it doesn't layer on top of the base config would help future maintainers.
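
To find configs that stand alone in this way, a small check can list the files that declare no top-level `defaults:` key. This is a rough sketch under the assumption that inheritance is always expressed by a `defaults:` key at column zero; the function names are hypothetical.

```shell
# Hypothetical lint helpers; assumes inheritance uses a top-level `defaults:` key.

# Return 0 if the YAML file declares a top-level `defaults:` key, 1 otherwise.
has_defaults_key() {
  grep -q '^defaults:' "$1"
}

# Print each given config that does not inherit from another config.
report_standalone_configs() {
  for cfg in "$@"; do
    has_defaults_key "$cfg" || printf '%s\n' "$cfg"
  done
}
```

For example, `report_standalone_configs examples/configs/recipes/llm/grpo-nemotron-super-*.yaml` would print the configs that may merit an explanatory comment at the top of the file.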

244-245: Remove commented-out code.

    # compilation_config:
    #   mode: 0

Leftover commented-out configuration should either be removed or converted into a proper disabled block (like `compilation_config: null`) to avoid confusion.

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Signed-off-by: Guyue Huang <140554423+guyueh1@users.noreply.github.com>

guyueh1 added 2 commits February 9, 2026 16:50

Add tests for super

0dee06d

Signed-off-by: Guyue Huang <guyueh@nvidia.com>

Fix

f07fa31

Signed-off-by: Guyue Huang <guyueh@nvidia.com>

guyueh1 requested review from a team as code owners February 9, 2026 16:52

guyueh1 added the super-v3 label Feb 9, 2026

terrykong added super-v3 and removed super-v3 labels Feb 10, 2026

terrykong assigned guyueh1 Feb 10, 2026

Merge branch 'main' into guyueh/nemotron_test

a8469d5

guyueh1 added the CI:L0 Run doctests and unit tests label Feb 12, 2026

guyueh1 temporarily deployed to nemo-ci February 12, 2026 22:34 — with GitHub Actions Inactive

guyueh1 temporarily deployed to nemo-ci February 12, 2026 23:18 — with GitHub Actions Inactive

Update grpo-nemotron-super-8n8g-gym.sh

b5c9a96

Signed-off-by: Guyue Huang <140554423+guyueh1@users.noreply.github.com>

guyueh1 wants to merge 4 commits into main from guyueh/nemotron_test

2 participants
