Skip to content

fix: speedup minimize and minimize-check in config_cli#1964

Merged
terrykong merged 2 commits intomainfrom
hemil/speedup-minimize-check
Feb 17, 2026
Merged

fix: speedup minimize and minimize-check in config_cli#1964
terrykong merged 2 commits intomainfrom
hemil/speedup-minimize-check

Conversation

@hemildesai
Copy link
Contributor

@hemildesai hemildesai commented Feb 15, 2026

The minimize-check llm recipes pre-commit hook was invoking ./tools/config_cli.py minimize-check once per recipe file in a bash for-loop. With ~100 recipe YAML files, this spawned 98 separate uv run --script processes, each paying the full Python/uv startup and omegaconf import cost.

Changes:

  • tools/config_cli.py: Both minimize and minimize-check now accept multiple config paths (nargs='+') in a single invocation. Expanded base configs
    are cached across files, so shared parents (e.g., dpo.yaml) are only loaded once.
  • .pre-commit-config.yaml: Replaced the bash for loop with a single invocation that passes all files via glob expansion.

Summary by CodeRabbit

  • New Features
    • Config minimize and minimize-check commands now accept multiple configuration files in a single invocation, eliminating the need for separate runs per file.
    • Pre-commit hooks optimized to batch process configuration files more efficiently.

Signed-off-by: Hemil Desai <hemild@nvidia.com>
@hemildesai hemildesai requested review from a team as code owners February 15, 2026 19:26
Signed-off-by: Hemil Desai <hemild@nvidia.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 15, 2026

📝 Walkthrough

Walkthrough

The PR refactors config_cli.py to support batch processing of multiple configuration files in a single invocation. The minimize and minimize_check commands now accept multiple configs via glob patterns, with added helper functions to process individual files and shared base config caching. Tests and pre-commit hooks are updated accordingly.

Changes

Cohort / File(s) Summary
Config CLI Multi-Config Support
tools/config_cli.py
Added _minimize_one() and _minimize_check_one() helper functions for single-file processing. Updated minimize() and minimize_check() to iterate over args.configs (list) instead of single config, with shared base_config_cache to avoid redundant base config loads. CLI argument parsing changed from single config to nargs="+" for multiple files.
Test Namespace Updates
tests/unit/tools/test_config_cli.py
Updated test namespace construction to use "configs" attribute containing a list instead of single "config" attribute across multiple test cases.
Pre-commit Hook Simplification
.pre-commit-config.yaml
Simplified minimize-check invocation to pass glob pattern directly as a single argument instead of looping over individual files with existence checks.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Merge Conflict Detection ⚠️ Warning ❌ Merge conflicts detected (16 files):

⚔️ .pre-commit-config.yaml (content)
⚔️ docker/Dockerfile (content)
⚔️ docs/guides/use-custom-vllm.md (content)
⚔️ examples/configs/grpo_math_1B.yaml (content)
⚔️ examples/configs/vlm_grpo_3B.yaml (content)
⚔️ examples/configs/vlm_grpo_3B_megatron.yaml (content)
⚔️ examples/nemo_gym/grpo_workplace_assistant_nemotron_nano_v2_9b.yaml (content)
⚔️ nemo_rl/algorithms/grpo.py (content)
⚔️ nemo_rl/environments/nemo_gym.py (content)
⚔️ nemo_rl/models/generation/vllm/vllm_worker.py (content)
⚔️ tests/functional/grpo_non_colocated.sh (content)
⚔️ tests/unit/algorithms/test_grpo.py (content)
⚔️ tests/unit/environments/test_nemo_gym.py (content)
⚔️ tests/unit/tools/test_config_cli.py (content)
⚔️ tools/build-custom-vllm.sh (content)
⚔️ tools/config_cli.py (content)

These conflicts must be resolved before merging into main.
Resolve conflicts locally and push changes to this branch.
Test Results For Major Changes ⚠️ Warning PR claims performance improvement but provides no before-and-after metrics, timing comparisons, or performance validation evidence. Add concrete performance metrics to PR description including execution times, test environment details, and quantified speedup percentage, or include performance test cases.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: optimizing minimize and minimize-check by enabling batch processing of multiple config files in a single invocation, eliminating the overhead of repeated process spawns.
Docstring Coverage ✅ Passed Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch hemil/speedup-minimize-check
⚔️ Resolve merge conflicts (beta)
  • Auto-commit resolved conflicts to branch hemil/speedup-minimize-check
  • Create stacked PR with resolved conflicts
  • Post resolved changes as copyable diffs in a comment

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tools/config_cli.py`:
- Around line 370-389: The code loads bases asymmetrically: when base_override
is set it uses OmegaConf.load(base_path) (no inheritance expansion) but when
inferred it uses load_config(...) (fully expanded), and the shared
base_config_cache can return the wrong shape; change the explicit---base branch
to use the same expansion path as the inferred branch (call
load_config(str(base_path)) or otherwise expand defaults) before caching so
base_config_cache always stores a fully-resolved container; update references to
base_cfg_raw/base_resolved and keep using the same cache key (base_path) so
lookup behavior is consistent for both the base_override branch and the
_infer_base_from_defaults branch.
🧹 Nitpick comments (1)
tools/config_cli.py (1)

350-422: Significant code duplication between _minimize_one and _minimize_check_one.

Both helpers share nearly identical logic for: loading the child config, resolving the base (explicit vs. inferred), cache lookup/population, pruning, and reconstructing the defaults key. The main difference is that one writes/prints the result while the other compares it against the current file.

Consider extracting a shared helper (e.g., _resolve_and_prune(child_path, base_override, base_config_cache) -> tuple[dict, bool, Path]) that returns the pruned config, whether the base was inferred, and the base path. Both _minimize_one and _minimize_check_one would then only contain their divergent "emit vs. compare" logic.

Also applies to: 500-591

@hemildesai hemildesai added the CI:L0 Run doctests and unit tests label Feb 15, 2026
@terrykong terrykong merged commit bb4825a into main Feb 17, 2026
44 of 45 checks passed
@terrykong terrykong deleted the hemil/speedup-minimize-check branch February 17, 2026 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI:L0 Run doctests and unit tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments