Integrating SWE-Pro (Public) Dataset Eval by wasiahmad · Pull Request #1197 · NVIDIA-NeMo/Skills

wasiahmad · 2026-01-28T23:52:25Z

Summary by CodeRabbit

Release Notes

New Features
- Added SWE-bench Pro dataset integration with automated data preparation and normalization pipeline.
- Included default configuration constants for evaluation metrics, dataset grouping, and generation workflows.
- Enhanced dataset records with Docker container metadata and dataset identification fields.
- Provided command-line configurable parameters for customizing dataset setup and container handling.


Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

greptile-apps

1 file reviewed, 1 comment

nemo_skills/dataset/swe-bench-pro/prepare.py

coderabbitai · 2026-01-28T23:56:53Z


Walkthrough

Introduces a new SWE-bench Pro dataset package with an __init__.py module containing five evaluation configuration constants and a prepare.py script that loads the dataset, normalizes fields, augments records with container metadata and identifiers, then exports to JSONL format.

Changes

| Cohort / File(s) | Summary |
|---|---|
| SWE-bench Pro Dataset Package: `nemo_skills/dataset/swe-bench-pro/__init__.py`, `nemo_skills/dataset/swe-bench-pro/prepare.py` | Added module-level configuration constants (EVAL_SPLIT, DATASET_GROUP, METRICS_TYPE, GENERATION_MODULE, GENERATION_ARGS) and implemented `get_dockerhub_image_uri()` function for constructing Docker image tags. Main script loads dataset, normalizes `repo_language` to `language`, computes container formatter from row data, appends container_id, dataset_name, and split metadata, then writes enriched records to JSONL. |
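The enrichment steps in the summary can be sketched roughly as follows. This is a minimal illustration rather than the PR's code: the helper names `enrich_rows` and `write_jsonl`, the `make_container` callable, and the choice of the row index as `container_id` are all assumptions made for the example.

```python
import json
from pathlib import Path


def enrich_rows(rows, make_container, dataset_name, split):
    """Return copies of the rows with normalized and added fields."""
    enriched = []
    for index, row in enumerate(rows):
        record = dict(row)
        # Rename repo_language -> language, as the walkthrough describes.
        if "repo_language" in record:
            record["language"] = record.pop("repo_language")
        # Container reference computed from the row (hypothetical callable).
        record["container_formatter"] = make_container(record)
        # Assumption: container_id is the row's position in the split.
        record["container_id"] = index
        record["dataset_name"] = dataset_name
        record["split"] = split
        enriched.append(record)
    return enriched


def write_jsonl(records, output_file):
    """Write one JSON object per line."""
    with Path(output_file).open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
```

A caller would load the rows (for example from `ScaleAI/SWE-bench_Pro`), pass them through `enrich_rows`, and hand the result to `write_jsonl`.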

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Possibly related PRs

Add apex-shortlist dataset #1080: Introduces a similar dataset package structure with matching module-level evaluation constants and a corresponding prepare.py script for JSONL dataset export.

Suggested reviewers

gwarmstrong
ludwig-n

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
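For a sense of what such a percentage measures, here is a rough sketch using the standard `ast` module. How CodeRabbit actually computes its figure is not stated in this thread; the function name `docstring_coverage` and the decision to count only functions and classes are assumptions.

```python
import ast


def docstring_coverage(source: str) -> float:
    """Percentage of functions and classes in `source` that have a docstring."""
    tree = ast.parse(source)
    definitions = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    if not definitions:
        # Nothing to document; treat as fully covered (an arbitrary choice).
        return 100.0
    documented = sum(
        1 for node in definitions if ast.get_docstring(node) is not None
    )
    return 100.0 * documented / len(definitions)
```

Under this definition, a file whose functions all lack docstrings scores 0.0, which matches the shape of the warning above.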

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Integrating SWE-Pro (Public) Dataset Eval' directly relates to the main changes: adding configuration and preparation files for the SWE-bench Pro dataset integration. |



coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@nemo_skills/dataset/swe-bench-pro/prepare.py`:
- Around line 24-26: The get_dockerhub_image_uri function currently uses
repo_name.lower().split("/") which will fail if repo_name defaults to "" — make
repo_name required (remove the default "") or validate and raise a clear error
if it's empty/doesn't contain a slash; also make the split robust by using
rsplit("/", 1) and assign to repo_base and repo_name_only (keep references to
uid and hsh as-is), so update the function signature and replace
repo_name.lower().split("/") with repo_name.lower().rsplit("/", 1) and add a
guard that raises ValueError with a helpful message when repo_name is invalid.
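The guard described above might look something like the sketch below. It covers only the validation and the `rsplit("/", 1)` step; the helper name `split_repo_name` is hypothetical, and how `get_dockerhub_image_uri` combines the parts with `uid` and `hsh` into an image tag is not shown in this thread, so that part is left out.

```python
def split_repo_name(repo_name: str) -> tuple[str, str]:
    """Split 'owner/name' into lowercase (repo_base, repo_name_only)."""
    if not repo_name or "/" not in repo_name:
        raise ValueError(
            f"repo_name must look like 'owner/name', got {repo_name!r}"
        )
    # rsplit with maxsplit=1 always yields exactly two parts here,
    # even if the owner portion itself contains a slash.
    repo_base, repo_name_only = repo_name.lower().rsplit("/", 1)
    if not repo_base or not repo_name_only:
        raise ValueError(
            f"repo_name must have a non-empty owner and name, got {repo_name!r}"
        )
    return repo_base, repo_name_only
```

Unlike a bare `split("/")`, this version cannot fail with an unpacking error: invalid input is rejected with a `ValueError` that names the offending value.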

🧹 Nitpick comments (1)

nemo_skills/dataset/swe-bench-pro/prepare.py (1)

45-73: Allow an explicit output path to avoid writing into package directories.

When this script is run from an installed package, Path(__file__).parent may be read-only. Adding an --output_file option keeps the default behavior but avoids permission failures.

🛠️ Proposed tweak

     parser.add_argument(
         "--dataset_name",
         type=str,
         default="ScaleAI/SWE-bench_Pro",
         help="Dataset name to load",
     )
+    parser.add_argument(
+        "--output_file",
+        type=Path,
+        default=None,
+        help="Path to write JSONL. Defaults to <script_dir>/<setup>.jsonl.",
+    )
     args = parser.parse_args()
@@
-    output_file = Path(__file__).parent / f"{args.setup}.jsonl"
+    output_file = (
+        Path(args.output_file)
+        if args.output_file is not None
+        else Path(__file__).parent / f"{args.setup}.jsonl"
+    )


adding dataset prep

88b6942

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

wasiahmad requested a review from ludwig-n January 28, 2026 23:52

wasiahmad marked this pull request as draft January 28, 2026 23:52

greptile-apps bot reviewed Jan 28, 2026

nemo_skills/dataset/swe-bench-pro/prepare.py (outdated)

coderabbitai bot reviewed Jan 28, 2026

nemo_skills/dataset/swe-bench-pro/prepare.py (outdated)

wasiahmad and others added 18 commits January 30, 2026 12:12

Merge branch 'main' into wasiahmad/swe-pro

ae51b64

Merge branch 'main' into wasiahmad/swe-pro

b36fd83

removing default param value for repo_name

a651cb0

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

updating dataset prep

13ce056

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

updating dataset prep

815f1ea

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

updating dataset prep

22c06f8

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

Downgrade cryptography during OH install to fix missing glibc 2.33

80a6354

Signed-off-by: Nikolai Ludwig <nliudvig@nvidia.com>

Merge branch 'main' into wasiahmad/swe-pro

a0528e5

minor updates

987bf0b

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

cp /app to /testbed for swe-pro

ef01bfe

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

attempting to resolve poetry related issues

1b83003

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

rolling back

3605d25

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

fixing cp command for eval outputs

bd354a0

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

fixing python version issue

eeb9df4

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

rolling back to python 3.12 use

5e18ea4

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

compatible python installation

f452c08

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

compatible python installation

34d0018

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

rolling back

8260ddf

Signed-off-by: wasiahmad <wasiahmad@ucla.edu>

wasiahmad wants to merge 19 commits into main from wasiahmad/swe-pro


2 participants
