
Integrating SWE-Pro (Public) Dataset Eval #1197

Draft
wasiahmad wants to merge 19 commits into main from wasiahmad/swe-pro

@wasiahmad
Collaborator

@wasiahmad wasiahmad commented Jan 28, 2026

Summary by CodeRabbit

Release Notes

  • New Features
    • Added SWE-bench Pro dataset integration with automated data preparation and normalization pipeline.
    • Included default configuration constants for evaluation metrics, dataset grouping, and generation workflows.
    • Enhanced dataset records with Docker container metadata and dataset identification fields.
    • Provided command-line configurable parameters for customizing dataset setup and container handling.


Signed-off-by: wasiahmad <wasiahmad@ucla.edu>
@wasiahmad wasiahmad requested a review from ludwig-n January 28, 2026 23:52
@wasiahmad wasiahmad marked this pull request as draft January 28, 2026 23:52
Contributor

@greptile-apps greptile-apps bot left a comment


1 file reviewed, 1 comment


@coderabbitai
Contributor

coderabbitai bot commented Jan 28, 2026

📝 Walkthrough


Introduces a new SWE-bench Pro dataset package with an __init__.py module containing five evaluation configuration constants and a prepare.py script that loads the dataset, normalizes fields, augments records with container metadata and identifiers, then exports to JSONL format.

Changes

| Cohort / File(s) | Summary |
|---|---|
| SWE-bench Pro Dataset Package: nemo_skills/dataset/swe-bench-pro/__init__.py, nemo_skills/dataset/swe-bench-pro/prepare.py | Added module-level configuration constants (EVAL_SPLIT, DATASET_GROUP, METRICS_TYPE, GENERATION_MODULE, GENERATION_ARGS) and implemented the get_dockerhub_image_uri() function for constructing Docker image tags. The main script loads the dataset, normalizes repo_language to language, computes the container formatter from row data, appends container_id, dataset_name, and split metadata, then writes the enriched records to JSONL. |
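
The preparation steps summarized above can be sketched roughly as follows. This is a minimal illustration on in-memory rows, not the PR's actual prepare.py: the helper names `enrich_row` and `write_jsonl` and the sample values are hypothetical, the container formatter is passed in as an opaque string because its real format is not shown in this conversation, and `container_id` is left out because its derivation is not described here.

```python
import json
from pathlib import Path


def enrich_row(row: dict, container_formatter: str, dataset_name: str, split: str) -> dict:
    """Return a copy of `row` with normalized and added fields (hypothetical helper)."""
    out = dict(row)  # copy so the caller's row is not mutated
    # Normalize the field name: `repo_language` becomes `language`.
    if "repo_language" in out:
        out["language"] = out.pop("repo_language")
    # Attach container and dataset metadata. The formatter value is an
    # assumption supplied by the caller, not the PR's real image URI.
    out["container_formatter"] = container_formatter
    out["dataset_name"] = dataset_name
    out["split"] = split
    return out


def write_jsonl(rows, output_file: Path) -> None:
    """Write one JSON object per line."""
    with open(output_file, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
```

Copying the row before renaming keeps the loaded dataset untouched, which makes the step safe to re-run on the same rows.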

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Possibly related PRs

  • Add apex-shortlist dataset #1080: Introduces a similar dataset package structure with matching module-level evaluation constants and a corresponding prepare.py script for JSONL dataset export.

Suggested reviewers

  • gwarmstrong
  • ludwig-n
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Integrating SWE-Pro (Public) Dataset Eval' directly relates to the main changes: adding configuration and preparation files for the SWE-bench Pro dataset integration. |



Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@nemo_skills/dataset/swe-bench-pro/prepare.py`:
- Around line 24-26: The get_dockerhub_image_uri function currently calls
repo_name.lower().split("/"), which fails when repo_name is empty (its current
default is "") or contains no slash. Either make repo_name required by removing
the default "", or validate it and raise a ValueError with a helpful message
when it is empty or lacks a slash. To make the split robust, replace
repo_name.lower().split("/") with repo_name.lower().rsplit("/", 1) and assign
the result to repo_base and repo_name_only, keeping the references to uid and
hsh as-is.
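
The guard described above might look something like the sketch below. The body of get_dockerhub_image_uri is not shown in this conversation, so only the validation and split are illustrated; the helper name `split_repo_name` is hypothetical, and how the two parts are combined with uid and hsh is deliberately left out.

```python
def split_repo_name(repo_name: str) -> tuple[str, str]:
    """Split 'owner/name' into lowercase (repo_base, repo_name_only).

    Hypothetical helper: raises ValueError instead of failing on unpacking
    when the input is empty or has no slash.
    """
    if not repo_name or "/" not in repo_name:
        raise ValueError(
            f"repo_name must look like 'owner/name', got {repo_name!r}"
        )
    # rsplit with maxsplit=1 always yields exactly two parts here,
    # even if the owner portion itself contains slashes.
    repo_base, repo_name_only = repo_name.lower().rsplit("/", 1)
    if not repo_base or not repo_name_only:
        raise ValueError(
            f"repo_name must have non-empty owner and name, got {repo_name!r}"
        )
    return repo_base, repo_name_only
```

Raising early with the offending value in the message makes a malformed dataset row easier to trace than an unpacking error inside the URI builder.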
🧹 Nitpick comments (1)
nemo_skills/dataset/swe-bench-pro/prepare.py (1)

45-73: Allow an explicit output path to avoid writing into package directories.

When this script is run from an installed package, Path(__file__).parent may be read-only. Adding an --output_file option keeps the default behavior but avoids permission failures.

🛠️ Proposed tweak
     parser.add_argument(
         "--dataset_name",
         type=str,
         default="ScaleAI/SWE-bench_Pro",
         help="Dataset name to load",
     )
+    parser.add_argument(
+        "--output_file",
+        type=Path,
+        default=None,
+        help="Path to write JSONL. Defaults to <script_dir>/<setup>.jsonl.",
+    )
     args = parser.parse_args()
@@
-    output_file = Path(__file__).parent / f"{args.setup}.jsonl"
+    output_file = (
+        Path(args.output_file)
+        if args.output_file is not None
+        else Path(__file__).parent / f"{args.setup}.jsonl"
+    )
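
For reference, the fallback in the proposed tweak can be exercised in isolation with a sketch like the one below. The helper names `build_parser` and `resolve_output_file` and the `--setup` default of "default" are assumptions made for illustration; only the `--output_file` option and the `<script_dir>/<setup>.jsonl` fallback come from the suggestion above.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with only the options relevant to the output path."""
    parser = argparse.ArgumentParser()
    # The default value for --setup is an assumption for this sketch.
    parser.add_argument("--setup", type=str, default="default")
    parser.add_argument(
        "--output_file",
        type=Path,
        default=None,
        help="Path to write JSONL. Defaults to <script_dir>/<setup>.jsonl.",
    )
    return parser


def resolve_output_file(output_file, setup: str, script_dir: Path) -> Path:
    """Use the explicit path if given, else fall back to the script directory."""
    if output_file is not None:
        return Path(output_file)
    return Path(script_dir) / f"{setup}.jsonl"
```

In a script, `script_dir` would typically be `Path(__file__).parent`; passing it in as an argument keeps the fallback testable without touching the package directory.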

wasiahmad and others added 18 commits January 30, 2026 12:12
Signed-off-by: wasiahmad <wasiahmad@ucla.edu>
Signed-off-by: Nikolai Ludwig <nliudvig@nvidia.com>