Integrating SWE-Pro (Public) Dataset Eval #1197
Signed-off-by: wasiahmad <wasiahmad@ucla.edu>
📝 Walkthrough
Introduces a new SWE-bench Pro dataset package with an …
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~15 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@nemo_skills/dataset/swe-bench-pro/prepare.py`:
- Around line 24-26: The get_dockerhub_image_uri function currently calls
repo_name.lower().split("/"), which fails when repo_name is left at its default
of "". Make repo_name required (remove the default "") or validate it and raise
a clear error if it is empty or does not contain a slash. Also make the split
robust by using rsplit("/", 1) and assigning the result to repo_base and
repo_name_only (keep the references to uid and hsh as-is). Concretely: update
the function signature, replace repo_name.lower().split("/") with
repo_name.lower().rsplit("/", 1), and add a guard that raises ValueError with a
helpful message when repo_name is invalid.
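A minimal sketch of the guard described above, assuming repo_name is expected in "owner/repo" form. The helper name split_repo_name is hypothetical and not part of prepare.py; the real get_dockerhub_image_uri would also use uid and hsh, which are left out here because their handling is not shown in this thread.

```python
def split_repo_name(repo_name: str) -> tuple[str, str]:
    """Split "owner/repo" into lowercased (repo_base, repo_name_only).

    Hypothetical helper for illustration only; not taken from prepare.py.
    """
    # Reject empty values and values without a slash up front, so the
    # caller sees a clear message instead of an unpacking error.
    if not repo_name or "/" not in repo_name:
        raise ValueError(
            f"repo_name must look like 'owner/repo', got {repo_name!r}"
        )

    # rsplit with maxsplit=1 always yields exactly two parts here,
    # even if the left-hand side itself contains slashes.
    repo_base, repo_name_only = repo_name.lower().rsplit("/", 1)

    # Catch degenerate inputs such as "/repo" or "owner/".
    if not repo_base or not repo_name_only:
        raise ValueError(
            f"repo_name must look like 'owner/repo', got {repo_name!r}"
        )
    return repo_base, repo_name_only
```

With this shape, an empty or malformed repo_name is rejected before any image URI is built, and the error message names the offending value.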
🧹 Nitpick comments (1)
nemo_skills/dataset/swe-bench-pro/prepare.py (1)
45-73: Allow an explicit output path to avoid writing into package directories.
When this script is run from an installed package, Path(__file__).parent may be read-only. Adding an --output_file option keeps the default behavior but avoids permission failures.
🛠️ Proposed tweak
         parser.add_argument(
             "--dataset_name",
             type=str,
             default="ScaleAI/SWE-bench_Pro",
             help="Dataset name to load",
         )
    +    parser.add_argument(
    +        "--output_file",
    +        type=Path,
    +        default=None,
    +        help="Path to write JSONL. Defaults to <script_dir>/<setup>.jsonl.",
    +    )
         args = parser.parse_args()
    @@
    -    output_file = Path(__file__).parent / f"{args.setup}.jsonl"
    +    output_file = (
    +        Path(args.output_file)
    +        if args.output_file is not None
    +        else Path(__file__).parent / f"{args.setup}.jsonl"
    +    )
Signed-off-by: Nikolai Ludwig <nliudvig@nvidia.com>