
SWE-bench

Organization for maintaining the SWE-bench/agent projects

This organization contains the source code for SWE-bench, a benchmark for evaluating AI systems on real-world GitHub issues.
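As a sketch of typical usage, the benchmark's evaluation harness is run against a file of model-generated patch predictions. The dataset name, predictions path, and run ID below are illustrative values; see the SWE-bench repository's README for the authoritative flags.

```shell
# Sketch: evaluate a predictions file with the SWE-bench harness
# (requires Docker; values below are placeholders, not canonical).
python -m swebench.harness.run_evaluation \
    --dataset_name princeton-nlp/SWE-bench_Lite \
    --predictions_path ./preds.json \
    --max_workers 8 \
    --run_id my-eval-run
```

The harness builds per-instance containers, applies each predicted patch, and reports which issues were resolved.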

Use the repositories in this organization to...

Also check out these related organizations:

  • SWE-bench-repos: Mirror clones of the repositories used for SWE-bench-style evaluations.
  • SWE-agent: Solve GitHub issues automatically with a Language Model powered agent!


Repositories

  • SWE-bench Public

    SWE-bench [Multimodal]: Can Language Models Resolve Real-World GitHub Issues?

    Python · MIT · 2,853 stars · 485 forks · 36 issues · 7 PRs · Updated Apr 22, 2025
  • sb-cli Public

    Run SWE-bench evaluations remotely

    Python · MIT · 10 stars · 0 forks · 3 issues · 0 PRs · Updated Apr 18, 2025
  • swe-bench.github.io Public

    Landing page + leaderboard for the SWE-bench benchmark

    HTML · 4 stars · 6 forks · 2 issues · 2 PRs · Updated Mar 31, 2025
  • experiments Public

    Open-sourced predictions, execution logs, trajectories, and results from model inference and evaluation runs on the SWE-bench task.

    Shell · 168 stars · 173 forks · 6 issues · 13 PRs · Updated Mar 31, 2025
  • .github Public
    0 stars · 0 forks · 0 issues · 0 PRs · Updated Feb 25, 2025
  • humanevalfix-results Public archive

    Evaluation data + results for SWE-agent inference on HumanEvalFix task

    Jupyter Notebook · 0 stars · 0 forks · 0 issues · 0 PRs · Updated Jul 11, 2024