How can the result be reformed into the one for SWE-bench evaluation? #75

Open
27yw opened this issue Nov 11, 2024 · 2 comments
Comments

@27yw
27yw commented Nov 11, 2024

Wonderful work!
I notice that SWE-bench evaluation requires the following files:

eval.sh: The evaluation script
patch.diff: The model's generated prediction
report.json: Summary of evaluation outcomes for this instance
run_instance.log: A log of SWE-bench evaluation steps
test_output.txt: An output of running eval.sh on patch.diff

But with AutoCodeRover we only get the JSON file and patch.diff.
How can we get test_output.txt?

Thanks a lot!

@27yw 27yw changed the title How can the result form into the one for SWE-bench evaluation? How can the result be reformed into the one for SWE-bench evaluation? Nov 11, 2024
@crhf
Collaborator

crhf commented Nov 11, 2024

Hi! You would first need to convert the JSON into JSONL (with a simple Python script, for example), then evaluate the JSONL with SWE-bench's containerized evaluation. You will then find these files in SWE-bench/logs/.
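
The conversion step mentioned above could look roughly like the sketch below. It is only an illustration: the thread does not show the layout of the JSON file, so the code assumes it is either a list of prediction objects or a dict whose values are prediction objects, and `json_to_jsonl` is a hypothetical helper name.

```python
import json
from pathlib import Path


def json_to_jsonl(src: str, dst: str) -> int:
    """Write each prediction found in `src` (JSON) as one line of `dst` (JSONL).

    Returns the number of records written.
    """
    data = json.loads(Path(src).read_text(encoding="utf-8"))
    # Assumption: the file holds either a list of prediction objects or a
    # dict mapping some key (e.g. an instance id) to a prediction object.
    records = list(data.values()) if isinstance(data, dict) else list(data)
    with open(dst, "w", encoding="utf-8") as out:
        for record in records:
            # json.dumps emits no raw newlines, so each record stays on one line.
            out.write(json.dumps(record) + "\n")
    return len(records)
```

Calling `json_to_jsonl("predictions_for_swebench.json", "predictions_for_swebench.jsonl")` would then produce the file to pass to the evaluation harness.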

@minhnhatle104

Hi @crhf, when I run AutoCodeRover on SWE-bench Lite (using the Docker image), I get a file named predictions_for_swebench.json.

Do you mean taking this file, converting it to JSONL, and then evaluating it with SWE-bench's containerized evaluation?
For example:

python -m swebench.harness.run_evaluation \
    --dataset_name princeton-nlp/SWE-bench_Lite \
    --predictions_path predictions_for_swebench.jsonl \
    --max_workers 1 \
    --run_id evaluation

So the --predictions_path argument would be predictions_for_swebench.jsonl. Is that correct?
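
Before launching a run like the one above, a quick sanity check of the converted file may catch formatting problems early. The sketch below is hedged: it assumes each line should be a JSON object with the keys `instance_id`, `model_name_or_path`, and `model_patch`, which is an assumption about SWE-bench's prediction format and is not confirmed in this thread. `check_predictions` is a hypothetical helper name.

```python
import json

# Assumption: these are the keys the harness expects in every prediction.
REQUIRED_KEYS = {"instance_id", "model_name_or_path", "model_patch"}


def check_predictions(path):
    """Return (line_number, problem) pairs for lines that look malformed."""
    problems = []
    with open(path, encoding="utf-8") as f:
        for n, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((n, f"invalid JSON: {exc.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((n, "not a JSON object"))
                continue
            missing = REQUIRED_KEYS - record.keys()
            if missing:
                problems.append((n, "missing keys: " + ", ".join(sorted(missing))))
    return problems
```

An empty list from `check_predictions("predictions_for_swebench.jsonl")` would suggest the file is at least structurally ready to pass as --predictions_path.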
