Release BigCodeBench v0.2.3.post1

@terryyz released this 01 Feb 04:21

What's Changed

  • Fix Docker image and its dependencies
  • Support more models with reasoning effort
  • Optional chat prefilling
  • E2B, Gradio, and Local code execution

Evaluated LLMs (173 models)

  • o3-mini
  • DeepSeek R1

Full Changelog: v0.2.1.post7...v0.2.3.post1