@helloml0326 commented Jan 15, 2026

OpenJudge Version

[The version of OpenJudge you are working on, e.g. import openjudge; print(openjudge.__version__)]

Description

[Please describe the background, purpose, changes made, and how to test this PR]

Checklist

Please check the following items before code is ready to be reviewed.

  • Code has been formatted with pre-commit run --all-files command
  • All tests are passing
  • Docstrings are in Google style
  • Related documentation has been updated (e.g. links, examples, etc.)
  • Code is ready for review

…uation

- Add ToolCallSequenceMatchSimpleGrader supporting precision/recall metrics
- Support flexible matching with/without arguments
- Add comprehensive test suite with 21 test cases
- Update documentation in overview.md and agent_graders.md
- Emphasize ToolCallSequenceMatchGrader for multi-step complex scenarios
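
For reviewers who want a concrete picture of the precision/recall idea in the commit message above, here is a minimal sketch. It is not the grader's implementation: the function name `tool_call_precision_recall`, the `match_arguments` flag, the dict layout of a tool call (`name`, `arguments`), the order-insensitive multiset matching, and the 0.0 convention for empty inputs are all assumptions made for illustration.

```python
import json
from collections import Counter
from typing import Any, Dict, List, Tuple

ToolCall = Dict[str, Any]  # hypothetical shape: {"name": str, "arguments": dict}


def _call_key(call: ToolCall, match_arguments: bool) -> Tuple[str, Any]:
    """Build a hashable key for one tool call."""
    if not match_arguments:
        # Loose mode: only the tool name has to agree.
        return (call["name"], None)
    # Strict mode: name and arguments must agree; sort keys so that
    # argument order inside the dict does not matter.
    return (call["name"], json.dumps(call.get("arguments", {}), sort_keys=True))


def tool_call_precision_recall(
    predicted: List[ToolCall],
    expected: List[ToolCall],
    match_arguments: bool = True,
) -> Tuple[float, float]:
    """Return (precision, recall) using multiset matching of tool calls."""
    pred = Counter(_call_key(c, match_arguments) for c in predicted)
    exp = Counter(_call_key(c, match_arguments) for c in expected)
    # Counter intersection keeps the minimum count per key, so a repeated
    # predicted call can only match as many times as it is expected.
    matched = sum((pred & exp).values())
    precision = matched / sum(pred.values()) if pred else 0.0  # assumed convention
    recall = matched / sum(exp.values()) if exp else 0.0  # assumed convention
    return precision, recall


expected_calls = [
    {"name": "search", "arguments": {"q": "weather"}},
    {"name": "get_time", "arguments": {"tz": "UTC"}},
]
predicted_calls = [
    {"name": "search", "arguments": {"q": "weather"}},
    {"name": "get_time", "arguments": {"tz": "PST"}},
    {"name": "search", "arguments": {"q": "news"}},
]

strict = tool_call_precision_recall(predicted_calls, expected_calls, match_arguments=True)
loose = tool_call_precision_recall(predicted_calls, expected_calls, match_arguments=False)
```

In strict mode only the first `search` call matches, so one of three predicted calls and one of two expected calls count. In loose mode the `get_time` call also matches by name, which raises both metrics.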
RECALL = "recall"


class ToolCallSequenceMatchSimpleGrader(BaseGrader):
@weizhang25 commented Jan 15, 2026

How about class ToolCallSequenceExpectationGrader?
"Simple" is not needed in the class name if it only describes the simple nature of this grader.
It does not help people understand what the grader does.

@helloml0326 changed the title from "Feature/add ToolCallSequenceMatchSimpleGrader" to "[WIP]Feature/add ToolCallSequenceMatchSimpleGrader" Jan 16, 2026
2 participants