Aktsvigun

Documentation added for the model_handler part of the BFCL benchmark

@Aktsvigun Aktsvigun mentioned this pull request Jun 17, 2025
@HuanzhiMao HuanzhiMao added the BFCL-General General BFCL Issue label Jun 17, 2025
super().__init__(model_name, temperature)
self.model_style = ModelStyle.Anthropic
self.client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

- def decode_ast(self, result, language="Python"):
+ def decode_ast(self, result: str, language: str="Python") -> list[dict[str, dict]]:
Collaborator

@HuanzhiMao HuanzhiMao Jun 17, 2025

result is not guaranteed to be a string. In prompting mode it is indeed a string, but in FC mode it is a list of dictionaries, as you can infer from lines 58-60 (cited below).

for invoked_function in result:
    name = list(invoked_function.keys())[0]
    params = json.loads(invoked_function[name])

The same issue applies to decode_execute.
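
A minimal sketch of how the annotation could cover both modes. The alias ModelResult and the helper normalize_fc_result are hypothetical names that do not come from the PR, and the FC-mode shape is assumed from the loop quoted above (each item maps one function name to a JSON-encoded argument string).

```python
import json
from typing import Union

# Hypothetical alias: prompting mode yields a str, FC mode yields a list of
# single-key dicts mapping a function name to a JSON-encoded argument string.
ModelResult = Union[str, list[dict[str, str]]]


def normalize_fc_result(result: ModelResult) -> list[dict[str, dict]]:
    """Illustrative helper: decode an FC-mode result into [{name: params}]."""
    if isinstance(result, str):
        # Prompting-mode output needs its own parser, which is out of scope here.
        raise TypeError("expected FC-mode output (a list of dicts), got str")
    decoded = []
    for invoked_function in result:
        name = list(invoked_function.keys())[0]
        params = json.loads(invoked_function[name])
        decoded.append({name: params})
    return decoded
```

With an alias like this, decode_ast and decode_execute could both annotate result as ModelResult instead of str.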

Pre-processes test data for function calling queries.

Args:
inference_data (dict): Inference data to process
Collaborator

The current description of inference_data isn’t quite right.

  • Purpose of the function – It first pre-processes the given test_entry before sending it to the model, e.g.

    • converts the raw user message into the format the model expects
    • extracts the system prompt, if one exists
    • performs any other required data transformations
  • Role of inference_data – It’s simply a shared context dictionary passed between handler methods. This function only initialises that dict with a few default keys; it does not populate it with the processed dataset fields.

  • Where the real work happens – All dataset-specific processing is done in-place on the test_entry dict. Nothing is currently copied into inference_data.
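
The split described above could look roughly like the following sketch. The function name pre_query_processing and the keys "message", "question" and "system_prompt" are assumptions made for illustration; they are not taken from the PR or confirmed against the actual handler code.

```python
def pre_query_processing(inference_data: dict, test_entry: dict) -> dict:
    """Illustrative only: initialise the shared context, transform test_entry in place."""
    # inference_data is just a shared context dict; give it a default key
    # ("message" is an assumed name).
    inference_data["message"] = []

    # Dataset-specific work happens in place on test_entry. As an example,
    # pull a leading system message out of the first turn, if one exists.
    first_turn = test_entry["question"][0]
    if first_turn and first_turn[0]["role"] == "system":
        test_entry["system_prompt"] = first_turn.pop(0)["content"]

    # Nothing from test_entry is copied into inference_data.
    return inference_data
```

Under these assumptions, a docstring for the parameter would describe inference_data as the context dict being initialised, and test_entry as the object that is actually processed.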

@HuanzhiMao
Collaborator

Hey @Aktsvigun,

One thing I'm thinking: Because each concrete handler exposes the same methods with identical parameters and behavior, it might be cleaner to keep the detailed documentation in BaseHandler rather than duplicating it everywhere. IntelliSense should be able to surface the parent docstring when we hover over the overridden methods, so developers won’t lose any context. For any extra helper functions unique to a given handler, we can certainly document those in place.

What do you think? Does that approach sound reasonable to you?
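
For reference, a small sketch of the same idea at runtime: Python's inspect.getdoc falls back to the parent class's docstring when an overriding method has none. The BaseHandler here is a simplified stand-in with an illustrative docstring, and ExampleHandler is a hypothetical concrete handler; neither reflects the real BFCL classes.

```python
import inspect


class BaseHandler:
    """Simplified stand-in for the shared base class."""

    def decode_ast(self, result, language="Python"):
        """Convert raw model output into a list of {function_name: params} dicts."""
        raise NotImplementedError


class ExampleHandler(BaseHandler):  # hypothetical concrete handler
    def decode_ast(self, result, language="Python"):
        # No docstring here on purpose: the documentation lives in BaseHandler.
        return []


handler = ExampleHandler()
# The override itself carries no docstring, but inspect.getdoc walks the
# class hierarchy and returns the one defined on BaseHandler.decode_ast.
inherited_doc = inspect.getdoc(handler.decode_ast)
print(inherited_doc)
```

Editor tooling may or may not resolve docstrings the same way, so this only demonstrates the standard-library behaviour.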

@Aktsvigun
Author

Agreed, I didn't know they share the exact same methods! I'll relaunch. Thank you for your comments here.
