Documentation added for model handler #1067

Aktsvigun · 2025-06-17T06:05:19Z

Documentation added for the model_handler part of the BFCL benchmark

HuanzhiMao · 2025-06-17T06:50:28Z

berkeley-function-call-leaderboard/bfcl_eval/model_handler/api_inference/claude.py

        super().__init__(model_name, temperature)
        self.model_style = ModelStyle.Anthropic
        self.client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

-    def decode_ast(self, result, language="Python"):
+    def decode_ast(self, result: str, language: str="Python") -> list[dict[str, dict]]:


result is not guaranteed to be a string. When in prompting mode, it is indeed a string. However, when in FC mode, it would be a list of dictionaries; you can infer from lines 58-60 (cited below).

for invoked_function in result: name = list(invoked_function.keys())[0] params = json.loads(invoked_function[name])

Same issue in decode_execute

HuanzhiMao · 2025-06-17T07:23:23Z

berkeley-function-call-leaderboard/bfcl_eval/model_handler/api_inference/claude.py

+        Pre-processes test data for function calling queries.
+
+        Args:
+            inference_data (dict): Inference data to process


The current description of inference_data isn’t quite right.

Purpose of the function – It first pre-processes the given test_entry before sending it to the model, e.g.

converts the raw user message into the format the model expects

extracts the system prompt, if one exists

performs any other required data transformations

Role of inference_data – It’s simply a shared context dictionary passed between handler methods. This function only initialises that dict with a few default keys; it does not populate it with the processed dataset fields.

Where the real work happens – All dataset-specific processing is done in-place on the test_entry dict. Nothing is currently copied into inference_data.

HuanzhiMao · 2025-06-17T07:30:51Z

Hey @Aktsvigun ,

One thing I'm thinking: Because each concrete handler exposes the same methods with identical parameters and behavior, it might be cleaner to keep the detailed documentation in BaseHandler rather than duplicating it everywhere. IntelliSense should be able to surface the parent docstring when we hover over the overridden methods, so developers won’t lose any context. For any extra helper functions unique to a given handler, we can certainly document those in place.

What do you think? Does that approach sound reasonable to you?

Aktsvigun · 2025-06-17T08:37:35Z

Agree, I didn't know they share the same exact methods! I'll relaunch, thank you for your comments here.

Documentation added for model handler

ae0f728

Aktsvigun mentioned this pull request Jun 17, 2025

Documentation added #1051

Closed

HuanzhiMao added the BFCL-General General BFCL Issue label Jun 17, 2025

HuanzhiMao reviewed Jun 17, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Documentation added for model handler #1067

Documentation added for model handler #1067

Uh oh!

Aktsvigun commented Jun 17, 2025

Uh oh!

HuanzhiMao Jun 17, 2025 •

edited

Loading

Uh oh!

HuanzhiMao Jun 17, 2025

Uh oh!

HuanzhiMao commented Jun 17, 2025

Uh oh!

Aktsvigun commented Jun 17, 2025

Uh oh!

Uh oh!

Documentation added for model handler #1067

Are you sure you want to change the base?

Documentation added for model handler #1067

Uh oh!

Conversation

Aktsvigun commented Jun 17, 2025

Uh oh!

HuanzhiMao Jun 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

HuanzhiMao Jun 17, 2025

Choose a reason for hiding this comment

Uh oh!

HuanzhiMao commented Jun 17, 2025

Uh oh!

Aktsvigun commented Jun 17, 2025

Uh oh!

Uh oh!

HuanzhiMao Jun 17, 2025 •

edited

Loading