Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RequestMessage with tool_calls in Assistant message. #793

Open
Jeadie opened this issue Sep 25, 2024 · 4 comments
Open

RequestMessage with tool_calls in Assistant message. #793

Jeadie opened this issue Sep 25, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@Jeadie
Copy link
Contributor

Jeadie commented Sep 25, 2024

Describe the bug

  • To follow OpenAI spec (and what other open source models, e.g. llama3.1), expect for Assistant messages with tool calls, messages should look like this
  "messages": [
    {
      "role": "assistant",
      "content": "",
     "tool_calls": [
        {
          "id": "call-243240c8",
          "type": "function",
          "function": {
            "name": "current_weather",
            "arguments": "{\"county\": \"somewhere\"}"
          }
        }
      ]
    }
]

In mistral.rs, we JSON string the full "tool_calls" value, e.g.

messages.push(IndexMap::from([
("role".to_string(), Either::Left("asistant".to_string())),
(
"content".to_string(),
Either::Left(
json!({
"name": called.function.name,
"parameters": called.function.arguments,
})
.to_string(),
),
),
]));

In fact, MessageContent only supports String values.

This restricts the ability to feed the result of tool calls back to various models

Also value should be "arguments" and not "parameters" (parameters is used in JSONSchema for tools key.

@Jeadie Jeadie added the bug Something isn't working label Sep 25, 2024
@EricLBuehler
Copy link
Owner

Hi @Jeadie! Thanks for raising the issue. I was wondering if you could just modify the code like:

messages.push(IndexMap::from([ 
     ("role".to_string(), Either::Left("asistant".to_string())), 
     ("content".to_string(), Either::Left("".to_string())), 
     ( 
         "tool_calls".to_string(), 
         Either::Left( 
             json!({ 
                 "name": called.function.name, 
                 "arguments": called.function.arguments, 
             }) 
             .to_string(), 
         ), 
     ), 
 ]));

I think I'm a bit confused on the limitation here, as any JSON objects would need to be stringified eventually. Perhaps the above would work?

@Jeadie
Copy link
Contributor Author

Jeadie commented Sep 30, 2024

The issue is that certain common open-source Jinja chat_templates expect a non-serialised JSON object for tool_calls fields when converting RequestMessages to an unstructured string.

For example meta-llama/Llama-3.1-8B-Instruct has a chat_template, with the following

{%- if not message.tool_calls|length == 1 %}
    {{- raise_exception(\"This model only supports single tool-calls at once!\") }}
{%- endif %}

This is not checking the length of the JSON string is 1, but rather that the JSON array has one entry.

@Jeadie
Copy link
Contributor Author

Jeadie commented Oct 4, 2024

Initial comments were incorrect, the example needs to be.

tool_calls = messages.push(IndexMap::from([ 
     ("role".to_string(), Either::Left("asistant".to_string())), 
     ("content".to_string(), Either::Left("".to_string())), 
     ( 
         "tool_calls".to_string(), 
         Either::Right(Vec![
             IndexMap::from([ 
                 ("Id".to_string(), tool_call.id),
                 ("function".to_string(), json!({ 
                     "name": called.function.name, 
                     "arguments": called.function.arguments, 
                 }),
                 ("type".to_string(), "function".to_string()"),
             ])
        ])
     ), 
 ]));

The key difference is to use Either::Right instead of Either::Left for the "tool_calls" key. This resolves the issue mentioned above.

However, this still does not resolve all issues found in trying to use the chat_template from meta-llama/Llama-3.1-8B-Instruct. The next issue in the chat template.

  {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' -}}
  {{- '{"name": "' + tool_call.name + '", ' }}
  {{- '"parameters": ' }}
  {{- tool_call.arguments | tojson }}
  {{- "}" }}

This part of the chat_template is attempting access fields within the JSON value of tool_calls[*].function. This fails in the above example, because we serialise it:

json!({ 
    "name": called.function.name, 
    "arguments": called.function.arguments, 
})

This is a direct limitation of MessageContent.

pub type MessageContent = Either<String, Vec<IndexMap<String, String>>>;

The final IndexMap<String, String> requires the elements of each tool_calls to have String values. From the example above, these are

IndexMap::from([ 
    ("Id".to_string(), tool_call.id),
    ("function".to_string(), json!({ 
        "name": called.function.name, 
        "arguments": called.function.arguments, 
    }),
    ("type".to_string(), "function".to_string()"),
])

@Jeadie
Copy link
Contributor Author

Jeadie commented Oct 4, 2024

The impact of this issue is that mistral.rs can call tools, but cannot take the result of tools as inputs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants