Please help me test tool-use (function calling) #514
Comments
@agzam -- in case you're interested.
Oh wow, this is very cool. So many interesting ideas to try. I'm excited, will give it a go. Thank you!
Excellent. Will give it a go.
Wow, this works great. I just created tools for catting files, doing ls, and creating new files. I had trouble with OpenAI calling the tools, but Claude Sonnet works fine. I even had Claude write a couple of tools, then eval'd the emacs lisp blocks inside the org mode buffer, and had Claude immediately start using them.
Holy bootstrap, Batman!
Did it throw an error or just ignore the tools? If it was a silent failure you can check the log buffer (see `gptel-log-level`).
It seems to be calling the tools but fails at the end. It also failed with gemini, llama, and qwen for me but I have to double check my tools because for simpler use cases I think it was working a while ago. This same prompt works fine with the Claude models sonnet and haiku.
Here is the *Messages* buffer and attached is the log
[gptel-tool-use-log-openai.txt](https://github.com/user-attachments/files/18225771/gptel-tool-use-log-openai.txt)
```
Querying OpenAI...
gptel: moving from INIT to WAIT
gptel: moving from WAIT to TYPE
gptel: moving from TYPE to TOOL
error in process sentinel: let: Wrong type argument: stringp, nil
error in process sentinel: Wrong type argument: stringp, nil
```
Strange, that data stream does not look like what I'd expect at all. As a result, the tool call from OpenAI is not being parsed correctly. I'll look into this.
Independent of that, it looks like there's an error in gptel's tool call handler. Could you repeat this experiment but after `M-x toggle-debug-on-error`, and paste the backtrace here?
Update: it seems the Gemini, Llama, and Qwen models work only if I make a request that requires a single tool call. For example, I asked each one to summarize a URL, and separately to do a directory listing on my local machine, and those kinds of interactions work.
Could you try it with this OpenAI backend?

```elisp
(gptel-make-openai "openai-with-parallel-tool-calls"
  :key YOUR_OPENAI_API_KEY
  :stream t
  :models gptel--openai-models
  :request-params '(:parallel_tool_calls t))
```

Parallel tool calls are supposed to be enabled by default, so I'm not expecting that this will work, but it would be wise to verify.
Could you also share the tool definitions you used in this failed request? I'd like to try reproducing the error here.
For ollama, I see the tool being sent to ollama in the gptel log buffer, but none of the models ever actually seem to use the tools. I've tried with Mistral Nemo, Qwen 2.5, Mistral Small, and Llama 3.2 vision.
@ProjectMoon Could you share the tools you wrote so I can try to reproduce these issues?
Just a copy and paste of the example one. I will try again at some point in the coming days with ollama debugging mode turned on to see what reaches the server. Edit: I also need to test with a direct ollama connection. This might be (and probably is) a bug in Open WebUI's proxied ollama API.
I get these types of errors. Attached are my gptel config files: the regular one I use, and a minimal version I made for testing tools using your suggested "openai-with-parallel-tool-calls" backend.
@jester7 Thanks for the tool definitions. I've fixed parallel tool calls for the Claude and OpenAI-compatible APIs. Please update and test both the streaming and non-streaming cases. (You can turn off streaming with `gptel-stream`.) Parallel tool calls with the Gemini and Ollama APIs are still broken. All these APIs validate their inputs differently, and the docs don't contain the validation schema, so adding tool calls is a long crapshoot. Still, we truck on.
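A minimal sketch of turning streaming off, assuming `gptel-stream` is the controlling variable:

```elisp
;; Turn off streaming globally (assuming gptel-stream controls it):
(setq gptel-stream nil)
;; Or only in the current chat buffer:
(setq-local gptel-stream nil)
```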
Update: Parallel tool calls with Gemini work too, but only as long as all function calls involve arguments. Zero-arity functions still fail.
OK, it definitely seems to be more a problem with OpenWebUI's proxied Ollama API... although it was supposedly resolved to be able to pass in structured inputs. I will have to dig into the source code to see if it even does anything with the tools parameter. I was able to make a tool call when connecting directly to the ollama instance using Mistral Nemo. Edit: Yep, it doesn't have the tools param in the API, so it's discarded silently.
Thanks. When we merge this we should add a note to the README about the tool-use incompatibility with OpenWebUI. |
Parallel tool calls now work with Ollama too, but you have to disable streaming. Ollama does not support tool calls with streaming responses.
I've updated the opening post with a status table I'll keep up to date.
So I added the tools parameter to OpenWebUI (it was just adding a single line to the chat completions form class, it seems). Then I get a response back from the proxied ollama API containing the tool call to use. But unlike when connecting directly, gptel seems to do nothing. Looking at the elisp code, the only thing that makes sense is the content from the OWUI response being non-empty, but both OWUI response and the direct connection response have `"content": ""` o_O
Is there any other difference between the OWUI and direct connection in the response log?
Also you can click on the status in the header-line (the text that says "Waiting..." or "Ready...") to see the process state. It will help to know if the :state is recorded as DONE, ERRS or TOOL.
> 3. Same as 2, but suggest ways that the feature can be improved, especially in the UI department.
I'm only recently picking Emacs back up; the last time I regularly used it was before tools like GPT existed. But if I understand correctly: since the tool results aren't echoed to the chat buffer created by `M-x gptel`, and `gptel-send` sends the buffer contents before point, won't any added context that results from a tool call get dropped from the conversation in the next message round?
If that is the case, perhaps it would be nice to provide a way to help people capture tool results to the context tooling provided by gptel? Or maybe have the tool results echoed to the chat buffer (perhaps in a folded block?)
That's a good point. I've had this problem already in an exchange like the following:
Prompt: Summarize the article at <some link> for me
Response: Sure, let me fetch that article first. <tool call happens here>
<summary follows>
Prompt: Why do they believe X (referring to the article here)
Response: Let me fetch the article again <tool call happens here>
<Answer to my question>
So Claude had to fetch the article again after each response. We need some generic way to include the tool results (when relevant) in chat buffers. Of course, tool use isn't always about the results or back-and-forth conversations -- for many tools the tool call output is either meaningless or irrelevant. So we probably also need a way to identify which tool results need to be echoed to buffers.
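One option might be a per-tool flag in the definition, something like this hypothetical `:include` keyword (purely illustrative, nothing like it exists on the branch yet):

```elisp
;; Hypothetical :include flag marking tools whose results should be
;; echoed into the chat buffer after the call.
(gptel-make-tool
 :name "read_url"
 :description "Fetch the contents of a URL"
 :args (list '(:name "url" :type string :description "The URL to read"))
 :function (lambda (url)
             (with-current-buffer (url-retrieve-synchronously url)
               (prog1 (buffer-string) (kill-buffer))))
 :include t)  ; hypothetical: echo this tool's result to the buffer
```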
@ProjectMoon This can happen if you have streaming turned on, since Ollama does not support tool calls with streaming responses. (I will eventually handle this internally, where streaming is automatically disabled for requests that include tools.)
I've added tool selection support to gptel's transient interface. Selecting a category (like "filesystem" or "emacs" here) will toggle all the tools in that category. Tool selection can be done globally, buffer-locally, or for the next request only, using the Scope option. This makes it much more convenient to select the right set of tools for the task at hand. (LLMs get confused if you include a whole bunch of irrelevant tools.) I've also updated the opening post above with the tool definitions you see in the above image. You can grab them from there and evaluate them. I'm not sure yet if gptel should include any tools by default.
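If you'd rather set this from elisp, a rough sketch of what the buffer-local scope corresponds to -- assuming the selection is backed by a `gptel-tools` variable and that registered tools can be looked up by name with `gptel-get-tool` (both of which are assumptions about this branch):

```elisp
;; Assumed: gptel-tools holds the currently selected tools, and
;; gptel-get-tool looks up a registered tool by name.
(setq-local gptel-tools
            (list (gptel-get-tool "read_url")
                  (gptel-get-tool "list_directory")))
```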
And here is a demo of using the filesystem toolset to make the LLM do something that's otherwise annoying to do:
gptel-tool-use-filesystem-demo.mp4
@karthink finally got around to testing a bunch of this. I wasn't able to replicate the feature to see the internal state via clicking the header line, but the gptel log at info level shows that it responds with a tool call. The trick was forcing non-streaming on the client side, in my gptel configuration. Forcing streaming on the server didn't help, but setting stream to nil when running against a modified version of Open-WebUI allowed it to work. A regular version of Open-WebUI does not work at all because of the missing tools option in the API payload (though it seems to be a one-line PR, so I may submit a change to them).
You might need to update gptel to get the introspection feature, I've been pushing changes to this branch frequently.
Only just got around to testing this. Couple of thoughts I had:
See #484.
Yes, both of these were planned at the start, and have been implemented locally. They can be specified both per tool (in the definition) and per call (in the transient menu). I'll push them to this branch eventually. That said, updates will be slow again as I'm now out of time to work on gptel.
@meain Do you know how getting LLMs to edit files works? Is the LLM given the file and asked to generate a diff, or generate the new version of the file, or generate only a changed region? The one tool that seems very useful that I don't know how to write is the edit-file action. |
TLDR: depends on the model. I've mostly seen aider-style diff formats being used, but for some models we might have to ask it to generate the full file. The edit formats page in aider's docs might be worth looking into. Aider mostly uses the "whole" or "diff" format, IIUC. Some models like 4o-mini do not work well with the diff format. As for a generic option for edits, I've seen packages (cline, for example) try asking the model to produce search/replace blocks.
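To make that concrete, here's a rough sketch of what a search/replace-style edit tool could look like via `gptel-make-tool` -- illustrative only, not something that exists on the branch:

```elisp
;; Sketch of an edit tool using the search/replace approach: the LLM
;; supplies the exact old text and its replacement, and we substitute
;; the first occurrence in the file.
(gptel-make-tool
 :name "edit_file"
 :description "Edit a file by replacing an exact string"
 :args (list '(:name "path" :type string :description "Path to the file")
             '(:name "old-string" :type string
               :description "Exact text to replace")
             '(:name "new-string" :type string
               :description "Replacement text"))
 :function (lambda (path old-string new-string)
             (with-current-buffer (find-file-noselect path)
               (goto-char (point-min))
               (if (search-forward old-string nil t)
                   (progn (replace-match new-string t t)
                          (save-buffer)
                          "Edit applied")
                 "Error: old-string not found in file")))
 :category "filesystem")
```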
Note
Current status of tool-use:
I've added tool-use/function calling support to all major backends in gptel -- OpenAI-compatible, Claude, Ollama and Gemini.
Demos
screencast_20241222T075329.mp4
gptel-tool-use-filesystem-demo.mp4
Call to action
Please help me test it! It's on the `feature-tool-use` branch. There are multiple ways in which you can help, ranked from least to most intensive:

1. Switch to the `feature-tool-use` branch and just use gptel as normal -- no messing around with tool use. Adding tool use required a significant amount of reworking in gptel's core, so it will help to catch any regressions first. (Remember to reinstall/re-byte-compile the package after switching branches!)
2. Switch to the branch, define a tool or two, and try using gptel (instructions below). Let me know if something breaks.
3. Same as 2, but suggest ways that the feature can be improved, especially in the UI department.
What is "tool use"?
"Tool use" or "function calling" is LLM usage where
You can use this to give the LLM awareness of the world, by providing access to APIs, your filesystem, web search, Emacs etc. You can get it to control your Emacs frame, for instance.
How do I enable it in gptel?
There are three steps:
1. Use a model that supports tool use. Most of the big OpenAI/Anthropic/Google models do, as do llama3.1 and the newer mistral models if you're using Ollama.
2. `(setq gptel-use-tools t)`
3. Write tool definitions. See the documentation of `gptel-make-tool`. Here is an example of a tool definition:

Tool definition example
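A minimal sketch of what such a tool definition can look like, assuming the `gptel-make-tool` keyword API (`:name`, `:function`, `:description`, `:args`, `:category`); the body below is a placeholder rather than a real weather lookup:

```elisp
;; Minimal sketch of a tool definition. The body is a placeholder --
;; a real get_weather tool would query a weather API.
(gptel-make-tool
 :name "get_weather"                  ; the name the LLM calls the tool by
 :function (lambda (location)         ; receives the LLM-supplied arguments
             (format "The weather in %s: sunny, 22C" location))
 :description "Get the current weather for a location"
 :args (list '(:name "location"
               :type string
               :description "The city and country, e.g. Paris, France"))
 :category "web")
```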
And here are a few simple tools for Filesystem/Emacs/Web access. You can copy and evaluate them in your Emacs session:
Code:
Some tool definitions, copy to your Emacs
An async tool to fetch youtube metadata using yt-dlp
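For the async case, here's a rough sketch assuming gptel's async-tool convention, where a tool defined with `:async t` receives a callback as its first argument and reports its result by calling it (the `yt-dlp` invocation is illustrative):

```elisp
;; Sketch of an async tool: with :async t, the function gets a callback
;; as its first argument and calls it with the result when ready.
(gptel-make-tool
 :name "youtube_metadata"
 :async t
 :function
 (lambda (callback url)
   ;; Run yt-dlp asynchronously and hand its output to the callback.
   (let* ((buf (generate-new-buffer " *yt-dlp*"))
          (proc (start-process "yt-dlp" buf "yt-dlp" "--skip-download"
                               "--print" "title" "--print" "description"
                               url)))
     (set-process-sentinel
      proc
      (lambda (process _status)
        (with-current-buffer (process-buffer process)
          (funcall callback (buffer-string))
          (kill-buffer (current-buffer)))))))
 :description "Fetch the title and description of a youtube video"
 :args (list '(:name "url"
               :type string
               :description "The youtube video URL"))
 :category "web")
```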
As seen in gptel's menu:
See the documentation for `gptel-make-tool` for details on the keyword arguments.
Tip
@jester7 points out that you can get the LLM to write these tool definitions for you, and eval the Org Babel blocks to use them right away.
Important
Please share tools you write below so I can use them to test for issues.
In this case, the LLM may choose to ask for a call to `get_weather` if your question is related to the weather, as in the above demo video. You can help it along by saying something like "What's the weather like in Paris right now?"
Notes
- Tool calls can be asynchronous; see the documentation of `gptel-make-tool` for an example.
- A tool function can return `nil` if you want to run it for side-effects only.