Add GuardRails for Tool input and output #990
Comments
Thanks for reporting! cc @cescoffier |
I like the idea. I think it would require dedicated interfaces as the parameters and behavior would be slightly different. For the output one, we need to design the resilience patterns we want. You already mentioned retry, but I'm wondering if we need to call the method again or ask the LLM to recompute the tool execution request. |
I would opt for the LLM to recompute (retry), with the option to provide a message (like "tool output contained customer email address, make sure to not use this tool to divulge private information" or whatever you want to check for). Unless I'm overlooking something, I think the logic to call the method again (with the same parameters) could be handled within the tool itself if the output isn't satisfactory. There is a related discussion on tool calling going on in the langchain4j core repo: langchain4j/langchain4j#1997 |
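To make the "recompute with a message" idea concrete, here is a minimal sketch of what an output check could look like. The names `ToolOutputCheck`, `ToolOutputGuardrail`, and `NoEmailInToolOutput` are hypothetical and are not part of Quarkus LangChain4j or LangChain4j; the email pattern is deliberately simple and only illustrative.

```java
import java.util.regex.Pattern;

/** Hypothetical result type: the output either passes, or the LLM should retry with a hint. */
record ToolOutputCheck(boolean passed, String hintForLlm) {
    static ToolOutputCheck ok() {
        return new ToolOutputCheck(true, null);
    }

    static ToolOutputCheck retryWithHint(String hint) {
        return new ToolOutputCheck(false, hint);
    }
}

/** Hypothetical interface for a guardrail that inspects a tool's output (assumption, not an existing API). */
interface ToolOutputGuardrail {
    ToolOutputCheck validate(String toolName, String toolOutput);
}

/** Rejects any tool output that appears to contain an email address. */
final class NoEmailInToolOutput implements ToolOutputGuardrail {
    // Simplified pattern for illustration; real email detection needs more care.
    private static final Pattern EMAIL = Pattern.compile("[\\w.+-]+@[\\w-]+(\\.[\\w-]+)+");

    @Override
    public ToolOutputCheck validate(String toolName, String toolOutput) {
        if (EMAIL.matcher(toolOutput).find()) {
            // The hint would be sent back to the LLM instead of the raw tool output.
            return ToolOutputCheck.retryWithHint(
                    "Tool output contained customer email address, make sure to not use "
                            + "this tool to divulge private information");
        }
        return ToolOutputCheck.ok();
    }
}
```

With this shape, the caller decides what to do with a failed check: in the approach described above it would forward `hintForLlm` to the model and let it issue a new tool execution request, rather than calling the tool method again itself.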
So, right now we have the following sequence of messages:
When the tool execution fails, what do we have?
The question is about where to insert the guardrails:
|
As far as I understand, guardrails act on inputs and outputs to/from AI Service?
I guess cases like this (wrong tool name, wrong tool parameter name or type, etc) should be handled by
In this case we should probably implement some "max retries" mechanism to make sure smaller LLMs don't go into an endless loop. |
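A "max retries" mechanism could be sketched roughly as follows. `BoundedToolRetry` and both function parameters are hypothetical stand-ins: `attempt` represents "ask the LLM for a new tool execution request and run it", and `check` represents whatever validation decides the result is unacceptable. Nothing here reflects an existing API.

```java
import java.util.function.Function;

/** Hypothetical helper that caps how often the LLM may be asked to retry a tool call. */
final class BoundedToolRetry {

    private BoundedToolRetry() {
    }

    /**
     * @param attempt    receives the hint from the previous failed check (null on the
     *                   first call) and returns the new tool output
     * @param check      returns null when the output is acceptable, otherwise a hint
     *                   to pass to the next attempt
     * @param maxRetries number of retries allowed after the first attempt
     * @return the first output that passes the check
     * @throws IllegalStateException if no attempt passes within the retry budget
     */
    static String run(Function<String, String> attempt,
                      Function<String, String> check,
                      int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        String hint = null;
        // One initial attempt plus at most maxRetries retries.
        for (int i = 0; i <= maxRetries; i++) {
            String output = attempt.apply(hint);
            hint = check.apply(output);
            if (hint == null) {
                return output;
            }
        }
        throw new IllegalStateException(
                "Tool call still failing after " + maxRetries + " retries: " + hint);
    }
}
```

Failing with an exception once the budget is exhausted is only one option; returning a fallback message to the user would be another, and which is better probably depends on the application.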
We also need to distinguish between different types of issues here:
|
Thanks @langchain4j ! That is exactly what I was looking for!
A pre-tools guardrail could handle this and decide what to do.
Yes, a pre-tools guardrail can handle this case.
We could imagine having a post-tools guardrail that can update the message and "guide" the LLM
Yes, a post-tools guardrail can handle this. |
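For the pre-tools side, a rough sketch of a guardrail that catches a wrong tool name or an unexpected parameter name before the tool is invoked might look like this. `ToolRequest`, `ToolInputGuardrail`, and `KnownToolAndParameters` are hypothetical names chosen for illustration, since pre- and post-tools guardrails are only a concept at this point in the discussion.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical view of what the LLM asked for: a tool name plus named arguments. */
record ToolRequest(String toolName, Map<String, Object> arguments) {
}

/** Hypothetical pre-tools guardrail (assumption, not an existing API). */
interface ToolInputGuardrail {
    /** Empty means "go ahead"; otherwise the message to send back to the LLM instead of invoking the tool. */
    Optional<String> validate(ToolRequest request);
}

/** Checks that the requested tool exists and that every argument name is one it declares. */
final class KnownToolAndParameters implements ToolInputGuardrail {
    private final Map<String, Set<String>> allowedParametersByTool;

    KnownToolAndParameters(Map<String, Set<String>> allowedParametersByTool) {
        this.allowedParametersByTool = allowedParametersByTool;
    }

    @Override
    public Optional<String> validate(ToolRequest request) {
        Set<String> allowed = allowedParametersByTool.get(request.toolName());
        if (allowed == null) {
            // Sorted so the message is stable regardless of map iteration order.
            return Optional.of("Unknown tool '" + request.toolName() + "'. Available tools: "
                    + new TreeSet<>(allowedParametersByTool.keySet()));
        }
        for (String name : request.arguments().keySet()) {
            if (!allowed.contains(name)) {
                return Optional.of("Tool '" + request.toolName() + "' has no parameter '"
                        + name + "'. Expected parameters: " + new TreeSet<>(allowed));
            }
        }
        return Optional.empty();
    }
}
```

The returned message plays the "guide the LLM" role mentioned above: rather than failing outright, the caller can hand it back to the model so that it corrects the tool execution request.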
@cescoffier are there pre- and post-tools guardrails already? Or is this just a concept?
If we go this way, this should be an out-of-the-box guardrail that users could just use without the need to implement it themselves |
It's just a concept for now. As I modify how Quarkus invokes tools, I can easily implement it, except maybe in the virtual thread case.
Yes, or having a default strategy or disabling it when guardrails are used. It's not clear what's best for now. |
The GuardRails are really awesome! It would be nice if we could also have them available to perform a check before executing a tool for these reasons:
It would also be nice to have a GuardRail option for the Tool output, e.g., to have a final check that no private user info is divulged.