# Forcing `response_format` to `json` #983
I think it does.

cc @langchain4j

@andreadimaio do you want to work on this?

Yes, I'll open a new PR.

🙏🏽
The implementation is a little more complex than what I have in mind, for the simple reason that OpenAI also has another […]. If the […]

I'm not an expert on the OpenAI APIs, but I think that if […]
@andreadimaio it works like this in vanilla LC4j. If a schema can be passed, we do not append extra instructions.

But the schema is now supported only by OpenAI and Gemini.
There's something that's not clear to me. Looking at the `DefaultAiServices.java` class, there are these lines:

```java
Response<AiMessage> response;
if (supportsJsonSchema && jsonSchema.isPresent()) {
    ChatRequest chatRequest = ChatRequest.builder()
            .messages(messages)
            .toolSpecifications(toolSpecifications)
            .responseFormat(ResponseFormat.builder()
                    .type(JSON)
                    .jsonSchema(jsonSchema.get())
                    .build())
            .build();

    ChatResponse chatResponse = context.chatModel.chat(chatRequest);
    response = new Response<>(
            chatResponse.aiMessage(),
            chatResponse.tokenUsage(),
            chatResponse.finishReason());
} else {
    // TODO migrate to new API
    response = toolSpecifications == null
            ? context.chatModel.generate(messages)
            : context.chatModel.generate(messages, toolSpecifications);
}
```

The `chat` method is used only in the JSON-schema branch. Another note is about the default implementation of the `chat` method: it has all the parameters to call the `generate` methods.
Or your idea is to use […]?
I was planning to add another Capability for `json_object`. It should be easy, as we know which providers support JSON mode. Regarding the default implementation of the `chat` method, you're right, it should call the `generate` methods. I actually implemented it this way initially, but then rolled it back because I had some doubts about it. This is work in progress; I plan to get back to this new API soon. Eventually the `generate` methods will be deprecated and providers will need to implement only one method: `chat`.
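For illustration, a minimal sketch of what such a default `chat` implementation delegating to the `generate` methods could look like. This is only a sketch based on the `ChatRequest`/`ChatResponse` snippet quoted above; the accessor and builder names are assumptions, not the final API:

```java
// Hedged sketch: a default chat(...) on ChatLanguageModel that delegates to
// the existing generate(...) methods. Accessor/builder names are assumed from
// the DefaultAiServices snippet quoted earlier in this thread.
default ChatResponse chat(ChatRequest chatRequest) {
    Response<AiMessage> response = chatRequest.toolSpecifications() == null
            ? generate(chatRequest.messages())
            : generate(chatRequest.messages(), chatRequest.toolSpecifications());
    return ChatResponse.builder()
            .aiMessage(response.content())
            .tokenUsage(response.tokenUsage())
            .finishReason(response.finishReason())
            .build();
}
```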
The `chat` method is used only when the JSON capability is present because I had to rush this new chat API in order to enable structured outputs. Otherwise there was no way to pass the schema. WIP...
Thank you! @geoand what do you suggest to do regarding the implementation of this functionality in `quarkus-langchain4j`?

You can go ahead and do that here, and when the feature lands in LangChain4j we can utilize it.
I've been thinking about it and I am considering using tools (function calling) instead of JSON mode when the return type is a POJO and the Structured Outputs feature is not supported (e.g. when the LLM provider is not OpenAI or Gemini).

This is how it can work:

```java
if (isStructuredOutputType(methodReturnType)) { // e.g. POJO, enum, List<T>/Set<T>, etc.
    if (chatModel.supportedCapabilities().contains(RESPONSE_FORMAT_JSON_SCHEMA)) {
        // Proceed with generating a JSON schema and passing it to the model using the structured outputs feature.
        // This will work for OpenAI and Gemini.
    } else if (chatModel.supportedCapabilities().contains(TOOLS)) {
        // Create a synthetic tool "answer" and generate a JSON schema for it.
        if (configuredTools.isEmpty()) {
            // The "answer" is the only tool, so we will *force* the model to call this tool
            // using the tool_mode LLM parameter (will be available in the new ChatModel API).
        } else {
            // There are other tools that the user has configured. It means that the LLM could/should
            // use one or multiple of them before providing the final answer.
            // I am not sure yet what is the best solution in this case. For example, we could add
            // "final_answer" to the list of tools and hope that the LLM will use it to provide the answer.
            // We could also append a hint to the prompt (e.g. "Use the final_answer tool to provide a final answer").
            // Or we could call the LLM in a loop (if the LLM decides to call tools) until it returns a final
            // answer in plain text, and then call it again with only the "answer" tool available and force
            // it to call it with the tool_mode parameter.
            // There can be multiple strategies and we could make this configurable for the user.
            // Please note that this is probably a pretty rare use case (when the user needs both structured outputs and tools).
        }
    } else {
        // Fallback to appending "You must answer strictly in the following format..." to the prompt.
    }
}
```

WDYT?
We can also make "what to use to get structured output from the LLM" a configurable strategy that the user can specify explicitly (e.g. […]).
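As an illustration of what such an explicit setting could look like, here is a hypothetical strategy enum; the name and values are invented for this sketch and do not exist in LangChain4j:

```java
// Hypothetical user-facing switch for the structured-output strategy.
// All names here are invented for illustration.
enum StructuredOutputStrategy {
    JSON_SCHEMA, // provider-native structured outputs (e.g. OpenAI, Gemini)
    TOOLS,       // synthetic "answer" tool forced via tool calling
    JSON_MODE,   // response_format = json_object
    PROMPT       // append format instructions to the prompt
}
```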
I am not entirely sure how the tools would enforce valid JSON generation from the LLM in this context. Is the primary role of the tools to generate the schema, or are they used to handle response formatting after the model has generated the output?

I think we need to be cautious about the tools' functionality. Some model providers, like Ollama, support tools, but not for all the hosted models. In these cases, the […]

Regarding the JSON mode, I think that combined with the […]

I agree. Having this as a configurable option is ideal from my perspective. However, we should be careful with […]

If the provider returns an error when trying to use a model that does not support tools, the consideration I made about using […]

When the LLM supports tools, you can provide a JSON schema and the LLM will generate a valid JSON that follows the schema (in like 95% of cases, depending on the complexity of the schema). LC4j generates a JSON schema from a […]
Good point, this is why we should make this behavior configurable.
I agree that JSON mode works pretty well, but tools are more reliable than JSON mode by design. The JSON mode feature just "guarantees" (95% of the time) that the returned text is valid JSON. One can provide a JSON schema in free form in the prompt, but there is no guarantee that the LLM will follow it. Tools, on the other hand, "guarantee" (again, 95%) that the returned text is not only valid JSON, but also follows the specified schema. And in this case the schema is specified in a standardized way (as a separate LLM request parameter) and not appended as free-form text to the user message. Since tool-calling LLMs are tuned to follow the schema, and there is only a single way to specify it, this is more reliable than appending the schema as free-form text to the user prompt.

Good point! I guess this concern applies mostly to Ollama, as all other LLM providers that support tools usually support them for all their models (at least I see this trend lately). Ollama throws an error in case tools are not supported by a specific model: […]
In part, I want to understand the actual use of tools to solve this problem. I have something in mind, but I don't know if we're on the same page. Suppose I have an LLM that needs to extract some user info, and this is the output POJO:

```java
record User(String firstName, String lastName) {}
```

Your idea is to have a tool method like this to generate the correct JSON?

```java
@Tool("Generates a response in the required JSON format.")
public User answer(String firstName, String lastName) {
    return new User(firstName, lastName);
}
```
👍
@andreadimaio no, the idea is to automatically create

```java
ToolSpecification.builder()
        .name("answer")
        .addParameter("firstName")
        .addParameter("lastName")
        .build()
```

under the hood of the AI Service, inject it into the request to the LLM, and force it to use this tool. In this case the user does not have to do anything; the LLM will be forced to reply by calling the `answer` tool.
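A rough sketch of how forcing that synthetic tool could look in the new request API; `toolChoice`/`REQUIRED` below is a placeholder for the `tool_mode` parameter mentioned earlier, so treat the exact names as assumptions:

```java
// Hedged sketch: force the model to answer by calling the synthetic tool.
// toolChoice(REQUIRED) stands in for the tool_mode parameter discussed above.
ChatRequest chatRequest = ChatRequest.builder()
        .messages(messages)
        .toolSpecifications(answerToolSpecification) // the generated "answer" tool
        // .toolChoice(REQUIRED)                     // hypothetical: force a tool call
        .build();
ChatResponse chatResponse = chatModel.chat(chatRequest);
// The structured result would then be parsed from the forced tool call's arguments.
```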
The Python version of LC actually uses tool calling as the primary way to get structured outputs: https://python.langchain.com/docs/how_to/structured_output/#the-with_structured_output-method

In this case tools are kind of "misused" for returning structured output.

Yes, it was just an example; of course everything will be automatic. So we are on the same page :)
Just to document this explicitly, here is the order (from best to worst) of strategies to get structured outputs in AI services (this is not implemented yet, just a plan):

1. Structured Outputs (a JSON schema passed via `response_format`), where supported (currently OpenAI and Gemini)
2. Tool/function calling with a synthetic `answer` tool
3. JSON mode (`json_object`)
4. Appending format instructions ("You must answer strictly in the following format...") to the prompt

The user should be able to override this logic and explicitly specify which strategy to use.
The `ChatLanguageModel` interface provides a new method that can be implemented to force the use of `response_format` to `json` when an AiService method returns a POJO. This is something that can be done automatically by Quarkus.

This should be a simple change to the `AiServiceMethodImplementationSupport` class, but all current providers will need to be updated to manage this new method. Does this make sense?
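For illustration, a minimal sketch of the kind of check described above, reusing the capability mechanism discussed earlier in the thread; everything apart from `supportedCapabilities()`/`Capability` is assumed wiring, not the actual `AiServiceMethodImplementationSupport` code:

```java
// Hedged sketch: only force a JSON response format when the model declares
// support for it; otherwise fall back to prompt-based format instructions.
boolean supportsJsonSchema = chatModel.supportedCapabilities()
        .contains(Capability.RESPONSE_FORMAT_JSON_SCHEMA);
if (returnTypeIsPojo && supportsJsonSchema) {
    // build a ChatRequest with ResponseFormat.builder().type(JSON).jsonSchema(...)
} else {
    // append "You must answer strictly in the following format..." to the prompt
}
```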