Improve tool type support #1047

Tarjei400 · 2024-11-04T20:12:53Z

Added support for mapping to complex types in ai service methods.
Not sure if I didn't miss anything, but I due to interfaces on langchain4j side, it seems like we have to have our own version for both json schema and parser methods.

As an addition, I noticed there is a @description annotation for adding more context to llm about individual fields
and added it to description node of json schema, for llm to better understand how to output data.

@description

…d data from @description annotations to json schema, for llm to better understand how to output data.

Tarjei400 · 2024-11-04T20:15:33Z

#671

geoand · 2024-11-05T06:43:41Z

This is pretty useful indeed!

But given that all this operates on java.lang.reflect.Type, would it not make sense to be part of upstream LangChain4j?

Tarjei400 · 2024-11-05T08:49:12Z

@geoand I’m not sure. My thought was that there might always be cases requiring special handling, beyond what upstream supports or offers, and an extension could have its own strategy to address those.

On another note, I was trying to add support for Kotlin nullable types to be treated as optional. However, I couldn’t find an easy way to extract that information from reflection on the Java side. Without kotlin-reflect, I assume this isn't possible, whether it’s on Quarkus or upstream. Am I correct?

geoand · 2024-11-05T09:19:33Z

@geoand I’m not sure. My thought was that there might always be cases requiring special handling, beyond what upstream supports or offers, and an extension could have its own strategy to address those.

That is certainly true, but your changes in this case seem to be applicable to the library as well, no?

On another note, I was trying to add support for Kotlin nullable types to be treated as optional. However, I couldn’t find an easy way to extract that information from reflection on the Java side. Without kotlin-reflect, I assume this isn't possible, whether it’s on Quarkus or upstream. Am I correct?

I don't really remember this one to TBH, but I am pretty sure your assessment is correct

Tarjei400 · 2024-11-05T09:32:56Z

That is certainly true, but your changes in this case seem to be applicable to the library as well, no?

Yes, this could be added to the langchain4j upstream library. However, since Codec is being overridden anyway and I got the impression that we wanted to extend QuarkusServiceOutputParser, I assumed this was the correct approach. That said, it might be worth coordinating with upstream on this.

On the other hand, this opens up more possibilities for allowing users to define their own OutputParsers. The next step, I’d guess, would be adjusting the deployment to enable users to register their own output parsers, complementing the current parser mechanism.

geoand · 2024-11-05T09:36:31Z

Yes, this could be added to the langchain4j upstream library. However, since Codec is being overridden anyway and I got the impression that we wanted to extend QuarkusServiceOutputParser, I assumed this was the correct approach. That said, it might be worth coordinating with upstream on this.

Right, I have the feeling that this specific change is warranted for upsteeam.

On the other hand, this opens up more possibilities for allowing users to define their own OutputParsers. The next step, I’d guess, would be adjusting the deployment to enable users to register their own output parsers, complementing the current parser mechanism.

Right, both things can be true :)

Tarjei400 · 2024-11-05T10:42:31Z

@geoand I fixed the test, I will leave this pr opened until it is decided what to do with it then.

geoand · 2024-11-05T10:53:28Z

Thanks!

@langchain4j WDYT about these changed being upstream?

langchain4j · 2024-11-05T11:08:23Z

Yes, I wanted to handle optionality for quite a while now, would be great to have it in upstream!

Tarjei400 · 2024-11-05T12:01:00Z

@langchain4j @geoand I have couple of issues on my end here still, but in spare time I will try to prepare a pr to upstream then!

langchain4j · 2024-11-05T12:20:03Z

@Tarjei400 great, thank you in advance!

Considering that current structured output implementations in Quarkus extension and upstream is different, what will be the scope of the PR? As far as I understand only support for Optional* types in POJOs (when AI service returns a POJO).
For both text instructions and JsonSchema (supported only by OpenAI and Gemini for now)?

geoand · 2024-11-05T12:26:44Z

🙏

Tarjei400 · 2024-11-05T14:01:48Z

@langchain4j Quarkus is using this as codec for mapping :

  public interface JsonCodec {
    String toJson(Object var1);

    <T> T fromJson(String var1, Class<T> var2);

    InputStream toInputStream(Object var1, Class<?> var2) throws IOException;
  }

I see it is deprecated but not sure if favor for what though. But ideally instead of Class<?> it would support Type.

On the other hand, what I tried to achieve is to make mapping to POJO from response (Including cases when returnType is a collection or nested collections of objects), but into instruction passing constructed JsonSchema based on class definition - I guess those 2 parts we could fit into upstream? I overloaded QuarkusServiceOutputParser.parse and QuarkusServiceOutputParser.outputFormatInstructions for this just in case.

langchain4j · 2024-11-05T14:08:46Z

@Tarjei400 Please ignore deprecation for now.
Please note that I have added <T> T fromJson(String json, Type type); in #1873.

On the other hand, what I tried to achieve is to make mapping to POJO from response (Including cases when returnType is a collection or nested collections of objects), but into instruction passing constructed JsonSchema based on class definition - I guess those 2 parts we could fit into upstream?

Not sure I follow, please elaborate. BTW there is also langchain4j/langchain4j#1938 in progress which seems to be overlapping with what you describe?

Tarjei400 · 2024-11-05T14:25:34Z

@langchain4j Yes this pr seems to tackle same problem, as with one difference structures like List<List< POJO > > should be also supported I think ? I mean jackson is capable to resolve such cases given json and JavaType

BTW there is also langchain4j/langchain4j#1938 in progress which seems to be overlapping with what you describe?

Oh, so most likely we should bump version quarkus then to cover for that.

langchain4j · 2024-11-05T14:52:30Z

@Tarjei400 those changes are not released yet (planning to release next week).

In theory, List<List<POJO>> should work as well, but I did not check it. Jackson can convert it for sure, but json schema should also support it. Please note that there are currently 2 ways how structured outputs work in upstream:

Using JsonSchema (supported by OpenAI and Gemini)
Appending instructions in plain text to the prompt (works always)

That PR addresses only fist case

On a side note, Optionals are not supported at all, so I guess it would be a minimum scope for the PR.

Tarjei400 · 2024-11-05T15:04:27Z

@langchain4j I will try to branch out from that PR to see what I can do about optional then. As for this pr @geoand I think it would be best to wait for upstream to release, bump the version and see what we can do from there.

geoand · 2024-11-05T15:29:08Z

Makes sense to me!

geoand · 2024-11-28T08:17:07Z

@langchain4j I will try to branch out from that PR to see what I can do about optional then. As for this pr @geoand I think it would be best to wait for upstream to release, bump the version and see what we can do from there.

We have since bumped to 0.36 :)

Added support mapping to to complex types in ai service methods. Adde…

bd88ad9

…d data from @description annotations to json schema, for llm to better understand how to output data.

Tarjei400 requested a review from a team as a code owner November 4, 2024 20:12

Tarjei400 marked this pull request as draft November 4, 2024 21:16

Tarjei400 force-pushed the main branch from d3303c9 to 0e7d0e6 Compare November 4, 2024 22:24

geoand changed the title ~~Added support mapping to to complex types in ai service methods. Added data from @Description annotations to json schema, for llm to better understand how to output data.~~ Improve tool support Nov 5, 2024

geoand changed the title ~~Improve tool support~~ Improve tool type support Nov 5, 2024

Tarjei400 force-pushed the main branch from 0e7d0e6 to ab6121f Compare November 5, 2024 10:41

Tarjei400 marked this pull request as ready for review November 5, 2024 10:41

Tarjei400 force-pushed the main branch from ab6121f to 0f5e158 Compare November 5, 2024 10:45

Add handling for Optional types

24a45f0

Tarjei400 force-pushed the main branch from 0f5e158 to 24a45f0 Compare November 5, 2024 16:26

Tarjei400 mentioned this pull request Nov 5, 2024

Trying to use token-window memory type with easy rag results in exception #1036

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve tool type support #1047

Improve tool type support #1047

Tarjei400 commented Nov 4, 2024 •

edited

Loading

Tarjei400 commented Nov 4, 2024

geoand commented Nov 5, 2024

Tarjei400 commented Nov 5, 2024

geoand commented Nov 5, 2024 •

edited

Loading

Tarjei400 commented Nov 5, 2024

geoand commented Nov 5, 2024

Tarjei400 commented Nov 5, 2024

geoand commented Nov 5, 2024

langchain4j commented Nov 5, 2024

Tarjei400 commented Nov 5, 2024

langchain4j commented Nov 5, 2024

geoand commented Nov 5, 2024

Tarjei400 commented Nov 5, 2024 •

edited

Loading

langchain4j commented Nov 5, 2024

Tarjei400 commented Nov 5, 2024 •

edited

Loading

langchain4j commented Nov 5, 2024

Tarjei400 commented Nov 5, 2024 •

edited

Loading

geoand commented Nov 5, 2024

geoand commented Nov 28, 2024

Improve tool type support #1047

Are you sure you want to change the base?

Improve tool type support #1047

Conversation

Tarjei400 commented Nov 4, 2024 • edited Loading

Tarjei400 commented Nov 4, 2024

geoand commented Nov 5, 2024

Tarjei400 commented Nov 5, 2024

geoand commented Nov 5, 2024 • edited Loading

Tarjei400 commented Nov 5, 2024

geoand commented Nov 5, 2024

Tarjei400 commented Nov 5, 2024

geoand commented Nov 5, 2024

langchain4j commented Nov 5, 2024

Tarjei400 commented Nov 5, 2024

langchain4j commented Nov 5, 2024

geoand commented Nov 5, 2024

Tarjei400 commented Nov 5, 2024 • edited Loading

langchain4j commented Nov 5, 2024

Tarjei400 commented Nov 5, 2024 • edited Loading

langchain4j commented Nov 5, 2024

Tarjei400 commented Nov 5, 2024 • edited Loading

geoand commented Nov 5, 2024

geoand commented Nov 28, 2024

Tarjei400 commented Nov 4, 2024 •

edited

Loading

geoand commented Nov 5, 2024 •

edited

Loading

Tarjei400 commented Nov 5, 2024 •

edited

Loading

Tarjei400 commented Nov 5, 2024 •

edited

Loading

Tarjei400 commented Nov 5, 2024 •

edited

Loading