Continuous decoding feature #858
Replies: 6 comments 1 reply
-
Could you please share more details on this requirement? Do you want the generator object to serve multiple inputs instead of one? If so, we are adding it.
-
@yufenglee Yes, if possible, I would like to modify the input of the generator before the next prediction without having to recreate the generator for every input modification. Would that be possible with what is currently being developed?
-
I'm guessing the idea is that if you're having a chat with the LLM, you want to append the conversation so far to the input and add a new question.
-
@WA225 and @elephantpanda, yes, we are working on the continuous decoding feature, which will allow you to do this. The feature will be available in the next release.
-
Modified the title to continuous decoding.
-
We are actively working on continuous decoding and a PR is coming soon.
-
Describe the bug
I am wondering if there is an API (C or Python, preferably Python) that allows us to modify the generator input without needing to recreate the generator.
I do not see any API that can do that in the documentation, but it would be really helpful to have it.
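To make the request concrete, here is a minimal sketch of what such a continuous-decoding interface could look like. All names here (`ContinuousGenerator`, `append_tokens`, `generate`) are hypothetical illustrations of the pattern, not the actual onnxruntime-genai API; the key idea is that the generator keeps its decoding state alive across turns, so new input tokens can be appended without rebuilding the generator.

```python
# Hypothetical sketch of a continuous-decoding interface.
# Names are illustrative, NOT the real onnxruntime-genai API.

class ContinuousGenerator:
    def __init__(self):
        # Persistent decoding state (stands in for a model's KV cache),
        # reused across every turn instead of being rebuilt per input.
        self.kv_cache = []

    def append_tokens(self, tokens):
        # Extend the existing context in place; the generator itself
        # is not recreated between user turns.
        self.kv_cache.extend(tokens)

    def generate(self, n):
        # Toy "decoding": echoes the last n context tokens and records
        # them back into the cache. A real model would run inference
        # against the cached state here.
        out = self.kv_cache[-n:]
        self.kv_cache.extend(out)
        return out

gen = ContinuousGenerator()
gen.append_tokens([1, 2, 3])   # first user turn
first = gen.generate(2)        # -> [2, 3]
gen.append_tokens([7, 8])      # follow-up turn, same generator object
second = gen.generate(2)       # -> [7, 8]
```

The point of the sketch is the call pattern: `append_tokens` then `generate` can alternate indefinitely on one generator object, which is what the multi-turn chat scenario above needs.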