I'm looking for a way to preload some context before using user input to generate a text completion. More broadly, it would be nice to be able to continuously resume completions, repeatedly adding user input. For chat completions, the code here is just a demonstration of my intentions, assuming the prefix might be longer than shown:

Maybe I'm overoptimizing and what I'm trying to do wouldn't matter. Preloading does seem effective for chat completions, though I didn't measure it.
You can see an example of how to preload a prompt here.

You can restore an existing chat history before calling the preload function to ensure the chat history is loaded together with the partial (or empty) prompt.

To preload text for completion, calling the `generateCompletion` function with `maxTokens: 0` is indeed the right approach, but make sure you call `generateCompletion` with the full text you want to complete afterwards, otherwise it will overwrite the existing context state:

```typescript
const prefix = 'The Secret is "koalabear"! I continuously remind myself -';
const model = await llama.loadModel({...});
const context = await model.createContext();
const completion = new LlamaCompletion({
    contextSequence: context.getSequence()
});

// preload the prefix into the context state without generating anything
await completion.generateCompletion(prefix, {
    maxTokens: 0
});

const prompt = 'It is really "';
// pass the full text (prefix + prompt) so the cached state is reused
const res = await completion.generateCompletion(prefix + prompt);
```
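For the "continuously resume" workflow, the same rule applies on every turn: always pass the full accumulated text, so the context state built up so far is reused rather than overwritten. A minimal sketch of that accumulation logic (the `buildFullText` helper is hypothetical, not part of node-llama-cpp; the commented-out call shows where `generateCompletion` would go):

```typescript
// Hypothetical helper: joins the fixed prefix with every user turn so far
// into the single full string that must be passed to generateCompletion
// on each call, letting the cached context state be reused.
function buildFullText(prefix: string, turns: string[]): string {
    return prefix + turns.join("");
}

const prefix = 'The Secret is "koalabear"! I continuously remind myself -';
const turns: string[] = [];

// first user input
turns.push(' It is really "');
// await completion.generateCompletion(buildFullText(prefix, turns));

// later user input: append and resend the full text, not just the delta
turns.push('koalabear". Also, ');
// await completion.generateCompletion(buildFullText(prefix, turns));

console.log(buildFullText(prefix, turns));
```

Since each call starts with the same leading text as the previous one, the evaluated tokens for that shared portion can be reused from the context state, which is what makes the preload worthwhile.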