
bug: OOM issue - [0.5.10] RAG feature failed with Engine Is Not Loaded Yet error #4207

Open

imtuyethan opened this issue Dec 3, 2024 · 1 comment

Labels: category: experimental features · category: tools (RAG, web search, files, function calling) · type: bug (Something isn't working)


imtuyethan commented Dec 3, 2024

Jan version

0.5.10

Describe the Bug

https://discord.com/channels/1107178041848909847/1313496019475894363

When using the Retrieval (RAG) feature with PDF files, the engine fails to load with error code 322122505 ("Engine Is Not Loaded Yet"). This prevents the PDF analysis functionality from working.

Steps to Reproduce

  1. Set the Embedding Model to Llama 3.2 1B Instruct Q8
  2. Set the Vector Database to HNSWLib
  3. Upload a PDF file (issue persists with files as small as 27.51KB)
  4. Attempt to use RAG/PDF analysis feature

Error Messages

  • "Engine Is Not Loaded Yet"
  • "llama_decode_internal: invalid token"
  • "llama_decode: failed to decode"
  • "Internal error catched Input prompt is too big compared to KV size"

Logs

  • Error code: 322122505 from cortex-server.exe process

Expected behavior

  • Engine should load successfully
  • PDF should be processed for RAG functionality

Screenshots / Logs

Logs
message (4).txt
app (2).log
cortex (2).log

Device specs

  • Machine: x86
  • OS: Windows (based on .exe reference)
  • Memory: 32.00 GB (24.84 GB used)
  • CPU Usage: 18%

What is your OS?

  • MacOS
  • Windows
  • Linux
imtuyethan added the type: bug label on Dec 3, 2024
github-project-automation bot moved this to Investigating in Jan & Cortex on Dec 3, 2024
imtuyethan added the category: tools and category: experimental features labels on Dec 3, 2024
louis-jan (Contributor) commented:

That is an OOM issue, right? @vansangpfiev @nguyenhoangthuan99
ERROR llama_decode_internal: invalid token[0] = 1074917552 - llama_engine.cc:493
ERROR llama_decode: failed to decode, ret = -1 - llama_engine.cc:493
Failed to decode the batch: KV cache is full - try increasing it via the context size: i = 0, n_batch = 2048, ret = -1 - llama_server_context.cc:1653
Internel error catched Input prompt is too big compared to KV size. Please try increasing KV size. - llama_server_context.cc:862
Internel error catched Input prompt is too big compared to KV size. Please try increasing KV size. - llama_server_context.cc:862
ERROR Internel error catched Input prompt is too big compared to KV size. Please try increasing KV size. - llama_server_context.cc:862
ERROR Internel error catched Input prompt is too big compared to KV size. Please try increasing KV size. - llama_server_context.cc:862
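
For context on the log above: in a llama.cpp-based engine, the entire RAG prompt (system prompt + retrieved PDF chunks + user question) has to fit inside the model's context window, i.e. the KV cache, with room left over for the reply. Below is a minimal sketch of that constraint; the names (count_tokens, fits_in_context, trim_chunks) and the 2048-token default are illustrative assumptions, not Jan or cortex APIs.

```python
# Sketch of the check behind "Input prompt is too big compared to KV size".
# All names here are illustrative; they are not part of the Jan or cortex codebases.

def count_tokens(text: str) -> int:
    # Rough heuristic (~4 characters per token); a real check would use the
    # model's own tokenizer.
    return max(1, len(text) // 4)

def fits_in_context(system_prompt: str, chunks: list[str], question: str,
                    ctx_len: int = 2048, max_new_tokens: int = 512) -> bool:
    """Return True if the assembled RAG prompt leaves room for the reply."""
    prompt = "\n\n".join([system_prompt, *chunks, question])
    return count_tokens(prompt) + max_new_tokens <= ctx_len

def trim_chunks(system_prompt: str, chunks: list[str], question: str,
                ctx_len: int = 2048, max_new_tokens: int = 512) -> list[str]:
    """Drop lowest-ranked chunks until the prompt fits (assumes ranked order)."""
    kept = list(chunks)
    while kept and not fits_in_context(system_prompt, kept, question,
                                       ctx_len, max_new_tokens):
        kept.pop()
    return kept
```

Even a small PDF can blow past a small context once it is chunked and wrapped in the prompt template, which matches the "KV cache is full - try increasing it via the context size" line in the log; raising the context size when loading the model or retrieving fewer/smaller chunks are the two levers that message points at.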

louis-jan changed the title from "bug: [0.5.10] RAG feature failed with Engine Is Not Loaded Yet error" to "bug: OOM issue - [0.5.10] RAG feature failed with Engine Is Not Loaded Yet error" on Dec 4, 2024