Why does RAG agent example from Getting Started seem to stop retrieving any context from documents? #995
-
I followed the getting-started guide at https://llama-stack.readthedocs.io/en/latest/getting_started/index.html. When I ran the RAG agent example (https://llama-stack.readthedocs.io/en/latest/getting_started/index.html#your-first-rag-agent), I saw the following behavior.
I found that a workaround for this is removing faiss_store.db.
After that, the script works again, for a number of executions.
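In case it helps others, the workaround looks like this (a sketch: it assumes you run it from the directory holding faiss_store.db and that the llama-stack server is stopped; moving the file instead of deleting it keeps a copy whose checksum you can still inspect):

```shell
# Placeholder file stands in for the real index so this snippet is self-contained;
# against a real deployment, skip the touch and operate on the actual faiss_store.db.
touch faiss_store.db
# Move the index aside rather than deleting it outright, keeping a backup to inspect.
mv faiss_store.db faiss_store.db.bak
```

On the next run, the example recreates a fresh index and retrieval works again until the duplicates build back up.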
Then eventually, after three reasonable answers, it is back to the context-less greetings.
It feels like maybe inference touches the kvstore somehow? (I see the db file's checksum changes, but I'm not sure whether that is just SQLite runtime metadata of no importance to the embeddings.) Does the output make sense? Shouldn't the answers be independent of the number of times the agent code is executed?
Replies: 2 comments
-
Thanks for raising this. It happens because we keep updating the same faiss index (vector db) with the same document chunks. That means eventually, when we ask for context, we get the same chunk again and again, which can completely throw off the model response. #998 fixes this.
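To illustrate the failure mode (this is a toy sketch, not the actual llama-stack API): each execution of the script inserts the same chunks into the persisted index, so after a few runs a top-k query can return k copies of one chunk instead of k distinct ones. One plausible guard, shown here, is deduplicating on a stable chunk id before inserting; the names `ToyIndex`, `insert`, and `doc1-chunk0` are all made up for the example.

```python
# Toy stand-in for a persisted vector index that grows on every script run.
class ToyIndex:
    def __init__(self):
        self.chunks = []       # what accumulates in faiss_store.db across runs
        self.seen_ids = set()  # ids already inserted, used by the dedup guard

    def insert(self, chunk_id, text, dedup=True):
        """Append a chunk; with dedup=True, skip ids we've already stored."""
        if dedup and chunk_id in self.seen_ids:
            return False       # same document chunk as a previous run: skip it
        self.seen_ids.add(chunk_id)
        self.chunks.append((chunk_id, text))
        return True

index = ToyIndex()
for run in range(3):           # simulate three executions of the agent script
    index.insert("doc1-chunk0", "torchtune supports LoRA", dedup=False)

# Without the guard, the index holds three copies of the same chunk,
# so top-3 retrieval returns the identical text three times.
print(len(index.chunks))  # → 3
```

With `dedup=True` the second and third runs are no-ops for already-seen chunks, so retrieval keeps returning distinct context.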
-
Thanks for the fix! Confirmed it resolves the issue.