
[Bug]: prompt_template in Citation query #17517

Open
Aradina25 opened this issue Jan 15, 2025 · 4 comments
Labels
bug, triage

Comments

@Aradina25

Bug Description

I wanted to add a system prompt to my LlamaIndex code along with the citation query engine. None of the documentation explains how to do it. (I am not talking about the citation prompt but the system prompt.)

So I tried the following:

index = VectorStoreIndex(nodes)
index.storage_context.persist(persist_dir="./citation")
custom_citation_qa_template = PromptTemplate(
    f"{prompt}\n"
    "When referencing information from a source, "
    "cite the appropriate source(s) using their corresponding numbers. "
    "Every answer should include at least one source citation. "
    "Only cite a source when you are explicitly referencing it. "
    "For example, cite sources in square brackets like [1][2][3]. "
    "If none of the sources are helpful, indicate that.\n"
    "Query: {query_str}\n"
    "Answer: "
)
query_engine = CitationQueryEngine.from_args(
    index,
    similarity_top_k=3,
    citation_chunk_size=256,
    citation_qa_template=custom_citation_qa_template,
)
result = query_engine.query(query)

This gives completely wrong answers, even for a simple question like "What is the title of the document?"
The code below works instead:

index = VectorStoreIndex(nodes)
index.storage_context.persist(persist_dir="./citation")
query_engine = CitationQueryEngine.from_args(index, similarity_top_k=3, citation_chunk_size=256)
query_text = (
    f"{prompt}. The question is {query}. When referencing information from a source, "
    "you must cite the appropriate source(s) using their corresponding numbers. "
    "Every answer should include at least one source citation. Only cite a source "
    "when you are explicitly referencing it. For example, cite sources in square "
    "brackets like [1][2][3]."
)
result = query_engine.query(query_text)

Version

0.10.52

Steps to Reproduce

prompt= "You are a helpful assistant designed to answer questions based only on the content provided in the document.When answering, refer exclusively to the document's text and avoid introducing any information not present in it. If the document does not contain sufficient information to answer the query, respond by saying 'Question is out of context'"

text = "your text"

document = [Document(text=text)]
node_parser = SentenceSplitter(chunk_size=1024, chunk_overlap=20)
nodes = node_parser.get_nodes_from_documents(document, show_progress=False)
Settings.llm = Ollama(client=client, model=ai_model, request_timeout=360.0, embed_model='local')
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
index = VectorStoreIndex(nodes)
index.storage_context.persist(persist_dir=f"./citation")
 custom_citation_qa_template = PromptTemplate(
                    f"{prompt}\n"
                    "When referencing information from a source, "
                    "cite the appropriate source(s) using their corresponding numbers. "
                    "Every answer should include at least one source citation. "
                    "Only cite a source when you are explicitly referencing it."
                    "For example, cite sources in square brackets like [1][2][3]."
                    "If none of the sources are helpful, indicate that.\n"
                    "Query: {query_str}\n"
                    "Answer: "
)
query_engine = CitationQueryEngine.from_args(index, similarity_top_k=3, citation_chunk_size=256,citation_qa_template=custom_citation_qa_template)
result = query_engine.query("what is the title of the document")

Relevant Logs/Tracebacks


dosubot bot commented Jan 15, 2025

Hey @Aradina25! I'm here to help you with any bugs, questions, or contributions you have. Let's tackle this together!

I found a similar discussion that might be relevant to your issue: "Does CitationQueryEngine need a prompt template when running a .query?" However, it is currently unsolved [1].

To address your issue, it seems like the discrepancy in results might be due to how the PromptTemplate is being used with the CitationQueryEngine. When you embed the prompt directly in the query text, it seems to work correctly, possibly because the prompt is being processed as part of the query itself, which might align better with how the engine interprets the input.

When using a custom citation_qa_template, ensure that the template is correctly formatted and that the placeholders like {query_str} are being replaced with the actual query string. The PromptTemplate should be designed to guide the generation of answers based on provided sources, ensuring that each answer includes at least one source citation [2][3].

If the custom template is not yielding the expected results, you might want to verify that the template's structure aligns with the requirements of the CitationQueryEngine, such as ensuring that the context and query strings are correctly formatted and integrated into the template. Additionally, consider testing with a simpler template to isolate the issue and gradually build up to the desired complexity.
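
As a minimal sketch of that second point (assuming the engine substitutes the retrieved, numbered source chunks for {context_str}, as the default CITATION_QA_TEMPLATE does; system_prompt here stands in for your prompt variable):

from llama_index.core.prompts import PromptTemplate

# Hypothetical system instructions; substitute your own.
system_prompt = "You are a helpful assistant that answers only from the provided document."

# Keep both placeholders: {context_str} is where the engine injects the
# numbered source chunks, so a template without it never shows the
# retrieved sources to the LLM, and {query_str} receives the question.
custom_citation_qa_template = PromptTemplate(
    f"{system_prompt}\n"
    "When referencing information from a source, cite it with its number "
    "in square brackets, e.g. [1][2]. Every answer should include at least "
    "one source citation. If none of the sources are helpful, say so.\n"
    "------\n"
    "{context_str}\n"
    "------\n"
    "Query: {query_str}\n"
    "Answer: "
)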


@logan-markewich
Collaborator

What is the "prompt" variable holding? That's likely having a large influence on the performance?

This is the default: CITATION_QA_TEMPLATE in llama_index/llama-index-core/llama_index/core/query_engine/citation_query_engine.py (line 28 at 3f7e66e).

If you want a system prompt, use a ChatPromptTemplate:

from llama_index.core.llms import ChatMessage
from llama_index.core.prompts import ChatPromptTemplate

template = ChatPromptTemplate(
    message_templates=[
        ChatMessage(role="system", content="some prompt"),
        ChatMessage(role="user", content="some prompt with expected {variables}"),
    ]
)

One example: https://docs.llamaindex.ai/en/stable/examples/customization/prompts/chat_prompts/#2-call-chatprompttemplatefrom_messages
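
The linked page uses ChatPromptTemplate.from_messages; a minimal sketch of that variant, assuming the (role, content) tuple form shown there:

from llama_index.core.prompts import ChatPromptTemplate

# Equivalent to the constructor form above; (role, content) tuples
# are converted to ChatMessage objects.
template = ChatPromptTemplate.from_messages(
    [
        ("system", "some prompt"),
        ("user", "some prompt with expected {variables}"),
    ]
)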

@Aradina25
Author
Aradina25 commented Jan 15, 2025

@logan-markewich (replying to the suggestion above)

But I am still confused about how to use this with the citation query engine. Like this?

query_engine = CitationQueryEngine.from_args(
    index,
    similarity_top_k=3,
    citation_chunk_size=256,
    citation_qa_template=template,
)

@logan-markewich
Collaborator

@Aradina25 yea exactly
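
For completeness, a minimal end-to-end sketch putting the thread's answer together (message contents are illustrative; {context_str} and {query_str} are the placeholders the engine fills in):

from llama_index.core import Document, VectorStoreIndex
from llama_index.core.llms import ChatMessage
from llama_index.core.prompts import ChatPromptTemplate
from llama_index.core.query_engine import CitationQueryEngine

# Assumes LLM/embedding models are configured via Settings, as earlier in the thread.
index = VectorStoreIndex.from_documents([Document(text="your text")])

# The system prompt lives in its own message; the user message keeps the two
# placeholders the engine fills in: {context_str} receives the numbered source
# chunks, {query_str} receives the question.
citation_qa_template = ChatPromptTemplate(
    message_templates=[
        ChatMessage(
            role="system",
            content=(
                "You are a helpful assistant that answers only from the provided "
                "document. Cite sources in square brackets like [1][2]."
            ),
        ),
        ChatMessage(
            role="user",
            content=(
                "Below are several numbered sources of information:\n"
                "------\n{context_str}\n------\n"
                "Query: {query_str}\nAnswer: "
            ),
        ),
    ]
)

query_engine = CitationQueryEngine.from_args(
    index,
    similarity_top_k=3,
    citation_chunk_size=256,
    citation_qa_template=citation_qa_template,
)
result = query_engine.query("what is the title of the document")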
