
Commit 2591ae4

ww2283 and andylizf authored
Document prompt template feature in README (#172)
* Add prompt template feature to README

  Highlights performance optimization with task-specific prompt templates. Includes real-world benchmark data showing EmbeddingGemma 300M achieving a 4-5x speed improvement over Qwen 600M while maintaining identical search quality. Per maintainer request, this feature is promoted in the main README for better discoverability.

* Fix typo: --embedding-prompt-template -> --query-prompt-template

---------

Co-authored-by: Andy Lee <[email protected]>
1 parent 1ca0d3f · commit 2591ae4

File tree

2 files changed (+45, -1 lines)

README.md

Lines changed: 29 additions & 0 deletions
@@ -191,6 +191,35 @@ chat = LeannChat(INDEX_PATH, llm_config={"type": "hf", "model": "Qwen/Qwen3-0.6B
response = chat.ask("How much storage does LEANN save?", top_k=1)
```

## Performance Optimization: Task-Specific Prompt Templates

LEANN now supports prompt templates for task-specific embedding models like Google's EmbeddingGemma. This feature enables **significant performance gains** by using smaller, faster models without sacrificing search quality.

### Real-World Performance

**Benchmark (MacBook M1 Pro, LM Studio):**

- **EmbeddingGemma 300M (QAT)** with templates: **4-5x faster** than Qwen 600M
- **Search quality:** Identical ranking to larger models
- **Use case:** Ideal for real-time workflows (e.g., pre-commit hooks in Claude Code; ~7 min to index all of LEANN's code and doc files on a MacBook M1 Pro)

### Quick Example

```bash
# Build the index with task-specific templates
leann build my-index ./docs \
  --embedding-mode ollama \
  --embedding-model embeddinggemma \
  --embedding-prompt-template "title: none | text: " \
  --query-prompt-template "task: search result | query: "

# Search automatically applies the query template
leann search my-index "How does LEANN optimize vector search?"
```

Templates are automatically persisted and applied during searches (CLI, MCP, and API). No manual configuration is needed after indexing.

See [Configuration Guide](docs/configuration-guide.md#task-specific-prompt-templates) for detailed usage and model recommendations.
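Conceptually, a task-specific template is just a fixed prefix prepended to the text before it is embedded. The sketch below illustrates that behavior only; the functions are hypothetical and not LEANN's internal API, while the template strings are the EmbeddingGemma ones from the example above.

```python
# Illustration only (not LEANN's internals): task-specific templates are
# fixed prefixes prepended to text before it is embedded.
DOC_TEMPLATE = "title: none | text: "
QUERY_TEMPLATE = "task: search result | query: "

def prepare_document(text: str) -> str:
    # Applied at build time to each indexed chunk.
    return DOC_TEMPLATE + text

def prepare_query(query: str) -> str:
    # Applied automatically at search time.
    return QUERY_TEMPLATE + query

print(prepare_query("How does LEANN optimize vector search?"))
# task: search result | query: How does LEANN optimize vector search?
```

Because the prefix is part of what the model embeds, documents and queries land in the task-conditioned regions the model was trained for, which is why mismatched or missing templates degrade retrieval for these models.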
## RAG on Everything!
LEANN supports RAG on various data sources including documents (`.pdf`, `.txt`, `.md`), Apple Mail, Google Search History, WeChat, ChatGPT conversations, Claude conversations, iMessage conversations, and **live data from any platform through MCP (Model Context Protocol) servers** - including Slack, Twitter, and more.

docs/configuration-guide.md

Lines changed: 16 additions & 1 deletion
@@ -185,10 +185,25 @@ leann search my-docs \
  --embedding-prompt-template "task: search result | query: "
```

A full example used to build LEANN's own repo during development:

```bash
source "$LEANN_PATH/.venv/bin/activate" && \
leann build --docs $(git ls-files | grep -Ev '\.(png|jpg|jpeg|gif|yml|yaml|sh|pdf|JPG)$') --embedding-mode openai \
  --embedding-model text-embedding-embeddinggemma-300m-qat \
  --embedding-prompt-template "title: none | text: " \
  --query-prompt-template "task: search result | query: " \
  --embedding-api-key local-dev-key \
  --embedding-api-base http://localhost:1234/v1 \
  --doc-chunk-size 1024 --doc-chunk-overlap 100 \
  --code-chunk-size 1024 --code-chunk-overlap 100 \
  --ast-chunk-size 1024 --ast-chunk-overlap 100 \
  --force --use-ast-chunking --no-compact --no-recompute
```
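To sanity-check which files the `git ls-files | grep -Ev` filter in the command above would include, the same extension regex can be exercised in Python. The sample path list below is made up for illustration:

```python
# Exercise the extension-exclusion regex from the build command above
# against a made-up sample of repository paths.
import re

SKIP = re.compile(r"\.(png|jpg|jpeg|gif|yml|yaml|sh|pdf|JPG)$")
sample = ["src/main.py", "assets/logo.png", "README.md", "scripts/run.sh", "photo.JPG"]
kept = [path for path in sample if not SKIP.search(path)]
print(kept)  # ['src/main.py', 'README.md']
```

Note the regex is case-sensitive, which is why both `jpg` and `JPG` are listed explicitly.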
**Important Notes:**
- **Only use with compatible models**: EmbeddingGemma and similar task-specific models
- **NOT for regular models**: Adding prompts to models like `nomic-embed-text`, `text-embedding-3-small`, or `bge-base-en-v1.5` will corrupt embeddings
- **Template is saved**: Build-time templates are saved to `.meta.json` for reference. If you pass both `--embedding-prompt-template` and `--query-prompt-template` at build time, MCP queries automatically pick up the query template
- **Flexible prompts**: You can use any prompt string, or leave it empty (`""`)
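
As a sketch of the persistence behavior described above, the saved templates can be read back from `.meta.json`. The key names below are assumptions for illustration; inspect your own index's `.meta.json` for the exact fields LEANN writes.

```python
# Sketch: read build-time prompt templates back from an index's .meta.json.
# The key names are assumptions for illustration, not LEANN's documented
# schema; check your own .meta.json for the actual fields.
import json
from pathlib import Path

def load_templates(meta_path: str) -> dict:
    """Return the build-time templates recorded in a .meta.json file."""
    meta = json.loads(Path(meta_path).read_text())
    return {
        "embedding": meta.get("embedding_prompt_template", ""),
        "query": meta.get("query_prompt_template", ""),
    }

# Demo against a stand-in file (a real index would already have one).
Path("demo.meta.json").write_text(json.dumps({
    "embedding_prompt_template": "title: none | text: ",
    "query_prompt_template": "task: search result | query: ",
}))
print(repr(load_templates("demo.meta.json")["query"]))
# 'task: search result | query: '
```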
**Python API:**

0 commit comments
