Local RAG

Tutorial: Using nomic-embed-text with Ollama, Chroma, and deepseek-r1:1.5b Locally

Overview

This tutorial demonstrates how to:

  1. Embed exercise science text with nomic-embed-text (v1.5) via Ollama.
  2. Store embeddings in Chroma, a local vector database.
  3. Query the embeddings and use deepseek-r1:1.5b to generate answers, stripping any <think> sections—all offline.

Sample Exercise Science Text

The dataset:

  1. "Strength training involves resistance exercises designed to increase muscle mass, improve muscular endurance, and enhance overall physical performance. Common methods include weightlifting with barbells or dumbbells, bodyweight exercises like push-ups, and resistance band workouts."
  2. "Cardiovascular exercise, such as running, cycling, or swimming, boosts heart health by improving circulation, increasing lung capacity, and reducing the risk of chronic diseases like hypertension and diabetes."
  3. "Stretching and mobility exercises, including yoga and dynamic warm-ups, enhance joint range of motion, reduce injury risk, and improve posture by counteracting the stiffness caused by sedentary lifestyles."
  4. "Post-exercise recovery is critical for performance gains. Techniques like foam rolling, adequate sleep, and proper nutrition—especially protein intake—help repair muscle fibers and reduce soreness."
  5. "High-Intensity Interval Training (HIIT) alternates short bursts of intense exercise with rest periods, maximizing calorie burn and improving aerobic capacity in less time than traditional steady-state cardio."

Prerequisites

  • System: Python 3.8+ with pip.
  • Ollama: Installed (from ollama.com, version 0.1.26+ recommended).
  • Hardware: Decent CPU (GPU optional).
  • Dependencies:
    pip install ollama chromadb

Step 1: Set Up Ollama

  1. Start Ollama:
    ollama serve
    Runs at http://localhost:11434.
  2. Pull Models:
    • Embeddings:
      ollama pull nomic-embed-text
    • Language model:
      ollama pull deepseek-r1:1.5b
      Note: If deepseek-r1:1.5b isn’t in Ollama’s registry, confirm its exact name or load it via a custom Modelfile.
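
Optional sanity check: the short script below calls both models through the same Python client used in the rest of the tutorial, so it fails loudly if a model is missing or ollama serve isn't running. This is a minimal sketch; the prompt text and printed labels are arbitrary.

import ollama

# Both calls raise an error if the model isn't pulled or the Ollama
# server isn't reachable at http://localhost:11434.
emb = ollama.embed(model="nomic-embed-text", input="search_query: hello")
print("Embedding dimensions:", len(emb["embeddings"][0]))

gen = ollama.generate(model="deepseek-r1:1.5b", prompt="Reply with the word 'ready'.")
print("Sample generation:", gen["response"][:80])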

Step 2: Generate and Store Embeddings

Create generate_embeddings.py:

import ollama
import chromadb

# Exercise science text corpus
documents = [
    "Strength training involves resistance exercises designed to increase muscle mass, improve muscular endurance, and enhance overall physical performance. Common methods include weightlifting with barbells or dumbbells, bodyweight exercises like push-ups, and resistance band workouts.",
    "Cardiovascular exercise, such as running, cycling, or swimming, boosts heart health by improving circulation, increasing lung capacity, and reducing the risk of chronic diseases like hypertension and diabetes.",
    "Stretching and mobility exercises, including yoga and dynamic warm-ups, enhance joint range of motion, reduce injury risk, and improve posture by counteracting the stiffness caused by sedentary lifestyles.",
    "Post-exercise recovery is critical for performance gains. Techniques like foam rolling, adequate sleep, and proper nutrition—especially protein intake—help repair muscle fibers and reduce soreness.",
    "High-Intensity Interval Training (HIIT) alternates short bursts of intense exercise with rest periods, maximizing calorie burn and improving aerobic capacity in less time than traditional steady-state cardio."
]

# Initialize Chroma client
client = chromadb.PersistentClient(path="./exercise_db")

# Create or get a collection
collection = client.get_or_create_collection(name="exercise_science")

# Generate and store embeddings
for i, doc in enumerate(documents):
    prefixed_doc = f"search_document: {doc}"
    response = ollama.embed(model="nomic-embed-text", input=prefixed_doc)
    embedding = response["embeddings"][0]
    
    collection.add(
        ids=[str(i)],
        embeddings=[embedding],
        documents=[doc]
    )

print("Embeddings generated and stored in Chroma!")

Run it:

python generate_embeddings.py
  • Output: "Embeddings generated and stored in Chroma!"
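
To confirm the documents actually landed in the collection (useful for troubleshooting later), a quick check like this reuses only the Chroma calls from the script above:

import chromadb

# Reopen the persisted database and count the stored documents.
client = chromadb.PersistentClient(path="./exercise_db")
collection = client.get_collection(name="exercise_science")
print("Documents stored:", collection.count())  # expect 5 for the sample corpus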

Step 3: Query and Generate Answers (Without <think> Tags)

Create query_embeddings.py with filtering for the <think> section:

import ollama
import chromadb

# Initialize Chroma client
client = chromadb.PersistentClient(path="./exercise_db")

# Get the existing collection
collection = client.get_collection(name="exercise_science")

# Query example
query = "What exercises improve heart health?"
prefixed_query = f"search_query: {query}"
query_response = ollama.embed(model="nomic-embed-text", input=prefixed_query)
query_embedding = query_response["embeddings"][0]

# Search for the top match
results = collection.query(
    query_embeddings=[query_embedding],
    n_results=1
)

# Retrieve the top document
retrieved_doc = results["documents"][0][0]

# Generate answer with deepseek-r1:1.5b
prompt = f"Using this info: '{retrieved_doc}', answer: {query}"
answer = ollama.generate(model="deepseek-r1:1.5b", prompt=prompt)

# Remove <think> section from the response
response_text = answer["response"]
start_tag = "<think>"
end_tag = "</think>"
if start_tag in response_text and end_tag in response_text:
    start_idx = response_text.index(start_tag)
    end_idx = response_text.index(end_tag) + len(end_tag)
    response_text = response_text[:start_idx] + response_text[end_idx:]
response_text = response_text.strip()  # Clean up any extra whitespace

# Print results
print("\nQuery:", query)
print("Retrieved document:", retrieved_doc)
print("Generated answer:", response_text)

Run it:

python query_embeddings.py
  • Expected Output (sample run; exact wording will vary, <think> section removed):
    Query: What exercises improve heart health?
    Retrieved document: Cardiovascular exercise, such as running, cycling, or swimming, boosts heart health by improving circulation, increasing lung capacity, and reducing the risk of chronic diseases like hypertension and diabetes.
    Generated answer: The provided information highlights several cardiovascular exercises that improve heart health by enhancing circulation, lung capacity, and reducing the risk of chronic diseases like hypertension and diabetes. Here is a comprehensive list of exercises that support cardiovascular health:
    
    1. **Running**: Enhances circulation, improves lung capacity, and contributes to overall cardiovascular function.
    2. **Cycling**: Improves blood flow, increases lung capacity, and reduces the likelihood of heart disease and related conditions.
    3. **Swimming**: Boosts cardiovascular efficiency, enhances lung expansion, and mitigates the risk of chronic diseases.
    4. **Yoga and Tai Chi**: These exercises target multiple bodies, improving circulation, strength, flexibility, and overall health without relying solely on exercise alone.
    5. **Freestyle Swimming**: Further increases heart rate and blood flow through efficient stroke technique.
    6. **Push-Ups**: While primarily a strength training exercise, they can improve cardiovascular fitness by enhancing circulation.
    
    These exercises collectively support heart health by addressing circulatory and lung-related systems.
    

Step 4: Why Chroma?

  • Reason: Chroma is lightweight, open-source (Apache 2.0), and local, storing embeddings in ./exercise_db.
  • Pros: Simple, no cloud needed, integrates with Ollama.
  • Alternatives: FAISS (faster for scale) or pgvector (SQL-based), but Chroma suits this setup.

Step 5: Extend (Optional)

  • More Results: Set n_results=3 and combine the returned documents (see the full sketch after this list):
    retrieved_docs = " ".join(results["documents"][0])
    prompt = f"Using this info: '{retrieved_docs}', answer: {query}"
  • Prompt Tuning: Adjust prompt for concise answers, e.g., "Summarize how '{retrieved_doc}' answers: {query}."
  • Custom Queries: Edit query and rerun.
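
Combining the extensions above, a minimal sketch of a multi-document query might look like the following. It reuses the collection and Ollama calls from Step 3; the query string and prompt wording here are just examples:

import ollama
import chromadb

client = chromadb.PersistentClient(path="./exercise_db")
collection = client.get_collection(name="exercise_science")

query = "How should I recover after a hard workout?"  # edit this and rerun
query_embedding = ollama.embed(
    model="nomic-embed-text", input=f"search_query: {query}"
)["embeddings"][0]

# Retrieve the top 3 documents instead of 1 and fold them into one prompt.
results = collection.query(query_embeddings=[query_embedding], n_results=3)
retrieved_docs = " ".join(results["documents"][0])

# Tuned prompt asking for a concise answer grounded in the retrieved context.
prompt = f"Summarize how this info answers the question. Info: '{retrieved_docs}' Question: {query}"
answer = ollama.generate(model="deepseek-r1:1.5b", prompt=prompt)
print(answer["response"])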

Troubleshooting

  • Model Check: Run ollama list to confirm deepseek-r1:1.5b is available. If not, verify its name or load it manually.
  • Ollama Running: Ensure ollama serve is active.
  • Response Format: If <think> tags vary, adjust the filtering logic (e.g., use regex).
  • Empty Collection: Add print(collection.count()). If 0, rerun generate_embeddings.py.

Notes

  • <think> Removal: The script assumes <think> and </think> wrap the reasoning. If the format changes, you might need a more robust parser (e.g., re.sub(r'<think>.*?</think>', '', response_text, flags=re.DOTALL) with import re).
  • Deepseek Behavior: deepseek-r1:1.5b includes reasoning steps by default. If the reasoning output is unwanted, tweak the prompt to discourage it (e.g., "Answer directly without reasoning: ...").
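
Putting both notes together, a sketch of the regex-based cleanup plus the "answer directly" prompt might look like this. There is no guarantee the model drops its reasoning, so the regex still runs as a fallback:

import re
import ollama

query = "What exercises improve heart health?"
# In practice this is the top document returned by the Chroma query in Step 3.
retrieved_doc = "Cardiovascular exercise, such as running, cycling, or swimming, boosts heart health by improving circulation, increasing lung capacity, and reducing the risk of chronic diseases like hypertension and diabetes."

# Prompt tweak: ask for a direct answer to discourage a long reasoning section.
prompt = f"Answer directly without reasoning. Using this info: '{retrieved_doc}', answer: {query}"
answer = ollama.generate(model="deepseek-r1:1.5b", prompt=prompt)

# Regex removal tolerates multi-line or slightly malformed <think> blocks
# better than index slicing (DOTALL lets '.' match newlines).
response_text = re.sub(r"<think>.*?</think>", "", answer["response"], flags=re.DOTALL).strip()
print(response_text)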
