---
parent: How-To
nav_order: 1
---

# Generating Embeddings

Embeddings power features that rely on vector similarity, such as the vector search endpoint and the similar resources carousel. They are generated and stored in our vector database, [Qdrant](https://qdrant.tech/). Celery tasks auto-generate embeddings for new content, and management commands are available to manually generate and store embeddings.
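
For context, a similarity lookup against Qdrant boils down to a nearest-neighbor query. Below is a minimal sketch using the `qdrant-client` package; the URL, collection name, and query vector are placeholders, and in practice the query vector would be produced by the configured encoder:

```python
from qdrant_client import QdrantClient

# Connect to a Qdrant instance (host/key would come from the settings below).
client = QdrantClient(url="http://localhost:6333")

# Find the five stored points whose embeddings are closest to a query vector.
# "resources" is a placeholder collection name, and the 3-dim vector is a toy
# stand-in for a real embedding.
hits = client.search(
    collection_name="resources",
    query_vector=[0.1, 0.2, 0.3],
    limit=5,
)
for hit in hits:
    print(hit.id, hit.score)
```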

## Settings

The following embeddings-related settings are available in `settings.py`:

| Setting                     | Description                                                                                                          |
| --------------------------- | -------------------------------------------------------------------------------------------------------------------- |
| QDRANT_API_KEY              | API key for authenticating with the Qdrant server                                                                     |
| QDRANT_HOST                 | Qdrant host address                                                                                                   |
| QDRANT_BASE_COLLECTION_NAME | Collection name prefix to use                                                                                         |
| QDRANT_DENSE_MODEL          | Main (dense) embedding model to use                                                                                   |
| QDRANT_SPARSE_MODEL         | Sparse model to use for embeddings                                                                                    |
| QDRANT_CHUNK_SIZE           | Chunk size to use when batching embedding tasks                                                                       |
| QDRANT_ENCODER              | Encoder class to use for embeddings, from `vector_search/encoders`                                                    |
| LITELLM_TOKEN_ENCODING_NAME | Token encoding to use when chunking or calculating token sizes. If undefined, defaults to the encoding tiktoken returns for the model |
| LITELLM_CUSTOM_PROVIDER     | Set to 'ollama' by default. Only takes effect when LITELLM_API_BASE is also defined                                   |
| LITELLM_API_BASE            | Required to enable ollama as the litellm provider                                                                     |
| OPENAI_API_KEY              | Required if using "LiteLLMEncoder" with OpenAI as the provider (the default)                                          |
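
For example, a default OpenAI-backed setup might look like this in your `.env` file (the host, key placeholders, and model name here are illustrative, not project defaults):

```
QDRANT_HOST=http://qdrant:6333
QDRANT_API_KEY=<your qdrant api key>
QDRANT_ENCODER=vector_search.encoders.litellm.LiteLLMEncoder
QDRANT_DENSE_MODEL=text-embedding-3-small
OPENAI_API_KEY=<your openai api key>
```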

## Embeddings for New Content

Embeddings are automatically generated for new content by a periodic celery task. The tasks are defined in `vector_search/tasks.py`.
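
For orientation, this is how a periodic celery task is generally wired up. Everything in this sketch (app name, task name, schedule) is illustrative and not taken from `vector_search/tasks.py`:

```python
from celery import Celery
from celery.schedules import crontab

app = Celery("vector_search_example")  # illustrative app, not the project's

@app.task(name="vector_search_example.embed_new_content")
def embed_new_content():
    """Illustrative stand-in for the real embedding tasks."""
    ...

# Register the task on a schedule for celery beat; the actual cadence
# used by the project may differ.
app.conf.beat_schedule = {
    "embed-new-content": {
        "task": "vector_search_example.embed_new_content",
        "schedule": crontab(hour=3, minute=0),  # nightly at 03:00
    },
}
```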

## Embeddings for Existing Content

The following management command is available from the vector_search app to generate and store embeddings:

```bash
python manage.py generate_embeddings --help
```

The command above generates embeddings for all content files and learning resources in the database and stores them in Qdrant. Embeddings are generated in batches to prevent memory issues. The command also supports a `--recreate-collections` flag, which deletes all existing embeddings and collections before regenerating them, and a `--skip-contentfiles` flag for cases where only learning resource embeddings need to be regenerated.
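
For example, the flags can be combined as needed:

```bash
# Embed everything, keeping existing collections
python manage.py generate_embeddings --all

# Rebuild the collections from scratch, skipping content files
python manage.py generate_embeddings --all --recreate-collections --skip-contentfiles
```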

## Ollama for (Hardware Accelerated) Embedding

[Ollama](https://ollama.com/) is an easy way to run LLMs and embedding models locally while also taking advantage of any available hardware acceleration.

To get set up:

1. Install [ollama](https://ollama.com/)
2. Open a terminal and pull down the embedding model you would like to use via `ollama pull <model name>`. [Supported models](https://ollama.com/blog/embedding-models)
3. Bring down your celery and web containers via `docker compose down celery web`
4. Configure litellm to use ollama with the following settings in your `.env` file:

```
QDRANT_ENCODER=vector_search.encoders.litellm.LiteLLMEncoder
LITELLM_API_BASE=http://docker.for.mac.host.internal:11434
QDRANT_DENSE_MODEL=<ollama model name>
```

_Note_: `LITELLM_API_BASE=http://docker.for.mac.host.internal:11434` is Mac specific. If you are using another OS, you will need to figure out the address at which containers can reach your host machine.

Sample .env file configuration on Mac:

```
QDRANT_ENCODER=vector_search.encoders.litellm.LiteLLMEncoder
LITELLM_API_BASE=http://docker.for.mac.host.internal:11434
QDRANT_DENSE_MODEL=all-minilm
```
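
Optionally, to confirm ollama is serving the model before restarting the containers, you can hit its embeddings endpoint directly from the host (this assumes the default ollama port and the `all-minilm` model from the sample above):

```bash
curl http://localhost:11434/api/embeddings \
  -d '{"model": "all-minilm", "prompt": "hello world"}'
```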

5. Bring the web and celery containers back up via `docker compose up -d`
6. Run embeddings via:

```bash
python manage.py generate_embeddings --all --recreate-collections --skip-contentfiles
```
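
Once the command finishes, a quick way to sanity-check that vectors actually landed in Qdrant is to list the collections and their point counts. This is a sketch using `qdrant-client`; the URL and collection names depend on your settings:

```python
from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")  # assumes a local Qdrant

# Collection names are derived from QDRANT_BASE_COLLECTION_NAME.
for collection in client.get_collections().collections:
    info = client.get_collection(collection.name)
    print(collection.name, info.points_count)
```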
