docs: caching in ragas (#1779)
jjmachan authored Dec 24, 2024
1 parent e8f9232 commit 9403320
Showing 7 changed files with 392 additions and 5 deletions.
9 changes: 5 additions & 4 deletions Makefile
@@ -36,10 +36,11 @@ test-e2e: ## Run end2end tests
run-ci: format lint type test ## Running all CI checks

# Docs
rewrite-docs: ## Use GPT4 to rewrite the documentation
@echo "Rewriting the documentation in directory $(DIR)..."
$(Q)python $(GIT_ROOT)/docs/alphred.py --directory $(DIR)
docsite: ## Build and serve documentation
build-docsite: ## Convert notebooks to markdown and build the documentation site
@echo "convert ipynb notebooks to md files"
$(Q)python $(GIT_ROOT)/docs/ipynb_to_md.py
$(Q)mkdocs build
serve-docsite: ## Build and serve documentation
$(Q)mkdocs serve --dirtyreload

# Benchmarks
100 changes: 100 additions & 0 deletions docs/howtos/customizations/_caching.md
@@ -0,0 +1,100 @@
# Caching in Ragas

You can use caching to speed up your evaluations and testset generation by avoiding redundant computations. Ragas uses exact-match caching: responses from the LLM and Embedding models are cached and reused whenever an identical call is made again.
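
"Exact match" means the cache key is derived from the complete input to the model, so only identical calls are served from the cache. Conceptually, the keying works like the illustrative sketch below (this is not Ragas' actual key derivation, just the idea):

```python
import hashlib
import json


def exact_match_key(prompt: str, model: str, temperature: float) -> str:
    # Hash the full call signature; any change to the input yields a different key,
    # so only byte-for-byte identical calls hit the cache.
    payload = json.dumps(
        {"prompt": prompt, "model": model, "temperature": temperature},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()
```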

You can use the [DiskCacheBackend][ragas.cache.DiskCacheBackend], which uses a local disk cache to store the cached responses. You can also implement your own custom cacher by implementing the [CacheInterface][ragas.cache.CacheInterface].
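
For illustration, here is a minimal in-memory cacher. It's a sketch that assumes [CacheInterface][ragas.cache.CacheInterface] requires `get`, `set`, and `has_key` methods; check the [CacheInterface][ragas.cache.CacheInterface] reference for the exact signatures.

```python
from typing import Any

from ragas.cache import CacheInterface


class InMemoryCacheBackend(CacheInterface):
    """A minimal cacher that keeps responses in a plain dict (lost on process exit)."""

    def __init__(self):
        self._store: dict[str, Any] = {}

    def get(self, key: str) -> Any:
        return self._store.get(key)

    def set(self, key: str, value: Any) -> None:
        self._store[key] = value

    def has_key(self, key: str) -> bool:
        return key in self._store
```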


## Using DiskCacheBackend

Let's see how you can use [DiskCacheBackend][ragas.cache.DiskCacheBackend] with LLM and Embedding models.



```python
from ragas.cache import DiskCacheBackend

cacher = DiskCacheBackend()

# check if the cache is empty and clear it
print(len(cacher.cache))
cacher.cache.clear()
print(len(cacher.cache))
```




    DiskCacheBackend(cache_dir=.cache)
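
By default the cache is stored in a local `.cache` directory, as the repr above shows. To keep it somewhere else, pass a different `cache_dir` when constructing the backend:

```python
# store the cache in a custom directory instead of the default .cache
cacher = DiskCacheBackend(cache_dir=".my_ragas_cache")
```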



Create an LLM and Embedding model with the cacher. Here, I'm using `ChatOpenAI` from [langchain-openai](https://github.com/langchain-ai/langchain-openai) as an example.



```python
from langchain_openai import ChatOpenAI
from ragas.llms import LangchainLLMWrapper

cached_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o"), cache=cacher)
```
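
Embeddings can be cached the same way. As a sketch, assuming the embeddings wrapper accepts the same `cache` argument as the LLM wrapper:

```python
from langchain_openai import OpenAIEmbeddings
from ragas.embeddings import LangchainEmbeddingsWrapper

# reuse the same cacher so LLM and embedding responses share one cache directory
cached_embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings(), cache=cacher)
```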


```python
# if you want to see the cache in action, set the logging level to debug
import logging
from ragas.utils import set_logging_level

set_logging_level("ragas.cache", logging.DEBUG)
```

Now let's run a simple evaluation.


```python
from ragas import evaluate
from ragas import EvaluationDataset

from ragas.metrics import FactualCorrectness, AspectCritic
from datasets import load_dataset

# Define Answer Correctness with AspectCritic
answer_correctness = AspectCritic(
name="answer_correctness",
definition="Is the answer correct? Does it match the reference answer?",
llm=cached_llm,
)

metrics = [answer_correctness, FactualCorrectness(llm=cached_llm)]

# load the dataset
dataset = load_dataset(
"explodinggradients/amnesty_qa", "english_v3", trust_remote_code=True
)
eval_dataset = EvaluationDataset.from_hf_dataset(dataset["eval"])

# evaluate the dataset
results = evaluate(
dataset=eval_dataset,
metrics=metrics,
)

results
```

This took almost 2 minutes to run on our local machine. Now let's run it again to see the cache in action.


```python
results = evaluate(
dataset=eval_dataset,
metrics=metrics,
)

results
```

This time it runs almost instantaneously.

You can also use this with testset generation by replacing the `generator_llm` with a cached version of it, as sketched below. Refer to the [testset generation](../../getstarted/rag_testset_generation.md) section for more details.
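
As a rough sketch, assuming the `TestsetGenerator` API from the getting-started guide (`docs` here is a stand-in for your own loaded documents):

```python
from ragas.testset import TestsetGenerator

# pass the cached wrappers so repeated generation runs reuse prior LLM/embedding calls
generator = TestsetGenerator(llm=cached_llm, embedding_model=cached_embeddings)
testset = generator.generate_with_langchain_docs(docs, testset_size=10)
```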
173 changes: 173 additions & 0 deletions docs/howtos/customizations/caching.ipynb
@@ -0,0 +1,173 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Caching in Ragas\n",
"\n",
"You can use caching to speed up your evaluations and testset generation by avoiding redundant computations. We use Exact Match Caching to cache the responses from the LLM and Embedding models.\n",
"\n",
"You can use the [DiskCacheBackend][ragas.cache.DiskCacheBackend] which uses a local disk cache to store the cached responses. You can also implement your own custom cacher by implementing the [CacheInterface][ragas.cache.CacheInterface].\n",
"\n",
"\n",
"## Using DefaultCacher\n",
"\n",
"Let's see how you can use the [DiskCacheBackend][ragas.cache.DiskCacheBackend] LLM and Embedding models.\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"DiskCacheBackend(cache_dir=.cache)"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from ragas.cache import DiskCacheBackend\n",
"\n",
"cacher = DiskCacheBackend()\n",
"\n",
"# check if the cache is empty and clear it\n",
"print(len(cacher.cache))\n",
"cacher.cache.clear()\n",
"print(len(cacher.cache))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create an LLM and Embedding model with the cacher, here I'm using the `ChatOpenAI` from [langchain-openai](https://github.com/langchain-ai/langchain-openai) as an example.\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from langchain_openai import ChatOpenAI\n",
"from ragas.llms import LangchainLLMWrapper\n",
"\n",
"cached_llm = LangchainLLMWrapper(ChatOpenAI(model=\"gpt-4o\"), cache=cacher)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# if you want to see the cache in action, set the logging level to debug\n",
"import logging\n",
"from ragas.utils import set_logging_level\n",
"\n",
"set_logging_level(\"ragas.cache\", logging.DEBUG)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's run a simple evaluation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from ragas import evaluate\n",
"from ragas import EvaluationDataset\n",
"\n",
"from ragas.metrics import FactualCorrectness, AspectCritic\n",
"from datasets import load_dataset\n",
"\n",
"# Define Answer Correctness with AspectCritic\n",
"answer_correctness = AspectCritic(\n",
" name=\"answer_correctness\",\n",
" definition=\"Is the answer correct? Does it match the reference answer?\",\n",
" llm=cached_llm,\n",
")\n",
"\n",
"metrics = [answer_correctness, FactualCorrectness(llm=cached_llm)]\n",
"\n",
"# load the dataset\n",
"dataset = load_dataset(\n",
" \"explodinggradients/amnesty_qa\", \"english_v3\", trust_remote_code=True\n",
")\n",
"eval_dataset = EvaluationDataset.from_hf_dataset(dataset[\"eval\"])\n",
"\n",
"# evaluate the dataset\n",
"results = evaluate(\n",
" dataset=eval_dataset,\n",
" metrics=metrics,\n",
")\n",
"\n",
"results"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This took almost 2mins to run in our local machine. Now let's run it again to see the cache in action."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"results = evaluate(\n",
" dataset=eval_dataset,\n",
" metrics=metrics,\n",
")\n",
"\n",
"results"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Runs almost instantaneously.\n",
"\n",
"You can also use this with testset generation also by replacing the `generator_llm` with a cached version of it. Refer to the [testset generation](../../getstarted/rag_testset_generation.md) section for more details."
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.15"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
3 changes: 3 additions & 0 deletions docs/references/cache.md
@@ -0,0 +1,3 @@
::: ragas.cache
options:
members_order: "source"
4 changes: 4 additions & 0 deletions mkdocs.yml
@@ -77,6 +77,7 @@ nav:
- General:
- Customise models: howtos/customizations/customize_models.md
- Run Config: howtos/customizations/_run_config.md
- Caching: howtos/customizations/_caching.md
- Metrics:
- Modify Prompts: howtos/customizations/metrics/_modifying-prompts-metrics.md
- Adapt Metrics to Languages: howtos/customizations/metrics/_metrics_language_adaptation.md
@@ -88,6 +89,7 @@
- Persona Generation: howtos/customizations/testgenerator/_persona_generator.md
- Custom Single-hop Query: howtos/customizations/testgenerator/_testgen-custom-single-hop.md
- Custom Multi-hop Query: howtos/customizations/testgenerator/_testgen-customisation.md

- Applications:
- howtos/applications/index.md
- Metrics:
@@ -107,6 +109,7 @@
- Embeddings: references/embeddings.md
- RunConfig: references/run_config.md
- Executor: references/executor.md
- Cache: references/cache.md
- Evaluation:
- Schemas: references/evaluation_schema.md
- Metrics: references/metrics.md
@@ -237,3 +240,4 @@ extra_javascript:
- _static/js/header_border.js
- https://unpkg.com/mathjax@3/es5/tex-mml-chtml.js
- _static/js/toggle.js
- https://cdn.octolane.com/tag.js?pk=c7c9b2b863bf7eaf4e2a # octolane for analytics