
Commit

Small update library.md - add an encoder to the KB initialisation as an example (#302)

* Update library.md

* Update library.md
coreation authored Feb 29, 2024
1 parent 050869d commit fa82c84
Showing 1 changed file with 11 additions and 1 deletion.
12 changes: 11 additions & 1 deletion docs/library.md
@@ -95,7 +95,7 @@ You can create your own customized tokenizer by implementing a new class that de
from canopy.tokenizer import Tokenizer, BaseTokenizer

class CustomTokenizer(BaseTokenizer):
- # Implement BaseToknizer's abstract methods, like `def tokenize()` etc.
+ # Implement BaseTokenizer's abstract methods, like `def tokenize()` etc.
# ....

Tokenizer.initialize(tokenizer_class=CustomTokenizer)
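
For readers who want to see what "implementing the abstract methods" might look like, here is a minimal sketch. It deliberately does not import canopy: `BaseTokenizerStandIn` is a hypothetical stand-in class defined only for illustration, and apart from `tokenize()`, which the snippet above mentions, the method names (`detokenize`, `token_count`) are assumptions. Check the real `BaseTokenizer` for the exact set of abstract methods and their signatures.

```python
from abc import ABC, abstractmethod
from typing import List


class BaseTokenizerStandIn(ABC):
    """Hypothetical stand-in for canopy's BaseTokenizer (assumed interface)."""

    @abstractmethod
    def tokenize(self, text: str) -> List[str]:
        """Split text into tokens."""

    @abstractmethod
    def detokenize(self, tokens: List[str]) -> str:
        """Join tokens back into text (assumed method name)."""

    @abstractmethod
    def token_count(self, text: str) -> int:
        """Return the number of tokens in text (assumed method name)."""


class WhitespaceTokenizer(BaseTokenizerStandIn):
    """Toy tokenizer that treats every whitespace-separated word as a token."""

    def tokenize(self, text: str) -> List[str]:
        return text.split()

    def detokenize(self, tokens: List[str]) -> str:
        return " ".join(tokens)

    def token_count(self, text: str) -> int:
        return len(self.tokenize(text))


tokenizer = WhitespaceTokenizer()
tokens = tokenizer.tokenize("hello custom tokenizer")
count = tokenizer.token_count("hello custom tokenizer")
```

With canopy installed, the same class would presumably subclass the real `BaseTokenizer` instead of the stand-in and be registered through `Tokenizer.initialize(tokenizer_class=...)`, as in the snippet above.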
Expand Down Expand Up @@ -124,6 +124,16 @@ from canopy.knowledge_base import KnowledgeBase
kb = KnowledgeBase(index_name="my-index")
```

Alternatively, you can create a knowledge base with a specified record encoder:

```python
from canopy.knowledge_base.record_encoder import OpenAIRecordEncoder

encoder = OpenAIRecordEncoder(model_name="text-embedding-3-small")

kb = KnowledgeBase(index_name="my-index", record_encoder=encoder)
```

To create a new Pinecone index and connect it to the knowledge base, you can use the `create_canopy_index` method:

```python
