Skip to content

Commit

Permalink
Merge branch 'json-embed-recursive'
Browse files Browse the repository at this point in the history
  • Loading branch information
Florents-Tselai committed Aug 15, 2024
2 parents 3b5017f + de30285 commit 89646dc
Showing 1 changed file with 41 additions and 3 deletions.
44 changes: 41 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,9 +10,6 @@
[![codecov](https://codecov.io/gh/Florents-Tselai/tsellm/branch/main/graph/badge.svg)](https://codecov.io/gh/Florents-Tselai/tsellm)
[![License](https://img.shields.io/badge/BSD%20license-blue.svg)](https://github.com/Florents-Tselai/tsellm/blob/main/LICENSE)




**tsellm** is the easiest way to access LLMs from SQLite or DuckDB.

```shell
Expand Down Expand Up @@ -44,12 +41,53 @@ so you can use any of its plugins:
```shell
llm install llm-sentence-transformers
llm sentence-transformers register all-MiniLM-L12-v2
llm install llm-embed-hazo # dummy embedding model for demonstration purposes
```

```sql
tsellm prompts.sqlite3 "select embed(p, 'sentence-transformers/all-MiniLM-L12-v2')"
```

### Embedding `JSON` Recursively

If you have `JSON` columns, you can embed these object recursively.
That is, an embedding vector of floats will replace each text occurrence in the object.

```bash
cat <<EOF | tee >(sqlite3 prompts.sqlite3) | duckdb prompts.duckdb
CREATE TABLE people(d JSON);
INSERT INTO people (d) VALUES
('{"name": "John Doe", "age": 30, "hobbies": ["reading", "biking"]}'),
('{"name": "Jane Smith", "age": 25, "hobbies": ["painting", "traveling"]}')
EOF
```

#### SQLite

```sql
tsellm prompts.sqlite3 "select json_embed(d, 'hazo') from people"
```

*Output*

```
('{"name": [4.0, 3.0,..., 0.0], "age": 30, "hobbies": [[7.0, 0.0,..., 0.0], [6.0, 0.0, ..., 0.0]]}',)
('{"name": [4.0, 5.0, ,..., 0.0], "age": 25, "hobbies": [[8.0, 0.0,..., 0.0], [9.0, 0.0,..., 0.0]]}',)
```

#### DuckDB

```sql
tsellm prompts.duckdb "select json_embed(d, 'hazo') from people"
```

*Output*

```
('{"name": [4.0, 3.0,..., 0.0], "age": 30, "hobbies": [[7.0, 0.0,..., 0.0], [6.0, 0.0, ..., 0.0]]}',)
('{"name": [4.0, 5.0, ,..., 0.0], "age": 25, "hobbies": [[8.0, 0.0,..., 0.0], [9.0, 0.0,..., 0.0]]}',)
```

### Embeddings for binary (`BLOB`) columns

```shell
Expand Down

0 comments on commit 89646dc

Please sign in to comment.