-
Notifications
You must be signed in to change notification settings - Fork 2
/
data.yaml
398 lines (307 loc) · 15.7 KB
/
data.yaml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
header: |
# sqlite-ecosystem
All the SQLite extensions, tools, and guides that I (Alex Garcia 👋🏼) have written and open sourced.
Feel free to start a discussion in this repo about meta-level SQLite extension and tooling talk! Though if you're having a specific issue with a particular extension or tool, then file an issue in that project's repository.
## Overview
These are all the loadable SQLite extensions I have built, along with which programming language they are written in and where they are distributed.
| Extension | Description | Language | Python | Node.js | Deno | Ruby | Datasette | sqlite-utils |
| ----------------------------------------- | -------------------------------------- | -------- | :--: | :--: | :---: | :--: | :-------: | :-----: |
| [`sqlite-http`](#sqlite-http) | Make HTTP requests | Go | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [`sqlite-html`](#sqlite-html) | parse HTML documents | Go | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [`sqlite-lines`](#sqlite-lines) | Read files/blob line-by-line | C | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [`sqlite-path`](#sqlite-path) | Parse and generate filepaths | C | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [`sqlite-url`](#sqlite-url) | Parse and generate URLs | C | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [`sqlite-xsv`](#sqlite-xsv) | Query CSVs | Rust | ✅ | ✅ | ✅ | ✅ | 🚧 | 🚧 |
| [`sqlite-regex`](#sqlite-regex) | Regular expression functions | Rust | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [`sqlite-ulid`](#sqlite-ulid) | Work with ULIDs | Rust | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [`sqlite-jsonschema`](#sqlite-jsonschema) | Validate JSON objects with JSON Schema | Rust | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [`sqlite-fastrand`](#sqlite-fastrand) | Generate fast numbers/blobs quickly | Rust | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [`sqlite-vss`](#sqlite-vss) | Vector search in SQLite | C++ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
extensions:
- repo: sqlite-http
lang: Go
platforms: [linux-x86_64, macos-x86_64, macos-aarch64, windows-x86_64]
short: Make HTTP requests
long: |
A SQLite extension for making HTTP requests purely in SQL.
- Create GET, POST, and other HTTP requests, like curl, wget, and fetch
- Download response bodies, header, status codes, timing info
- Set rate limits, timeouts
```sql
select http_get_body('https://text.npr.org/');
select
http_get_body('https://api.github.com/repos/sqlite/sqlite') ->> '$.description';
```
- repo: sqlite-html
lang: Go
platforms: [linux-x86_64, macos-x86_64, macos-aarch64, windows-x86_64]
short: parse HTML documents
long: |
A SQLite extension for querying, manipulating, and creating HTML elements.
- Extract HTML or text from HTML with CSS selectors, like `.querySelector()`, `.innerHTML`, and `.innerText`
- Generate a table of matching elements from a CSS selector, like `.querySelectorAll()`
- Safely create HTML elements in a query, like .createElement() and `.appendChild()`
```sql
select html_extract('<p> Anakin <b>Skywalker</b> </p>', 'b'); -- "<b>Skywalker</b>"
```
- repo: sqlite-lines
lang: C
platforms: [linux-x86_64, macos-x86_64, macos-aarch64]
short: Read files/blob line-by-line
long: |
A SQLite extension for efficiently reading large files or blobs line-by-line.
```sql
select line from lines_read('logs.txt');
```
- repo: sqlite-path
lang: C
platforms: [linux-x86_64, macos-x86_64, macos-aarch64, windows-x86_64]
short: Parse and generate filepaths
long: |
A SQLite extension for parsing, generating, and querying paths.
```sql
select path_dirname('foo/bar.txt'); -- 'foo/'
select path_basename('foo/bar.txt'); -- 'bar.txt'
select path_extension('foo/bar.txt'); -- '.txt'
```
- repo: sqlite-url
lang: C
platforms: [linux-x86_64, macos-x86_64]
short: Parse and generate URLs
long: |
A SQLite extension for parsing, generating, and querying URLs and query strings
- Extract specific parts of a URL, like schemes, hostnames, and paths
- Generate URLs with parts safely
- Iterate through all key and values in a query string
```sql
select url_valid('https://sqlite.org'); -- 1
select url_scheme('https://www.sqlite.org/vtab.html#usage'); -- 'https'
select url_host('https://www.sqlite.org/vtab.html#usage'); -- 'www.sqlite.org'
```
- repo: sqlite-xsv
lang: Rust
platforms: [linux-x86_64, macos-x86_64, macos-aarch64, windows-x86_64]
no_datasette: true
no_sqlite_utils: true
short: Query CSVs
long: |
A fast and performant SQLite extension for CSV files, written in Rust!
- Query CSVs, TSVs, and other-SVs as SQLite virtual tables
- The "reader" interface lets you query CSVs from other data sources, such as [`sqlite-http`](#sqlite-http)
- Builtin support for querying CSVs with gzip or zstd compression
```sql
create virtual table temp.students using csv(
filename="students.csv"
);
select * from temp.students;
create virtual table temp.students_gz using csv(
filename="students.csv.gz"
);
create virtual table temp.students_reader using csv_reader(
id integer,
name text,
age integer,
progess real
);
select * from temp.students_reader('./target/students.csv');
```
- repo: sqlite-regex
lang: Rust
platforms: [linux-x86_64, macos-x86_64, macos-aarch64, windows-x86_64]
short: Regular expression functions
long: |
A fast SQLite extension for regular expressions.
```sql
select regex_find(
'[0-9]{3}-[0-9]{3}-[0-9]{4}',
'phone: 111-222-3333'
);
select rowid, *
from regex_find_all(
'\b\w{13}\b',
'Retroactively relinquishing remunerations is reprehensible.'
);
/*
┌───────┬───────┬─────┬───────────────┐
│ rowid │ start │ end │ match │
├───────┼───────┼─────┼───────────────┤
│ 0 │ 0 │ 13 │ Retroactively │
│ 1 │ 14 │ 27 │ relinquishing │
│ 2 │ 28 │ 41 │ remunerations │
│ 3 │ 45 │ 58 │ reprehensible │
└───────┴───────┴─────┴───────────────┘
*/
```
- repo: sqlite-ulid
lang: Rust
platforms: [linux-x86_64, macos-x86_64, macos-aarch64, windows-x86_64]
short: Work with ULIDs
long: |
A SQLite extension for generating and working with [ULIDs](https://github.com/ulid/spec).
```sql
select ulid(); -- '01gqr4j69cc7w1xdbarkcbpq17'
select ulid_bytes(); -- X'0185310899dd7662b8f1e5adf9a5e7c0'
select ulid_with_prefix('invoice'); -- 'invoice_01gqr4jmhxhc92x1kqkpxb8j16'
select ulid_datetime('01gqr4j69cc7w1xdbarkcbpq17') -- '2023-01-26 22:53:20.556'
```
- repo: sqlite-jsonschema
lang: Rust
platforms: [linux-x86_64, macos-x86_64, macos-aarch64, windows-x86_64]
short: Validate JSON objects with JSON Schema
long: |
A SQLite extension for validating JSON objects with [JSON Schema](https://json-schema.org/).
```sql
select jsonschema_matches('{"maxLength": 5}', json_quote('alex')); -- 1
```
- repo: sqlite-fastrand
lang: Rust
platforms: [linux-x86_64, macos-x86_64, macos-aarch64, windows-x86_64]
short: Generate fast numbers/blobs quickly
long: |
A SQLite extension for quickly generating random numbers, booleans, characters, and blobs
```sql
select fastrand_int(); -- 556823563
select fastrand_alphabetic(); -- 's'
select fastrand_uppercase();-- 'M'
select fastrand_double(); -- 0.740834390248454
```
- repo: sqlite-vss
lang: C++
platforms: [linux-x86_64, macos-x86_64, macos-aarch64]
short: Vector search in SQLite
long: |
A SQLite extension for efficient vector search, based on Faiss!
```sql
create virtual table vss_articles using vss0(
headline_embedding(384),
description_embedding(384),
);
insert into vss_articles(rowid, headline_embedding)
select rowid, headline_embedding from articles;
select rowid, distance
from vss_articles
where vss_search(
headline_embedding,
(select headline_embedding from articles where rowid = 123)
)
limit 100;
```
- repo: sqlite-hello
lang: C
platforms: [linux-x86_64, macos-x86_64, macos-aarch64, windows-x86_64]
short: \"Hello world\" extension
long: |
The smallest "Hello world" SQLite extension possible.
```sql
sqlite> .load ./hello0
sqlite> select hello('Alex');
'Hello, Alex!'
```
footer: |
### Distribution
SQLite extensions have historically been hard to shared with other people. Installing and using them can be tricky and confusing, but these are some strategies I've used to make that easier!
### `pip` for Python Developers
Most of my SQLite extensions are additional placed in a small Python library and distributed on [PyPi](https://pypi.org/). Python developers can then install these extensions with a `pip install` command like so:
```bash
pip install sqlite-regex
```
The actual Python library is small and meant to be used with the builtin `sqlite3` Python module, like so:
```python
import sqlite3
import sqlite_regex
db = sqlite3.connect(':memory:')
db.enable_load_extension(True)
sqlite_regex.load(db)
db.execute("select regex_version(), '[abc]' regexp 'a';").fetchone()
# ('v0.1.0', 1)
```
See [_Making SQLite extensions pip install-able
_](https://observablehq.com/@asg017/making-sqlite-extensions-pip-install-able) (February 2023) for more details.
### `npm` for Node.js Developers
These extensions are also distributed on [`npm`](https://npmjs.com/) for use in Node.js. they can be install with a `npm install` command like so:
```bash
npm install sqlite-regex
```
And then used in Node.js programs with [`better-sqlite3`](https://github.com/WiseLibs/better-sqlite3) or [`node-sqlite3`](https://github.com/TryGhost/node-sqlite3) like so:
```js
import Database from "better-sqlite3";
import * as sqlite_regex from "sqlite-regex";
const db = new Database(":memory:");
db.loadExtension(sqlite_regex.getLoadablePath());
const version = db.prepare("select regex_version()").pluck().get();
console.log(version); // "v0.2.0"
```
See [_Making SQLite extensions npm install'able for Node.js, and on deno.land/x for Deno_](https://observablehq.com/@asg017/making-sqlite-extensions-npm-installable-and-deno) (March 2023) for more details.
### `deno.land/x` for Deno Developers
These extensions are also distributed on [`deno.land/x`](https://deno.land/x) for use in [Deno](https://deno.land/) programs. It's meant to work with the [`x/sqlite3`](https://deno.land/x/sqlite3) client like so:
```ts
import { Database } from "https://deno.land/x/[email protected]/mod.ts";
import * as sqlite_regex from "https://deno.land/x/[email protected]/mod.ts";
const db = new Database(":memory:");
db.enableLoadExtension = true;
sqlite_regex.load(db);
const [version] = db.prepare("select regex_version()").value<[string]>()!;
console.log(version);
```
See [the "Deno" section of _Making SQLite extensions npm install'able for Node.js, and on deno.land/x for Deno_](https://observablehq.com/@asg017/making-sqlite-extensions-npm-installable-and-deno#cell-13) (March 2023) for more details.
### `gem` for ruby Developers
These extensions are also distributed on [rubygems.org](https://rubygems.org/) for use in Ruby applications. It's meant to work with the [`sqlite3`](https://rubygems.org/gems/sqlite3) client like so:
```ruby
require 'sqlite3'
require 'sqlite_regex'
db = SQLite3::Database.new(':memory:')
db.enable_load_extension(true)
SqliteRegex.load(db)
db.enable_load_extension(false)
result = db.execute("select regex_version(), '[abc]' regexp 'a';")
puts result.first.first # "v0.2.3-alpha.7"
puts result.first.last # 1
```
See [_Making SQLite extension gem install'able for Ruby Developers_](https://observablehq.com/@asg017/making-sqlite-extension-gem-installable) (June 2023) for more details.
### As Datasette Plugins
Most of these SQLite extensions are also distributed as [Datasette Plugins](https://datasette.io/plugins). They are small wrappers around their corresponding [pip packages](#pip-for-python-developers).
They can be installed like so:
```bash
datasette install datasette-sqlite-regex
```
Now all future Datasette instances will include `sqlite-regex` functions. Here we test it using [`datasette --get`](https://docs.datasette.io/en/stable/cli-reference.html#datasette-get).
```
$ datasette --get '/_memory.csv?sql=select+regex_version()'
regex_version()
v0.1.0
```
### As sqlite-utils Plugins
Most of these SQLite extensions are also distributed as [sqlite-utils plugins](https://sqlite-utils.datasette.io/en/stable/plugins.html). They are small wrappers around their corresponding [pip packages](#pip-for-python-developers).
They can be installed like so:
```bash
sqlite-utils install sqlite-utils-sqlite-regex
```
Now when using the sqlite-utils CLI, the `sqlite-regex` functions will be available.
```
$ sqlite-utils memory 'select regex_version()'
[{"regex_version()": "v0.2.3"}]
```
## Roadmap
Various extensions and tools that I plan to build and open source in the future! Most of the development of these happen in private repos, to make my life easier.
If you're interested in any of these, just send me a message!
### Future SQLite Extensions
- `sqlite-img`: Query and manipulate images (cropping, thumbnails, rotation, etc.)
- `sqlite-xml`: Query XML documents with XPath strings
- `sqlite-assert`: Make assertions of your data at query-time
- `sqlite-parquet`: Query Parquet files from SQLite
- `sqlite-md`: Query Markdown documents as their AST
- `sqlite-geo`: A GIS extension, a slimmed-down alternative to spatialite
- `sqlite-python`: Write loadable SQLite extensions in pure Python
- `sqlite-qjs`: Create custom scalar, aggregate, and table functions in JavaScript with [QuickJS](https://bellard.org/quickjs/)
- `sqlite-duckdb`: Query and insert data in DuckDB tables, query CSVs/Parquet/JSON with DuckDB
- `sqlite-pg`: Query and insert data into Postgres tables
- `sqlite-notion`: Query and insert data into Notion databases
- `sqlite-google-sheets`: query and insert data into Google Sheet
### Future Extension Bindings
- Elixir
- Rust
- Go
### Future Tooling
- `sqlite-package-manager`: A command-line tool for manage SQLite extensions for your project, similar to npm/pip/brew
- `sqlite-docs`: Document SQLite tables and columns with in-line comments on `CREATE TABLE` statements, like JSDoc/rustdoc/docstrings
- `sqlitex`: A modern `sqlite3` alternative, a new CLI with Parquet/S3 support, a bigger stdlib, syntax highlighting, and more