Table types #4311

Merged
merged 4 commits into from
Dec 8, 2024
3 changes: 2 additions & 1 deletion CONTRIBUTING.md
@@ -70,7 +70,8 @@ Some of this style guide is automated with GitHub Actions, but feel free to run
* Always format SQL code, variable names, function names, etc. as code. For example, when talking about the `CREATE TABLE` statement, the keywords should be formatted as code.
* When presenting SQL statements, do not include the DuckDB prompt (`D `).
* SQL statements should end with a semicolon (`;`) to allow readers to quickly paste them into a SQL console.
-* Narrow tables – that do not span horizontally across the entire page – should be prepended with an empty div that has the `narrow_table` class: `<div class="narrow_table"></div>`.
+* Tables with predominantly code output (e.g., the result of a `DESCRIBE` statement) should be prepended with an empty div that has the `monospace_table` class: `<div class="monospace_table"></div>`.
+* Tables where the headers should be center-aligned (opposed to the left-aligned default) should be prepended with an empty div that has the `center_aligned_header_table` class: `<div class="center_aligned_header_table"></div>`.
* Do not introduce hard line breaks if possible. Therefore, avoid using the `<br/>` HTML tag and avoid [double spaces at the end of a line in Markdown](https://spec.commonmark.org/0.28/#hard-line-breaks).
* Single and double quote characters (`'` and `"`) are not converted to smart quotation marks automatically. To insert these, use `“` `”` and `‘` `’`.
* When referencing other articles, put their titles in quotes, e.g., `see the [“Lightweight Compression in DuckDB” blog post]({% post_url 2022-10-28-lightweight-compression %})`.
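
As a small illustration of the table-class convention in the bullets above, the following Python sketch prepends the marker div to a Markdown table. The helper name `tag_table` and the `ALLOWED_CLASSES` set are assumptions made for this example; they are not part of the DuckDB website tooling.

```python
# Hypothetical helper, not part of the repository: prepend a table-class
# marker div to a Markdown table, as the style guide bullets describe.
ALLOWED_CLASSES = {"monospace_table", "center_aligned_header_table"}


def tag_table(markdown_table: str, css_class: str) -> str:
    """Return the table preceded by an empty div carrying `css_class`."""
    if css_class not in ALLOWED_CLASSES:
        raise ValueError(f"unknown table class: {css_class}")
    # An empty line separates the div from the table, as in the posts.
    return f'<div class="{css_class}"></div>\n\n{markdown_table}'


table = "| column_name | column_type |\n|:---|:---|\n| i | INTEGER |"
print(tag_table(table, "monospace_table"))
```

Rejecting unknown class names is a design choice of this sketch, so that a misspelled class fails loudly instead of silently producing an unstyled table.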
5 changes: 0 additions & 5 deletions _posts/2022-05-04-friendlier-sql.md
@@ -118,7 +118,6 @@ CREATE TABLE mandalorian AS SELECT 1 AS "THIS_IS_THE_WAY";
SELECT this_is_the_way FROM mandalorian;
```

<div class="narrow_table"></div>

| THIS_IS_THE_WAY |
|----------------:|
@@ -163,7 +162,6 @@ Even as SQL fans, we know that SQL can learn a thing or two from newer languages
SELECT 'I love you! I know'[:-3] AS nearly_soloed;
```

<div class="narrow_table"></div>

| nearly_soloed |
|:---|
@@ -189,7 +187,6 @@ SELECT
FROM (SELECT ['A-Wing', 'B-Wing', 'X-Wing', 'Y-Wing'] AS starfighter_list);
```

<div class="narrow_table"></div>

| dont_forget_the_b_wing |
|:---|
@@ -249,7 +246,6 @@ FROM (
) theyre_coming_in_too_fast;
```

<div class="narrow_table"></div>

| tie_fighter | tie_fighter:1 |
|:---|:---|
@@ -270,7 +266,6 @@ JOIN sith_count_varchar s_char
ON s_int.sith_count = s_char.sith_count;
```

<div class="narrow_table"></div>

| sith_count | sith_count |
|---:|---:|
2 changes: 0 additions & 2 deletions _posts/2023-07-07-python-udf.md
@@ -236,7 +236,6 @@ native_res = con.sql("SELECT sum(add_built_in_type(i)) FROM numbers").fetchall()
arrow_res = con.sql("SELECT sum(add_arrow_type(i)) FROM numbers").fetchall()
```

<div class="narrow_table"></div>

| Name | Time (s) |
|-------------|---------:|
@@ -295,7 +294,6 @@ con.sql("SELECT sum(strlen_arrow(i)) FROM strings tbl(i)").fetchall()
exec_external(con)
```

<div class="narrow_table"></div>

| Name | Time (s) | Peak memory consumption (MB) |
|-------------|---------:|-----------------------------:|
23 changes: 0 additions & 23 deletions _posts/2023-08-23-even-friendlier-sql.md
@@ -65,7 +65,6 @@ SELECT
substr(intro, starship_loc + len('starship') + 1) AS trimmed_intro;
```

<div class="narrow_table"></div>

| intro | starship_loc | trimmed_intro |
|:---|:---|:---|
@@ -90,7 +89,6 @@ CREATE TABLE trek_facts AS
DESCRIBE trek_facts;
```

<div class="narrow_table"></div>

| column_name | column_type | null | key | default | extra |
|:---|:---|:---|:---|:---|:---|
@@ -124,7 +122,6 @@ SELECT
FROM trek_facts;
```

<div class="narrow_table"></div>

| episode_num | cnt_warp_speed_orders | highest_warp_speed_issued |
|:---|:---|:---|
@@ -145,7 +142,6 @@ SELECT
FROM trek_facts;
```

<div class="narrow_table"></div>

| max(trek_facts.cnt_warp_speed_orders) | max(trek_facts.highest_warp_speed_issued) |
|:---|:---|
@@ -165,7 +161,6 @@ WHERE
-- highest_warp_speed_issued >= 2
```

<div class="narrow_table"></div>

| episode_num | cnt_warp_speed_orders | highest_warp_speed_issued |
|:---|:---|:---|
@@ -184,7 +179,6 @@ SELECT
FROM trek_facts;
```

<div class="narrow_table"></div>

| max(trek_facts.<br>episode_num) | max(trek_facts.<br>aired_date) | max(trek_facts.<br>cnt_kirk_hookups) | ... | max(trek_facts.<br>bool_enterprise_saved_the_day) |
|:---|:---|:---|:---|:---|
@@ -198,7 +192,6 @@ SELECT
FROM trek_facts;
```

<div class="narrow_table"></div>

| max(trek_facts.<br>season_num) | max(trek_facts.<br>episode_num) | max(aired_date := <br>CAST(aired_date AS TIMESTAMP)) | ... | max(trek_facts.<br>bool_enterprise_saved_the_day) |
|:---|:---|:---|:---|:---|
@@ -219,7 +212,6 @@ WHERE
COLUMNS(col -> col LIKE '%warp%') >= 2;
```

<div class="narrow_table"></div>

| episode_num | cnt_warp_speed_orders | highest_warp_speed_issued |
|:---|:---|:---|
@@ -245,7 +237,6 @@ SELECT
FROM 'https://raw.githubusercontent.com/vlad-saling/star-trek-ipsum/master/src/content/content.json';
```

<div class="narrow_table"></div>

| starship |
|:---|
@@ -293,7 +284,6 @@ SELECT
.concat('.') AS im_not_messing_around_number_one;
```

<div class="narrow_table"></div>

| im_not_messing_around_number_one |
|:---|
@@ -312,7 +302,6 @@ SELECT
'.') AS oof;
```

<div class="narrow_table"></div>

| oof |
|:---|
@@ -336,7 +325,6 @@ CREATE TABLE proverbs AS
FROM proverbs;
```

<div class="narrow_table"></div>

| klingon_proverb | borg_proverb |
|:---|:---|
@@ -358,7 +346,6 @@ INSERT INTO proverbs BY NAME
SELECT * FROM proverbs;
```

<div class="narrow_table"></div>

| klingon_proverb | borg_proverb |
|:---|:---|
@@ -386,7 +373,6 @@ INSERT INTO purchases
FROM purchases;
```

<div class="narrow_table"></div>

| item | year | count |
|:---|:---|:---|
@@ -409,7 +395,6 @@ CREATE TABLE pivoted_purchases AS
FROM pivoted_purchases;
```

<div class="narrow_table"></div>

| item | 2155 | 2156 | 2157 |
|:---|:---|:---|:---|
@@ -430,7 +415,6 @@ UNPIVOT pivoted_purchases
VALUE count;
```

<div class="narrow_table"></div>

| item | year | count |
|:---|:---|:---|
@@ -457,7 +441,6 @@ SELECT
.list_transform(x -> x.string_split(' ')[1]) AS short_name;
```

<div class="narrow_table"></div>

| ship_name |
|:---|
@@ -471,7 +454,6 @@ SELECT
.list_filter(x -> x.contains('1701')) AS the_original;
```

<div class="narrow_table"></div>

| the_original |
|:---|
@@ -490,7 +472,6 @@ SELECT
IF x.contains('1701')] AS ready_to_boldly_go;
```

<div class="narrow_table"></div>

| ready_to_boldly_go |
|:---|
@@ -509,7 +490,6 @@ SELECT
casualties.*;
```

<div class="narrow_table"></div>

| gold_casualties | blue_casualties | red_casualties |
|:---|:---|:---|
@@ -529,7 +509,6 @@ FROM officers
SELECT officers;
```

<div class="narrow_table"></div>

| officers |
|:---|
@@ -554,7 +533,6 @@ SELECT 6 UNION ALL
SELECT 'First Contact';
```

<div class="narrow_table"></div>

| movie |
|:---|
@@ -582,7 +560,6 @@ SELECT
movie.num;
```

<div class="narrow_table"></div>

| movie | type | name | num |
|:---|:---|:---|:---|
15 changes: 0 additions & 15 deletions _posts/2023-09-15-asof-joins-fuzzy-temporal-lookups.md
@@ -35,7 +35,6 @@ which can be cumbersome and slow to implement in standard SQL.
Let's start with a concrete example.
Suppose we have a table of stock [`prices`](/data/prices.csv) with timestamps:

<div class="narrow_table"></div>

| ticker | when | price |
| :----- | :--- | ----: |
@@ -51,7 +50,6 @@ Suppose we have a table of stock [`prices`](/data/prices.csv) with timestamps:

We have another table containing portfolio [`holdings`](/data/holdings.csv) at various points in time:

<div class="narrow_table"></div>

| ticker | when | shares |
| :----- | :--- | -----: |
@@ -77,7 +75,6 @@ FROM holdings h ASOF JOIN prices p

This attaches the value of the holding at that time to each row:

<div class="narrow_table"></div>

| ticker | when | value |
| :----- | :--- | ----: |
@@ -107,7 +104,6 @@ ORDER BY ALL;
As you might expect, this will produce `NULL` prices and values instead of dropping left side rows
when there is no ticker or the time is before the prices begin.
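
To make the left-join behaviour concrete, here is a minimal Python sketch of the same lookup. It assumes the join condition matches each holding to the most recent price at or before its timestamp for the same ticker. The function name `left_asof_join`, the tickers, and the sample rows are invented for illustration and are not the data from the post.

```python
from bisect import bisect_right


def left_asof_join(holdings, prices):
    """For each holding, find the latest price at or before its time.

    Returns (ticker, when, value) rows; value is None when no price matches,
    mirroring the NULLs a left AsOf join would produce.
    """
    # Group price points by ticker, sorted by time, so we can binary-search.
    by_ticker = {}
    for ticker, when, price in sorted(prices):
        by_ticker.setdefault(ticker, []).append((when, price))

    result = []
    for ticker, when, shares in holdings:
        series = by_ticker.get(ticker, [])  # unknown ticker -> empty series
        times = [t for t, _ in series]
        i = bisect_right(times, when) - 1   # last price time <= holding time
        value = series[i][1] * shares if i >= 0 else None
        result.append((ticker, when, value))
    return result


# Hypothetical sample rows (ISO timestamps compare correctly as strings).
prices = [
    ("AAA", "2001-01-01 00:01:00", 2.5),
    ("AAA", "2001-01-01 00:02:00", 3.0),
]
holdings = [
    ("AAA", "2001-01-01 00:00:30", 4),  # before the first price
    ("AAA", "2001-01-01 00:01:30", 4),  # matches the 00:01:00 price
    ("ZZZ", "2001-01-01 00:01:30", 1),  # ticker with no prices
]
for row in left_asof_join(holdings, prices):
    print(row)
```

An inner AsOf join would correspond to dropping the rows whose value is `None` instead of keeping them.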

<div class="narrow_table"></div>

| ticker | when | value |
| :----- | :--- | ----: |
@@ -148,7 +144,6 @@ INNER JOIN state s
The default value of `infinity` is used to make sure there is an end value for the last row that can be compared.
Here is what the `state` CTE looks like for our example:

<div class="narrow_table"></div>

| ticker | price | when | end |
|:-------|------:|:--------------------|:--------------------|
@@ -227,7 +222,6 @@ But AsOf can now use any inequality, which allows it to handle other types of ev
To explore this, let's use two very simple tables with no equality conditions.
The build side will just have four integer "timestamps" with alphabetic values:

<div class="narrow_table"></div>

| Time | Value |
| ---: | ----: |
@@ -240,7 +234,6 @@ The probe table will just be the time values plus the midpoints,
and we can make a table showing what value each probe time matches
for greater than or equal to:

<div class="narrow_table"></div>

| Probe | >= |
| ----: | --- |
@@ -258,7 +251,6 @@ This shows us that the interval a probe value matches is in the half-open interv

Now let's see what happens if we use strictly greater than as the inequality:

<div class="narrow_table"></div>

| Probe | > |
| ----: | --- |
@@ -278,7 +270,6 @@ This means that for this inequality type, the time is not part of the interval.

What if the inequality goes in the other direction, say less than or equal to?

<div class="narrow_table"></div>

| Probe | <= |
| ----: | --- |
@@ -303,7 +294,6 @@ when non-strict inequalities are used.

We can check this by looking at the last inequality: strictly less than:

<div class="narrow_table"></div>

| Probe | < |
| ----: | --- |
@@ -323,7 +313,6 @@ and it is a less than, so the time is the end of the interval.
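
The matching rule for each inequality can be sketched in a few lines of Python. This is a sketch under stated assumptions: an AsOf match is taken to be the closest build row satisfying `probe <op> build time`, the build table is assumed to hold times 1 to 4 with values `a` to `d`, and the function name `asof_match` is hypothetical.

```python
from bisect import bisect_left, bisect_right


def asof_match(build, probe, op):
    """Return the value of the closest build row with `probe <op> time`.

    `build` is a list of (time, value) pairs sorted by time.
    Returns None when no build row satisfies the inequality.
    """
    times = [t for t, _ in build]
    if op == ">=":    # latest build time <= probe
        i = bisect_right(times, probe) - 1
    elif op == ">":   # latest build time < probe
        i = bisect_left(times, probe) - 1
    elif op == "<=":  # earliest build time >= probe
        i = bisect_left(times, probe)
    elif op == "<":   # earliest build time > probe
        i = bisect_right(times, probe)
    else:
        raise ValueError(f"unsupported inequality: {op}")
    return build[i][1] if 0 <= i < len(build) else None


# Assumed build table: four integer "timestamps" with alphabetic values.
build = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]

# Probe with the time values plus the midpoints, once per inequality.
for op in (">=", ">", "<=", "<"):
    probes = [0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5]
    print(op, [asof_match(build, p, op) for p in probes])
```

The only difference between the strict and non-strict variants is whether `bisect_left` or `bisect_right` is used, which is exactly what decides whether a probe equal to a build time belongs to that row's interval.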

To sum up, here is the full list:

<div class="narrow_table"></div>

| Inequality | Interval |
| -- | ---------- |
@@ -445,7 +434,6 @@ CREATE OR REPLACE TABLE probe AS (

The `build` table looks like this:

<div class="narrow_table"></div>

| k | t | v |
|---|---------------------|---|
@@ -457,7 +445,6 @@ The `build` table looks like this:

and the probe table looks like this (with only even values for k):

<div class="narrow_table"></div>

| k | t |
|---|---------------------|
@@ -501,7 +488,6 @@ than inequalities and generates a hash join with a filter.

Running the benchmark, we get results like this:

<div class="narrow_table"></div>

| Algorithm | Median of 5 |
| :--------- | ----------: |
@@ -551,7 +537,6 @@ INNER JOIN state s
AND p.k = s.k;
```

<div class="narrow_table"></div>

| Algorithm | Median of 5 runs |
| :--------- | ---------------: |
3 changes: 0 additions & 3 deletions _posts/2023-09-26-announcing-duckdb-090.md
@@ -66,7 +66,6 @@ SELECT count(*) FROM (SELECT DISTINCT * FROM tbl);

If we keep all the data in memory, the query should use around 6GB. However, we can still complete the query if less memory is available. In the table below, we can see how the runtime is affected by lowering the memory limit:

<div class="narrow_table"></div>

| memory limit | v0.8.1 | v0.9.0 |
|--------------:|---------:|---------:|
@@ -135,7 +134,6 @@ SELECT
FROM tripdata;
```

<div class="narrow_table"></div>

| Version | Run time |
|--------:|---------:|
@@ -153,7 +151,6 @@ CREATE TABLE integers(i INTEGER PRIMARY KEY);
INSERT INTO integers FROM range(10000000);
```

<div class="narrow_table"></div>

| Version | Size |
| -- | --: |