diff --git a/_data/menu_docs_dev.json b/_data/menu_docs_dev.json index 54be41bb23..ad54268c39 100644 --- a/_data/menu_docs_dev.json +++ b/_data/menu_docs_dev.json @@ -939,6 +939,10 @@ { "page": "PostgreSQL Compatibility", "url": "postgresql_compatibility" + }, + { + "page": "SQL Quirks", + "url": "sql_quirks" } ] }, diff --git a/docs/sql/dialect/sql_quirks.md b/docs/sql/dialect/sql_quirks.md new file mode 100644 index 0000000000..af2116d8ac --- /dev/null +++ b/docs/sql/dialect/sql_quirks.md @@ -0,0 +1,75 @@ +--- +layout: docu +title: SQL Quirks +--- + +Like all programming languages and libraries, DuckDB has its share of idiosyncrasies and inconsistencies. +Some are vestiges of our feathered friend's evolution; others are inevitable because we strive to adhere to the [SQL Standard](https://blog.ansi.org/sql-standard-iso-iec-9075-2023-ansi-x3-135/) and specifically to PostgreSQL's dialect (see the [“PostgreSQL Compatibility”]({% link docs/sql/dialect/postgresql_compatibility.md %}) page for exceptions). +The rest may simply come down to different preferences, or we may even agree on what _should_ be done but just haven’t gotten around to it yet. + +Acknowledging these quirks is the best we can do, which is why we have compiled below a list of examples. + +## Aggregating Empty Groups + +On empty groups, the aggregate functions `sum`, `list`, and `string_agg` all return `NULL` instead of `0`, `[]` and `''`, respectively. This is dictated by the SQL Standard and obeyed by all SQL implementations we know. This behavior is inherited by the list aggregate [`list_sum`]({% link docs/sql/functions/list.md %}#list_-rewrite-functions), but not by the DuckDB original [`list_dot_product`]({% link docs/sql/functions/list.md %}#list_dot_productlist1-list2) which returns `0` on empty lists. + +## Indexing + +To comply with standard SQL, one-based indexing is used almost everywhere, e.g., array and string indexing and slicing, and window functions (`row_number`, `rank`, `dense_rank`). However, similarly to PostgreSQL, [JSON features use a zero-based indexing]({% link docs/data/json/overview.md %}#indexing). + +> While list functions use a 1-based indexing, `list_reduce` uses a 0-based indexing. This is a [known issue](https://github.com/duckdb/duckdb/issues/14619). + +## Expressions + +### Results That May Surprise You + + + +| Expression | Result | Note | +|----------------------------|---------|-------------------------------------------------------------------------------| +| `-1^2` | `1` | PostgreSQL compatibility means the unary minus has higher precedence than the exponentiation operator. Use additional parentheses, e.g., `-(1^2)` or the [`pow` function]({% link docs/sql/functions/numeric.md %}#powx-y) to avoid mistakes. | +| `'t' = true` | `true` | Compatible with PostgreSQL. | +| `1 = true` | `true` | Not compatible with PostgreSQL. | +| `1 = '1.1'` | `true` | Not compatible with PostgreSQL. | +| `1 IN (0, NULL)` | `NULL` | Makes sense if you think of the `NULL`s in the input and output as `UNKNOWN`. | +| `1 in [0, NULL]` | `false` | | +| `if(NULL > 1, 2, 3)` | `3` | Even though `NULL > 1` is `NULL`. | +| `concat('abc', NULL)` | `abc` | `list_concat` behaves similarly. | +| `'abc' || NULL` | `NULL` | | + + + +### `NaN` Values + +`'NaN'::FLOAT = 'NaN'::FLOAT` and `'NaN'::FLOAT > 3` violate IEEE-754 but mean floating point data types have a total order, like all other datatypes (beware the consequences for `greatest`/`least`) + +### `age` Function + +`age(x)` is `current_date - x` instead of `current_timestamp - x`. Another quirk inherited from PostgreSQL. + +### Extract Functions + +`list_extract` / `map_extract` return `NULL` on non-existing keys. `struct_extract` throws an error because keys of structs are like columns. + +## Clauses + +### Automatic Column Deduplication in `SELECT` + +Column names are deduplicated with the first occurrence shadowing the others: + +```sql +CREATE TABLE tbl AS SELECT 1 AS a; +SELECT a FROM (SELECT *, 2 AS a FROM tbl); +``` + +| a | +|--:| +| 1 | + +### Case Insensitivity for `SELECT`ing Columns + +Due to case-insensitivity, it's not possible to use `SELECT a FROM 'file.parquet'` when a column called `A` appears before the desired column `a` in `file.parquet`. + +### `USING SAMPLE` + +The `USING SAMPLE` clause is syntactically placed after the `WHERE` and `GROUP BY` clauses (same as the `LIMIT` clause) but is semantically applied before both (unlike the `LIMIT` clause). diff --git a/docs/sql/functions/list.md b/docs/sql/functions/list.md index b202c39d99..e5cbe897be 100644 --- a/docs/sql/functions/list.md +++ b/docs/sql/functions/list.md @@ -33,7 +33,7 @@ title: List Functions | [`list_intersect(list1, list2)`](#list_intersectlist1-list2) | Returns a list of all the elements that exist in both `l1` and `l2`, without duplicates. | | [`list_position(list, element)`](#list_positionlist-element) | Returns the index of the element if the list contains the element. If the element is not found, it returns `NULL`. | | [`list_prepend(element, list)`](#list_prependelement-list) | Prepends `element` to `list`. | -| [`list_reduce(list, lambda)`](#list_reducelist-lambda) | Returns a single value that is the result of applying the lambda function to each element of the input list. See the [Lambda Functions]({% link docs/sql/functions/lambda.md %}#reduce) page for more details. | +| [`list_reduce(list, lambda)`](#list_reducelist-lambda) | Returns a single value that is the result of applying the lambda function to each element of the input list. See the [Lambda Functions]({% link docs/sql/functions/lambda.md %}#reduce) page for more details. While list functions use a 1-based indexing, `list_reduce` uses a 0-based indexing. This is a [known issue](https://github.com/duckdb/duckdb/issues/14619). | | [`list_resize(list, size[, value])`](#list_resizelist-size-value) | Resizes the list to contain `size` elements. Initializes new elements with `value` or `NULL` if `value` is not set. | | [`list_reverse_sort(list)`](#list_reverse_sortlist) | Sorts the elements of the list in reverse order. See the [Sorting Lists]({% link docs/sql/functions/list.md %}#sorting-lists) section for more details about the `NULL` sorting order. | | [`list_reverse(list)`](#list_reverselist) | Reverses the list. | @@ -279,7 +279,7 @@ title: List Functions
-| **Description** | Returns a single value that is the result of applying the lambda function to each element of the input list. See the [Lambda Functions]({% link docs/sql/functions/lambda.md %}#reduce) page for more details. | +| **Description** | Returns a single value that is the result of applying the lambda function to each element of the input list. See the [Lambda Functions]({% link docs/sql/functions/lambda.md %}#reduce) page for more details. While list functions use a 1-based indexing, `list_reduce` uses a 0-based indexing. This is a [known issue](https://github.com/duckdb/duckdb/issues/14619). | | **Example** | `list_reduce([4, 5, 6], (x, y) -> x + y)` | | **Result** | `15` | | **Aliases** | `array_reduce`, `reduce` |