This is the periodic rollup release of features, bug fixes, and perf improvements for Q4 2024.
Interfaces are not guaranteed stable until 1.0.
Merged PRs
go-mysql-server
-
2798: cache session charset
perf: dolthub/dolt#8691 -
2796: apply table projections through
Distinct
nodes
We weren't pruning table columns when there was a distinct clause over the projections, this resulted the deserialization of every column, even if they weren't going to make it to the result. This is bad for performance, especially if the unread columns are ofTEXT
,LONGTEXT
, 'BLOB,
LONGBLOB` type as those are stored out of band, and take longer to deserialize.
fixes: dolthub/dolt#8689 -
2795: allow using function as table function
-
2794: Bump go-icu-regex
Incorporates the fix from here: -
2793: Revert byte copying optimization
-
2792: Fix the warning checks in enginetest to be more strict
Previously if an error was expected but none was produced the test would pass. -
2791: Properly cast
ENUM
s toTEXT
forCASE
andCONVERT
statements
fixes: dolthub/dolt#8598 -
2788: Fewer wire allocs in SQL() implementations
Optimizes SQL() implementations that convert interface values into type-specific byte arrays. Combination of skippingruntime.convT
checks, that unnecessarily allocated variables to the heap, redundant byte array copying, and other conversion inefficiencies.
dolt perf here: dolthub/dolt#8651goos: darwin goarch: arm64 pkg: github.com/dolthub/go-mysql-server/sql/types cpu: Apple M3 Pro │ before.txt │ after.txt │ │ sec/op │ sec/op vs base │ NumI64SQL-12 62.04n ± 2% 51.13n ± 1% -17.59% (p=0.002 n=6) Varchar10SQL-12 66.85n ± 1% 31.38n ± 2% -53.06% (p=0.002 n=6) TimespanSQL-12 62.36n ± 0% 40.31n ± 1% -35.35% (p=0.002 n=6) TimestampSQL-12 1960.0n ± 1% 255.0n ± 2% -86.99% (p=0.002 n=6) DatetimeSQL-12 1968.5n ± 0% 269.5n ± 3% -86.31% (p=0.002 n=6) EnumSQL-12 111.85n ± 1% 37.49n ± 1% -66.48% (p=0.002 n=6) SetSQL-12 175.15n ± 0% 63.55n ± 1% -63.72% (p=0.002 n=6) BitSQL-12 41.84n ± 1% 41.74n ± 1% ~ (p=0.589 n=6) DecimalSQL-12 683.8n ± 3% 281.9n ± 1% -58.77% (p=0.002 n=6) NumF64SQL-12 105.15n ± 1% 91.72n ± 0% -12.77% (p=0.002 n=6) geomean 189.2n 80.50n -57.45% │ before.txt │ after.txt │ │ B/op │ B/op vs base │ NumI64SQL-12 24.00 ± 0% 16.00 ± 0% -33.33% (p=0.002 n=6) Varchar10SQL-12 40.000 ± 0% 8.000 ± 0% -80.00% (p=0.002 n=6) TimespanSQL-12 24.00 ± 0% 16.00 ± 0% -33.33% (p=0.002 n=6) TimestampSQL-12 1520.00 ± 0% 56.00 ± 0% -96.32% (p=0.002 n=6) DatetimeSQL-12 1520.00 ± 0% 56.00 ± 0% -96.32% (p=0.002 n=6) EnumSQL-12 112.000 ± 0% 8.000 ± 0% -92.86% (p=0.002 n=6) SetSQL-12 136.00 ± 0% 16.00 ± 0% -88.24% (p=0.002 n=6) BitSQL-12 16.00 ± 0% 16.00 ± 0% ~ (p=1.000 n=6) ¹ DecimalSQL-12 439.0 ± 0% 228.0 ± 0% -48.06% (p=0.002 n=6) NumF64SQL-12 38.00 ± 0% 31.00 ± 0% -18.42% (p=0.002 n=6) geomean 108.0 24.94 -76.92% ¹ all samples are equal │ before.txt │ after.txt │ │ allocs/op │ allocs/op vs base │ NumI64SQL-12 2.000 ± 0% 1.000 ± 0% -50.00% (p=0.002 n=6) Varchar10SQL-12 2.000 ± 0% 1.000 ± 0% -50.00% (p=0.002 n=6) TimespanSQL-12 2.000 ± 0% 1.000 ± 0% -50.00% (p=0.002 n=6) TimestampSQL-12 42.000 ± 0% 3.000 ± 0% -92.86% (p=0.002 n=6) DatetimeSQL-12 42.000 ± 0% 3.000 ± 0% -92.86% (p=0.002 n=6) EnumSQL-12 3.000 ± 0% 1.000 ± 0% -66.67% (p=0.002 n=6) SetSQL-12 5.000 ± 0% 2.000 ± 0% -60.00% (p=0.002 n=6) BitSQL-12 2.000 ± 0% 2.000 ± 0% ~ (p=1.000 n=6) ¹ DecimalSQL-12 22.00 ± 0% 10.00 ± 0% -54.55% (p=0.002 n=6) NumF64SQL-12 3.000 ± 0% 2.000 ± 0% -33.33% (p=0.002 n=6) geomean 5.554 1.931 -65.24% ¹ all samples are equal
-
2787: Enable support for
caching_sha2_password
auth
This change enables users configured with thecaching_sha2_password
auth plugin to authenticate to a running SQL server. The default authentication plugin is stillmysql_native_password
, but users can opt-in tocaching_sha2_password
by creating a user and explicitly specifying the auth plugin.
Note that thecaching_sha2_password
auth plugin requires running the SQL server with a certificate so that TLS connections can be established.
Depends on: dolthub/vitess#390
Related to: dolthub/dolt#8496 -
2784: implement
EXPLAIN
andEXPLAIN PLAN
Moving our current implementation ofEXPLAIN
toEXPLAIN PLAN
, and replaceEXPLAIN
with a dummy implementation of MySQL'sEXPLAIN
Looks like this now:tmp2/main> explain select * from t; +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+ | 1 | SELECT | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+ 1 row in set (0.00 sec) tmp2/main> explain plan select * from t; +------------------+ | plan | +------------------+ | Table | | ├─ name: t | | └─ columns: [i] | +------------------+ 3 rows in set (0.00 sec)
related: dolthub/dolt#8592
-
2782: avoid fmt.Sprintf and string alloc for time.Sql
perf here dolthub/dolt#8640 -
2781: return ok result for
select into
statements
OurSELECT ... INTO ...
statements return empty result set, which produces strange output in thedolt sql
shell.
MySQL just returns ok results, so we should too.
discovered in: #2779 -
2780: Add support for creating users with the
caching_sha2_password
auth plugin
This change enables customers to create users configured to authenticate with thecaching_sha2_password
auth plugin. The generated authentication string uses the same logic as MySQL'scaching_sha2_password
auth plugin. Users created withcaching_sha2_password
can not yet authenticate with a GMS server – the next change in this series will enable that.
Example usage:CREATE USER fred@localhost identified with caching_sha2_password by 'pa$$w0rd';
Depends on: dolthub/vitess#387
Related to: dolthub/dolt#8496 -
2779: fix
AS OF
clause panic for certain expressions
We attempt to parse evalAS OF
expressions in the builder (because we assume it is going to be a literal), but Subqueries cannot be evaluated until after they have gone through the analyzer.
partially addresses: dolthub/dolt#8635 -
2776: bump mysql version
Certain tools expect a higher version of MySQL.
Currently, the latest stable version of MySQL is 8.4.4, but8.0.23
is the minimum needed to satisfy mydumper.
Additionally, this alters theversion()
method to select directly from the@@version
system variable.
related: dolthub/dolt#8592 -
2775: prevent creating and dropping
mysql
andinformation_schema
databases
fixes: dolthub/dolt#8621 -
2774: insert ignore to enum column truncates data
-
2773: Fixed error in setup found by user
-
2769: support
NO_AUTO_CREATE_USER
option insql_mode
Older MySQL 5.7 Docs; https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_no_auto_create_user
We already do this by default, so this is just parsing the option and does nothing. -
2767: Add support for
show slave status
This syntax is deprecated and will be removed in future MySQL versions (SHOW REPLICA STATUS
should be preferred instead). However, some tools (e.g. Dolphie, MyDumper) still rely on this deprecated syntax, so we're adding support for it to keep compatibility with those tools.
Depends on: dolthub/vitess#381 -
2766: add case for enum return types in multi-branch case statement
Hotfix for Enum conversion issue:
dolthub/dolt#8598
Doesn't address real issue, which revolves around conversion forCASE
. -
2765: increase column size for
information_schema.processlist.state
TheState
column in ourinformation_schema.processlist
table is different than the MySQL implementation because we include progress information. As a result, it is possible to exceed the 64 character limit that the column is defined as.
However, thius means deviating from MySQL's definition of the table. -
2764:
information_schema
uppercase rule doesn't apply toprocesslist
table
Selecting from information schema typically results in all uppercase column names, except for theinformation_schema.processlist
table, which follows the non info schema casing (matches the projection).
Relevant MySQL Bug: https://bugs.mysql.com/bug.php?id=84456 -
2763: Better error message for invalid charset string
This is a followup to the comments on this PR: -
2762: Add schema to StatsQualifier for schema databases
-
2761: add
innodb_buffer_pool_size
system variable
docs: https://dev.mysql.com/doc/refman/8.4/en/innodb-parameters.html#sysvar_innodb_buffer_pool_size -
2759: ignore empty strings in sets with multiple values
fixes: dolthub/dolt#8584 -
2758: Fix for latin1 issue
This potentially fixes:- dolthub/dolt#8580
There's a longer comment in the code itself, but the summary is that this is a character set conversion issue. Perhaps it's repertoires or something else, but hopefully this moves us closer to the logically correct behavior.
- dolthub/dolt#8580
-
2757: Refactor:
AuthServer
This change refactors theAuthServer
implementation in GMS to the newAuthServer
/AuthMethod
structure from dolthub/vitess#379
This is preliminary work to set us up to add support forcaching_sha2_password
authentication.
Related to: dolthub/dolt#8496
Depends on: dolthub/vitess#379 -
2756: Update README - fix db name when setting context
I'm pretty sure this needs to match withmemory.NewDatabase(dbName)
on line 135 -
2753: Fixes CTE issue with auth
This fixes the test introduced in:- #2745
Related PR: - dolthub/vitess#377
- #2745
-
2751: Fix drop view if exists for doltgres
-
2750: Fix alter NULL enum value panic
-
2749: Support create and drop view with schema name
-
2748: use
equal
method for scopecolumn
This fixesDoltgresType
comparison issue -
2747: Add schema name to definitions for views, triggers, procedures
-
2741: fix: store the hashed password to 'authentication_string' (to #2740)
This PR addresse the issue #2740.- resolve #2740
-
2739: Embed the
embedded.Tracer
into memTracer, to fulfil the updated interface
I was trying to use this projectgo-mysql-server
as a dependency alongside github.com/grafana/grafana-plugin-sdk-go/data. But the latter project depends on:- go.opentelemetry.io/otel v1.31.0
- go.opentelemetry.io/otel/trace v1.31.0
When I updated my go modules withgo mod tidy
I started to hit a compilation error:
# github.com/dolthub/go-mysql-server/test /Users/samjewell/go/pkg/mod/github.com/dolthub/[email protected]/test/mem_tracer.go:22:2: could not import go.opentelemetry.io/otel/trace/embedded (open : no such file or directory) FAIL github.com/grafana/poc-sql-expressions-embedded-sql/dolthub-go-mysql-server [build failed] FAIL
Hence this change 🙏
-
2738: support multi-inherit
syntax: dolthub/vitess#375 -
2736: use Equals method for sql.Types
The newDoltgresType
struct cannot be compared with==
, so should usesql.Type.Equals()
function. -
2734: always unwrap mysql priv db
fixes: dolthub/dolt#8530 -
2733: disabling read-comitted
-
2732: Get field string concat
Replacefmt.Sprintf
with string concat (+).BenchmarkSprintf-12 18748582 63.29 ns/op BenchmarkAddString-12 39934550 27.61 ns/op
-
2729: move
applyEventScheduler
logic andeventscheduler
to builder -
2727: fix dropped error in createView
-
2726: remove
resolveDescribeQuery
rule
This rule just recurses into the analyzer, so we should just do it right at the beginning, so we should do it right away and avoid calling all other rules.
This would be better/cleaner in the planbuilder, but we would need to tell the analyzer to not analyze this node or skip it somehow.
It's possible to move this logic intoanalyzeWithSelector
.
It might be more difficult to move this aboveAnalyze
, as we'd have to put this logic in a variety of places. -
2723: server/handle.go: doQuery cleanup of some error handling.
-
2722: internal/sockstate: Restore connection checking on Linux.
GMS server handler is supposed to cancel running queries if the connection which issued them goes away. It does this by checking the connection state out-of-band anytime the query is running and canceling theh query if the connection goes away. The connection checking code is platform-specific and currently only works on Linux.
In commit 538696b I introduced a bug where the connection checking code tries to inspect the socket state of an already closed file descriptor. This change fixes the behavior so that the file descriptor is left open until the necessary socket state is extracted. -
2721: Additional support for vector indexes.
This PR adds the vector function VEC_DISTANCE to GMS, as well as support for adding and altering vector indexes, and support forSHOW CREATE
on tables with vector indexes.
Vector indexes are not yet supported in Dolt. The corresponding version bump in Dolt will have checks preventing vector indexes from being added to Dolt tables. -
2720: move
wrapTriggerRollback
logic
This PR removes theplan.TriggerRollback
andplan.NoopTriggerRollback
node and applies thetriggerRollbackIter
directly.
When there are multiple nested triggers (aka triggers that call other triggers), we used to have multiple nested trigger rollbacks with different savepoints. It turns out that was unnecessary and a single savepoint at the top of the iter tree is enough; when there's an error, it propagates back to the top, rollingback all changes. -
2719: move
applyUpdateAccumulators
This PR deletes theplan.RowUpdateAccumulator
node, and directly injects anaccumulatorIter
into the rowexec tree infinalizeIters
.
Doltgres fix: dolthub/doltgresql#909 -
2718: use sqlCtx in handler
Tests here: dolthub/dolt#8493 -
2717: push
Distinct
nodes belowSort
nodes
This PR optimized queries like:select distinct i from t order by i;
When there are many duplicate values in column
i
, it is much more efficient to eliminate duplicates first, then sort the results.
There are somewhat unrelated optimizations left as TODOs.
Optimize query: dolthub/dolt#8488 -
2716: Update Parser interface documentation
Updates documentation for theParser
interface to document the requirement that implementations must returnvitess.ErrEmpty
for empty queries. Without this, GMS will not handle empty queries correctly.
Related to: dolthub/doltgresql#884 -
2715: remove exchange node
These aren't used anywhere, so it's getting removed.
Partition still exists if we ever want to reimplement some version of this. -
2714: remove
plan.QueryProcess
and move logic tofinalizeIters
This PR removes the use ofplan.QueryProcess
and part of thetrackProcess
rule that adds/removes this node.
Instead theTrackedRowIter
is created directly infinalizeIters
. -
2713: Support for schema names in foreign key definitions
Adds schema names in many places required for foreign key operation in databases with schemas (doltgres) -
2712: Planbuilder Authorization
- dolthub/vitess#372
What's missing: - SHOW commands aren't in yet
information_schema
doesn't restrict it's output yet- Need far more robust testing than what currently exists
I thinkSHOW
andinformation_schema
will probably have the same solution, which may be to continue doing what we were doing before. Besides that, pretty much every works according to our current tests (outside of the aforementioned missing items).
- dolthub/vitess#372
-
2710: Changing
selectExprNeedsAlias
to consider string literal quotes
When fixing a string literal quoting issue in Doltgres (dolthub/doltgresql#868), table functions stopped working, because theselectExprNeedsAlias
started returningtrue
to indicate that an alias was needed (even though it wasn't) and thebuildTableFunc
function would fail.
This started happening because the quoted string literals no longer matchedInputExpression
, sinceInputExpression
always trims off quotes when it is assigned. Another solution could be to expose thetrimQuotes
function from Vitess and use it to trimexpr.String()
before matching againstInputExpression
. -
2709: fix
LOAD DATA
64K buffer limit
This PR increase the buffer size forbufio.Scanner
toLongTextMax
, so hopefully nobody attempts to load a single row larger than 4GB.
Other changes:- moves ignore lines logic to within the
loadDataIter
- drops extra
scanner.Text()
logic - use byte comparison instead of string cast and string comparison
benchmark: dolthub/dolt#8467
fixes: dolthub/dolt#8469
- moves ignore lines logic to within the
-
2708: removing transaction committing node
doltgres fix: dolthub/doltgresql#872 -
2704: fix cascade for foreign keys with multiple references
This PR fixes an issue where foreign keys referencing the same column would fail to update the parent table (only on GMS).
Turns out we were adding the updater in the chain under the parent table's name instead of the child's name.
Additionally, has some small refactoring to tidy up the foreign key chain code.
fixes: #2671 -
2703: Minor changes for implicit prefix lengths in Doltgres
Minor changes to support dolthub/doltgresql#829 -
2702: fix: make SET system type case-insensitive
Resolves #2701 -
2700: compare and convert system types properly
Changes:IsType()
methods now account for system variable types,- coalesce type comparison accounts for system variable types
create table from select ...
statements account for system variable types
Note: MySQL lists@@admin_port
as Integer in their docs, but shows up asUInteger
in the CLI
fixes: dolthub/dolt#8448
-
2699: resolve column defaults for views
Properly display defaults for views.
Also fixes defaults for views with filters.
fixes: dolthub/dolt#8447 -
2697: expose planbuilder methods for domain support in Doltgres
-
2696: clear warnings better and separate warning count from actual warnings
This PR cleans up the logic surrounding warnings and clearing them.
The important part was separating the number of "new" warnings from the list of warnings themselves.
Every query should clear out the warnings from the previous query. The exception isshow warnings
, which only clears the "count" of warnings.
When a server runs a query that produces a warning, it immediately callsshow warnings
. Since theshow warnings
query itself should NOT clear the warnings, the warning count would always be > 0, and incorrectly indicate thatshow warnings
produced a warning. This causes an infinite loop in.NET
. Now, we always clear the warning count, and only clear the warnings themselves when the query is notshow warnings
.
We've also had this weird problem of having to double clear warnings, which this should address.
fixes: -
2695: add view column information to
information_schema.columns
This PR has information_schema reparse create view statements to fill in missing information for theinformation_schema.columns
table. There maybe small differences in column type for more complex views, but should be fine.
Implementing usingview.TextDefinition
would be slightly easier/cleaner, but for some reason its sometimes empty? Seems like something to do with fragments on dolt side.
fixes: dolthub/dolt#3168 -
2692: alter underlying enum values to preserve enum strings
fixes: dolthub/dolt#7472 -
2691: Bug fix for DESCRIBE with a schema name
-
2690: only check references for updated columns
fixes: dolthub/dolt#2690 -
2689: Support for schema name in SHOW TABLE statements
This provides support forshow tables from public
andshow table from mydb.public
in postgres dialects. -
2685: unbump protobuf
-
2683: use global parser for engine
-
2681: Add support for setting/dropping column type and nullability
Allows altering column type and a column's not null constraint without having to respecify the full column definition. Needed for Postgres' more modularALTER TABLE
syntax.
Dependent on: dolthub/vitess#367 -
2679: sql/rowexec: merge_join.go: Fix dropped error in incLeft and incRight.
This fixes a bug where certain joins were not able to be canceled in a timely manner by canceling their Context. -
2678: Support information_schema views/tables hooks for doltgres
-
2676: Optimization: Defer Projections for Server Queries
This PR speeds up spooling queries from the server to client when there is a top level projection.
Changes include:- remove unnecessary double allocation of NewROw
- defer projections until RowToSQL to avoid one extra allocation of
sql.Row
Additionally, this PR cleans up and refactors some code surrounding projections.
-
2674: refactor
resultForDefaultIter
Reorganizes some code, and removes one unnecessary sql.Row allocation.
Tiny performance improvement. -
2673: Small Refactor to
pruneTables
-
2672: More flexible error checking in enginetests
-
2669: implement
validate_password_strength()
MySQL Docs: https://dev.mysql.com/doc/refman/8.4/en/encryption-functions.html#function_validate-password-strength -
2668: implement
compress()
,uncompress()
, anduncompressed_length()
MySQL Docs:- https://dev.mysql.com/doc/refman/8.4/en/encryption-functions.html#function_compress
- https://dev.mysql.com/doc/refman/8.4/en/encryption-functions.html#function_uncompress
- https://dev.mysql.com/doc/refman/8.4/en/encryption-functions.html#function_uncompressed-length
The library we are using iscompress/zlib
, which is slightly different than the MySQLzlib
implementation. As a result, the actually compressed data is similar but not equivalent. However, this library is still able to uncompress any MySQL compressed data.
There isczlib
, but it is not actively maintained and might requirecgo
.
-
2667: Fix str_to_date function
STR_TO_DATE
function cannot parse "%Y%m%d".
I mentioned it in the issue #2666 -
2665: Fix FunctionalDependencies for NonUnique, NonNull indexes on Server
When using the server engine, we return an error for indexes defined over non-unique not null columns.
This meant that filters over these columns would incorrectly return error when there were duplicate entries.
Oddly, this only happens on server engine and not using dolt sql-shell directly.
The bug stems from a missing check when gathering functional dependencies for equalities.
Related: dolthub/dolt#8365 -
2664: Updated go-icu-regex
-
2663: [stats] convert binary to string for stats bound rows
tests on Dolt side for round-trip serialization: dolthub/dolt#8361 -
2662: implement
charset()
function
MySQL Docs: https://dev.mysql.com/doc/refman/8.4/en/information-functions.html#function_charset -
2661: Example in-memory Vector index using the existing index APIs.
This expands the index interfaces to make it possible to have vector indexes, and demonstrates it with a proof-of-concept in-memory index. It's a rough implementation with some shortcomings: for instance, it doesn't currently handle tables that consist of multiple partitions.
However, this showcases how to use the GMS interfaces to add a vector index. -
2657: support
get_format()
MySQL Docs: https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_get-format -
2656: handle user variables in load data
This PR adds support for assigning user variables within aLOAD DATA
query, and referencing user variables withing an expression forSET
clauses within the sameLOAD DATA
query.
Additionally, this PR refactors much of theLOAD DATA
code.
We also movedGetField
indexing out of iterator and into analyzer
fixes: dolthub/dolt#8307 -
2653: support
SET <col> = <expr>, ...
clause inLOAD DATA
partial support for: dolthub/dolt#8307 -
2652: Add
sql.ContextProvider
interface
Doltgres needs a way to get asql.Context
in order to call some functions. This change adds a newsql.ContextProvider
interface and hasHandler
implement it.
See dolthub/doltgresql#685 for the code that needs to call this new interface. -
2651: fix collations with recursive hash for
HashInTuple
expressions
When hashing tuples, we don't take into account the collation of the individual elements, which results in incorrect comparisons for case insensitive collations.
The fix here was to recursively hash each individual element in a tuple, and hash the slice of hashes.
fixes dolthub/dolt#8316 -
2650: conflict in
EqualityRangeBuilder
Fix bug in equality range builder where duplicate equality filters to one value would stomp each other.
fixes: dolthub/dolt#8317
dolt-side: dolthub/dolt#8322 -
2648: use sqlparser.Expr rather than querypb.BindVariabl
-
2645: Fix revision databases not showing up for schema databases
-
2644: Fix Integration Workflow
This makes a few changes to how the integration workflow is handled:main
is now merged into this PR before testing. Previously, we would just use the PR as-is, which created a situation where themain
branch included changes that the PR had not yet merged in. This created an issue as Dolt and DoltgreSQL would expect some changes to be present that were not yet merged into the PR, causing compilation errors. By mergingmain
, we bypass this. In addition, the workflow will automatically pass if a merge conflict is detected. I think this is fine, since conflicts must be resolved before the PR can be merged anyway. This does mean that some errors may not be caught for as long as merge conflicts againstmain
exist.- We only pulled comments, and the PR description does not count as a comment. This made it seem a bit inconsistent with how PR detection was handled. This has now been added, and we're now doing a basic string search instead of using a JSON parser, as concatenating the comments and description does not result in a valid JSON object.
- Workflow should automatically run when a comment is added, modified, or deleted. This was already supposed to happen, but the event structure differed between a comment and push in a subtle way, causing the workflow to immediately error and for the UI to continue displaying the previous run. This made it seem as though the workflow did not respond to comment updates.
- Additional logging messages have been added, so it's easier to debug if something goes wrong in the future.
-
2641: Correctly handle indexes on virtual columns
Fixes dolthub/dolt#8276
Lots of small behaviors around virtual columns were not working correctly:- Adding an index on a virtual column triggered a table rebuild even when this wasn't necessary
- Rebuilding a table that contained virtual columns could lead to incorrect results
- Inserting into a table with a virtual column could update indexes incorrectly
- Adding a generated column to the start of a table could lead to incorrect results
This PR adds tests for these cases and fixes them by tweaking the logic for projections on tables with generated columns.
-
2638: Change ranges to an interface
-
2637: Adding a test combining duplicate indexes through
create table
andalter table
Additional test case for #2634 -
2636: Add additional engine tests for indexes on generated columns.
These tests verify the correctness of dolthub/dolt#8273.
Some of the checks in these tests are skipped because they run afoul of dolthub/dolt#8275 and dolthub/dolt#8276 -
2635: Fix multi-statements in nested triggers
changes:- have different savepoint names
- this fixes nested triggers overwriting save points and clearing the same savepoint
- handle aliases independently for each statement in trigger blocks
- sync up prepend and scope nesting for triggers
- wrap
applyTrigger
rule wrapstriggerExecutor
s over individual statements inBeginEndBlock
s - this prevents wrapping
triggerExecutor
s over the wrong statements (not matching event or table)
related: dolthub/dolt#8213
-
2634: Adding tests for supporting duplicate secondary indexes
New tests asserting that multiple indexes over the same set of columns can be created on tables.
dolthub/dolt#8274 fixes Dolt for these tests to pass. -
2633: handle view resolution when the schema is not defined in SELECT stmt
-
2631: include parent scopes in
update
anddelete
triggers
partially fixes dolthub/dolt#8213 -
2629: normalize column defaults
This PR adds a new analyzer rule to normalize literal column default values.
This rule ensures that the default value is consistent for the column type (float defaults over int columns are rounded properly).
It does this by evaluating the column default, and placing that into a NewLiteral of the proper type.
Additionally, this ensures that dolt serialization receives consistent values (normalized floats and proper types).
fixes: dolthub/dolt#8190 -
2628: Aggregation/range scan perf edits
- Resolving aggregation functions checks the integrator function provider first. The interface required us to create an expensive error object to resolve any standard functions. Now we only create the error object if a function name is not found in integrator and standard list.
- Skip duplicate unused functional dependency tracking in coster.
- Aggregations with no group by and single scopes can only return one row, so use the
sql.QFlagMax1Row
shortcut.
benchmarks here: dolthub/dolt#8241
-
2627: Fix error when comparing incompatible types in
IndexLookup
s
When building lookups forIndexedTableAccess
, we always convert the key type to the columns type.
This is problematic when the key can't be converted to the column type without error.
The expressions used inFilters
properly handle this conversion, so we should default to that.
Example:tmp/main*> create table t (i int primary key); tmp/main*> select * from t where i = json_array(); error: '[]interface {}' is not a valid value type for 'int'
This doesn't errror in MySQL. Also without a primary key or secondary index, the query succeeds in dolt.
-
2625: Bug fix: the
timestamp
function should convert to adatetime
type
MySQL'stimestamp
function, despite its name, actually returns adatetime
type and not atimestamp
type.
MySQL example:mysql -uroot --protocol TCP -e "select timestamp('1000-01-01 00:00:00');" --column-type-info Field 1: `timestamp('1000-01-01 00:00:00')` Catalog: `def` Database: `` Table: `` Org_table: `` Type: DATETIME Collation: binary (63) Length: 19 Max_length: 19 Decimals: 0 Flags: BINARY +----------------------------------+ | timestamp('1000-01-01 00:00:00') | +----------------------------------+ | 1000-01-01 00:00:00 | +----------------------------------+
Note: We still need to add support for the second, optional parameter to
timestamp()
.
Customer issue: dolthub/dolt#8236 -
2624: Use ctx.Done() as a faster check for ctx.Err()
The err call is noticeable for queries that read a lot of rows. -
2623: Fix anti-join correctess bug
We had some strange logic for accepting a join anti-match, ripped it out and everything seems to be working correctly now. -
2621: implement
icu_version
function
MySQL Docs: https://dev.mysql.com/doc/refman/8.4/en/information-functions.html#function_icu-version -
2620: implement
name_const
function
MySQL docs: https://dev.mysql.com/doc/refman/8.4/en/miscellaneous-functions.html#function_name-const -
2619: Assume text index comparisons are exact
We currently do not eliminate filters of the formcolumn(VARCHAR) = text literal (longtext)
when pushing filters into index lookups. The safety check is necessary at least for datetimes, spatial/fulltext and partial TEXT indexes. It's not clear whether it is necessary for full varchar indexes.
dolt side seems OK: dolthub/dolt#8218 -
2618: More aggressively elide IN filters used for indexed lookups
re: dolthub/dolt#8215 -
2617: More QueryProps, missed max1rowiter usage
-
2616: fix insert id
The logic setting the InsertID in OkResult, did not match results returned fromlast_insert_id()
.
This was made apparent due to changes from #2614.
For a single insert statement, MySQL sets InsertID exactly once when the AutoIncrement on the column is first triggered.
While the linked PR fixes that issue and properly sets the session variable, ourinsertRowHandler
(which is responsible for returning OkResult structs) was setting InsertID incorrectly.
The fix is to just read the LastInsertID from the session, since it is already set to the right value. -
2615: faster status updates
System variables can be session, global, or both.sql.IncrementStatusVariable
is a helper method that primarily helps the "both" category increment the global and session counters for certain variables.Threads_running
is a global only variable that is incremented/decremented every begin/end query, and gets a lot of traffic. The old code usedsql.IncrementStatusVariable
to incrementThreads_running
, which was a particularly expensive way to increment a global var because (1) we'd make a new error for every call to the session updater, and (2) the extra map lookup is unnecessary. We don't do the extra map lookup now, and we weren't using the error return so I removed the return variable.
Note: this also refactors status variables to be explicitly initializated in the engine
bump/perf here: dolthub/dolt#8189 -
2614: update
LAST_INSERT_ID
when auto incrementing fromempty
,NULL
, andDEFAULT
Our logic for determining whether or not we needed to update last insert id only looked at the insertSource schema.
This does not take into considerationempty
,NULL
orDEFAULT
values.
Additionally, the value that last insert id is set to depends on what the auto increment value will be.
This PR addresses those issues.
Also, has some refactoring for readability.
fixes: dolthub/dolt#7565 -
2613: Query properties rule filtering
Edit most of the analyzer interfaces to pass a new context object that accumulates query specific properties. Currently the object is calledQueryFlags
, and accumulates information about the query to inform better rule filtering and more efficient spooling strategies.
The change that has the biggest effect onoltp_point_select
perf is thesql.QFlagMax1Row
setting, which lets us skip the default results iter boilerplate when we're only returning one row. Added a couple other skips for rules that are easy to whitelist correctly and show prominently on CPU profiles, like aggregations and subqueries. -
2610: use datetime precision in information_schema.columns.datetime_precision
When determining if a schema change occurred, one of the tables Prisma looks at is theinformation_schema.columns
.
Here, we incorrectly mark alldatetime
andtimestamp
columns as with a precision of 6.
If a table has typeDATETIME(3)
, Prisma would think there was a schema change, and perform a migration when one isn't needed.
This PR addresses this issue by having theinformation_schema.columns
table accurately reflect the datetime preicision of the columns,
fixes: dolthub/dolt#8173 -
2609: fix output type for
DateAdd()
andDateSub()
functions
The output ofDateAdd()
,AddDate()
,DateSub()
, andSubDate()
, changes if the input is a properly formatted string vs a date/datetime/time/timestamp.
fixes: dolthub/dolt#7304 -
2606: [memo] assume self-join stats cardinality continuity
Self-join stats estimation is particularly expensive because all of the buckets exactly overlap. If the index is unique, the cardinality distribution will not change. If the index is non-unique, the cardinality will expand proportional torowCount/distinctCount
.before BenchmarkOltpJoinScan-12 1766 694524 ns/op 462834 B/op 8240 allocs/op after BenchmarkOltpJoinScan-12 2460 481166 ns/op 193569 B/op 7129 allocs/op
sysbench perf here: dolthub/dolt#8159
-
2605: add lock to prevent warnings from being cleared
This PR adds two functions to BaseSession that toggle a boolean, so integrators can prevent warnings from being cleared.
This is mostly useful fordolt sql shell
.
addresses dolthub/dolt#8016 -
2604: Index searchable edits
We previously added support for integrators choosing their own indexes with ansql.IndexSearchable
interface. This was for a customer use case. This PR expands the interface to let Dolt cache information about strict key lookups.
The motivation is that (1) strict key lookups will always be the best-case scenario result of index costing, (2) caching this information in-between ALTER statements is usually a long enough lifecycle for the overhead to be worth it.
I added a streamlined range builder as part of this optimization that only accepts literal values in the order expected by the target lookup. The user of this range builder takes responsibility for feeding the values in the correct order. As a result, we sidestep expensive string formatting, map creating, and map lookups during range building.
Follow-on fixes to functional dependencies permuted plans a bit more. Inner joins are chosen more frequently in some of our test plans now that we're reflecting strict key max-1-row cardinalities. -
2603: [memo] reorder should add new plans to intermediate expr join child
There was a bug where we'd add reordered join plans to project or distinct nodes, rather than their join children. Code comment explains more clearly how this works. -
2602: disallow forward slash in database name
fixes dolthub/dolt#8126 -
2601: Fixed error in converting panics to errors
-
2600: Support information_schema.columns hook for doltgres
-
2598: Adding the
binlog_expire_logs_seconds
system variable
MySQL@@binlog_expire_logs_seconds
reference docs -
2596: Fix information_schema.columns for databases with schemas
Missed this table in my original PR -
2595: Adding user name and host length validation to CREATE USER
This change matches MySQL's behavior of limiting user names to 32 chars and host names to 255 chars. Attempting to create a user with a name or host longer than that limit now returns a validation error.
Customer issue: dolthub/dolt#8120 -
2594: Fixed REGEXP
This fixes the case-sensitivity issue found in: dolthub/dolt#8117
Although we had movedREGEXP_LIKE
to the ICU engine, we forgot to also moveREGEXP
, which is a synonym forREGEXP_LIKE
according to the docs. This makes that change, and also completely removes all remnants of the old regex code. -
2593: custom row exec
Additions for custom row operators on Dolt side: dolthub/dolt#8072 -
2591: Bug fix: Fix
@@binlog_row_metadata
, add@@binlog_row_image
Fixes an error in the definition for the@@binlog_row_metadata
system variable that prevented it from being queried. Adds the@@binlog_row_image
system variable that was missing. -
2590: Fix databases iter for information_schema tables
Doltgresinformation_schema
tables include the schemas for the current database, not all databases -
2588: Only register
version
function if it doesn't already exist -
2587: Fix information_schema catalog and schema names
We had already made this change for doltgres here, this just applies it to moreinformation_schema
tables -
2583: [stats] Disable histogram bucket merging for now because it mutated shared memory
Merging buckets in the current format is unsafe:- we collect statistics for an index where two buckets have overlapping values
- we execute a join using the index with overlapping values, and use a merge algorithm to combine those buckets. The merged bucket is synthetic, but the statistics used for the join is also synthetic, so this all works as expected.
- a future indexscan selects the compressed range from before, accessing one of the synthetic buckets created by the join
- we error
invalid bucket type: *stats.Bucket
at the end of the indexscan when adding the filtered histogram with a synthetic back to the implementor-type statistic
EditedmergeOverlappingBuckets
to not share memory, but also I'm not sure if merging buckets is a common performance win in most cases, so disabling for now
-
2581: [stats] populate types for nil zeroing
-
2580: Remove a duplicate column from information_schema
Just what it says on the tin. This duplicate column causes problems for DuckDB when attempting to connect to doltdb databases. -
2578: Catch panics in listener
We had several goroutines in which panics would crash the server process. These were being triggered by panics due to errors in doltgres. Added a recover block to each goroutine. -
2577: calling JSON_EXTRACT and JSON_VALUE with a path that has an out-of-bounds array access should return SQL NULL, not an error.
The jsonpath module returns an error when performing a lookup with an out-of-bounds array index. We need to capture that error and return nil for the lookup operation instead. -
2576: fix type and precision for
unix_timestamp
builds off of: #2573 -
2572: fix for
table_catalog
forinformation_schema.tables
-
2571: Reverting
performance_schema
database
After working more with the customer, we identified that the tool (Flyway) was actually gracefully handling theperformance_schema
query errors, and the real cause of the incompatibility was a different issue. This reverts commit 9df14f8, which stubbed out an emptyperformance_schema
database. -
2570: Added infoschema to privilege check
This fixes: dolthub/dolt#8052
In the analyzer, we make a check to determine if we're querying the information schema. The queries provided in the issue that do not work are regarded as subqueries, and these are explicitly ignored. This causes the privilege checker to look for the information schema tables by name, which is not the intended behavior.
This PR just adds an additional information schema check at a lower layer, which should remove the inconsistencies found from the queries provided in the issue. -
2569: add SchemaName to DatabaseSchema interface
This method returns schema name. Schema name for Doltgres and database name for Dolt. -
2568: Feature:
performance_schema
Stubbing out the start of theperformance_schema
database.
Customer Issue: dolthub/dolt#8051
Dolt PR: dolthub/dolt#8061
Dolgres PR: dolthub/doltgresql#424 -
2567: Add support for
SHOW BINARY LOGS
When theSHOW BINARY LOGS
statement is executed, GMS will invoke the registeredBinlogPrimaryController
to ask it for the list of binary logs and send them back to the client. -
2566: Renamed index functions and enums to be public
This renames theindexScanOp
enum so that it's accessible from outside the package, and also replaces the type switch innewLeaf
with a replaceable function that can be overridden from outside the package to support types that are not native to GMS. -
2565: When a subroutine (like
CREATE PROCEDURE
contains a subqeury, correctly index into it.
Fixes dolthub/dolt#8028
We didn't have tests for constructs containing nested subroutines (likeCREATE PROCEDURE foo() CREATE PROCEDURE bar() SELECT 1;
This PR adds tests for those, but tests where we don't currently match MySQL are disabled. There's enough enabled tests to show that statements like this no longer cause a panic.
Making sure that we match MySQL for these statements should be done in a follow-up. -
2564: validate all string types for unicode
MySQL throws errors on invalid utf8 encoded strings. A previous PR detected those, but only for[]byte
string conversions. Prepared statements receive the string parameters as astring
type, so this PR moves the check for all conversions.
Additionally, it adds bindings toAssertErr
andAssertErrWithCtx
methods.
related pr: #2562
dolt pr: dolthub/dolt#8060
fixes dolthub/dolt#8040 -
2563: small changes to stats bucket counting
Joins don't track output MCVs anymore, they aren't in a format where they'd be useful anyways. Also assume MCVs are sorted for faster matching. -
2562: throw error on invalid utf8 encoding for strings
fixes dolthub/dolt#8040 -
2561: Fixes unexpected timezone converting when passing TIMESTAMP to unix_timestamp()
see #2111 -
2560: fix
GetField
indexes forUpdateJoin
withUpdate Trigger
This PR addresses an issue where we were incorrectly assigningGetField
indexes to anupdate join
query.
The fix involved:- adding a case for
triggerIters
torowUpdateAccumulator
- not picking
ResolvedTable
references underSubqueryAliases
when there are multiple - correctly setting the scope node for update joins
fixes dolthub/dolt#7943
- adding a case for
-
2559: Implement support for
DECLARE CONTINUE HANDLER
Fixes dolthub/dolt#7971
Previously, we would always terminate a block when encountering an error, even if there's a matching handler. Additionally, there was no mechanism to resume an error that happened inside aLOOP
construct.
This correctly implementsDECLARE CONTINUE HANDLER
by making the following changes:- Checks for handlers while executing the
Block
node instead of theBeginEnd
node. - For
DECLARE EXIT HANDLER
, theBlock
returns a special error value that propagates to the containingBeginEnd
node in order to terminate just that node.
- Checks for handlers while executing the
-
2556: Compute GetField indexes in procedure if-conditions.
Fixes dolthub/dolt#7994
It seems like we aren't running theassignExecIndexes
analysis pass on if-conditions when invoking stored procedures, which can cause execution failures if the condition has a sub-expression that has aGetField
node.
Fixing this revealed a related issue: when constructing the scope of the if-condition to determine the correct indexes, we were incorrectly including any columns from the condition's body in the scope. This was also causing incorrect index calculations forGetFields
in the if-condition, and is also fixed here.
(This only affected conditionals in stored procedures, not conditionals in triggers, because the analysis has a separate execution path for each;analyzeProcedureBodies
is not called for triggers.) -
2555: [stats] simplify stats comparison and mcv logic
Lazier comparison logic. Skip promoting/converting types when the index types match.
Remove an expensive and seemingly unnecessary bucket compression step that was re-evaluating mcvs. -
2552: support
VALUES
statement
fixes dolthub/dolt#8012
syntax: dolthub/vitess#354 -
2551: unwrap parenthesized table references
fixes dolthub/dolt#8009 -
2546: Add support for tracking the
Aborted_connects
status variable
Adds support for MySQL'sAborted_connects
status variable.
Depends on: dolthub/vitess#351 -
2542: When casting json to a string, always call StringifyJSON.
This ensures we match MySQL.
We previously weren't calling StringifyJSON inConvertToString
because that same method was being used when printing JSON to the screen or a MySQL client, which favored speed over matching MySQL exactly. But for casts we must be precise.
By adding an extra case toStringType.SQL
we can distinguish between these cases and handle them properly. -
2541: resolve default values for views
This was somewhat of a regression caused by #2465.
However, before that PR views always hadNULL
as their default values, which did not match MySQL.
Now, we just resolve the default values in the schema, similar toResolvedTables
.
fixes dolthub/dolt#7997 -
2540: [planbuilder] More update join table name validation
-
2539: fix
UPDATE IGNORE ... JOIN
fixes: dolthub/dolt#7986 -
2537: Update generated index names to match MySQL
A customer pointed out that when we add indexes with generated names, we don't generate the same names as MySQL. Specifically:- When a FK is added with an explicit constraint name, that name should be used to name the automatically created index, if one is created.
- Secondary indexes are named after the first column in the index in MySQL, not by joining all the columns together.
Customer issue: https://github.com/dolthub/dolt/issues/7960 Dolt PR with test fixes: https://github.com/dolthub/dolt/pull/7974 -
2536: Rename generated FK names when their table is renamed
Updates ourrename table
logic to match MySQL's behavior of updating auto-generated foreign key names to match the new table name.
Customer issue: dolthub/dolt#7959
Dolt companion PR: dolthub/dolt#7968 -
2535: Fix UPDATE JOIN matchedRows
fixes: dolthub/dolt#7957
Main question is how thorough we want to make the child iter check. Should all iterators implement aChildIter
interface? -
2534: Implement row alias expressions (
INSERT ... VALUES (...) AS new_tbl ON DUPLICATE x = new_tbl.x
)
When inserting values, the user can specify names for both the source table and columns which are used inON DUPLICATE
expressions. It looks like either of the below options:INSERT INTO tbl VALUES (1, 2) AS tbl_new ON DUPLICATE KEY b = tbl_new.b; INSERT INTO tbl VALUES (1, 2) AS tbl_new(a_new, b_new) ON DUPLICATE KEY b = b_new;
This replaces the previous (now-deprecated) syntax:
INSERT INTO tbl VALUES (1, 2) ON DUPLICATE KEY b = VALUES(b);
Supporting both syntaxes together was non-trivial because it means there's now two different ways to refer to the same column. While he had an existing way to "redirect" one column name to another, this only worked for unqualified names (no table name), and it overrode the normal name resolution rules, which meant we would fail to detect cases that should be seen as ambiguous.
Previously, we would implement references to the inserted values by using a special table named "__new_ins". I implemented this by keeping that as the default, but using the row alias instead of one was provided. We then create a map from the destination table names to column aliases, and use that map to rewrite expressions that appear inside the VALUES() function. -
2533: Table name validation folds strings
fixes: dolthub/dolt#7958 -
2532: Move
json_function_tests.go
and json tests that depend on it to their own package.
This ensures that non-test code insql/expression/function/json
doesn't depend ontestify
, which is a library that we only want to depend on for tests. -
2531: Bug Fix: Finalize subqueries in
IfConditional
s when applying stored procedures
When applying a stored procedure to aCALL
statement, we weren't callingfinalizeSubqueries()
on any subqueries inIfConditional
expressions, which caused the subquery to not have aNodeExecBuilder
populated.
Customer issue: dolthub/dolt#7944 -
2530: Bug Fix: Index name case-insensitivity
MySQL index names are case-insensitive, but GMS' memory implementation wasn't handling them that way. This makes index names case-insensitive.
Related to dolthub/dolt#7945 -
2529: Fix global decimal.MarshalJSONWithoutQuotes overwrite
Thedecimal.MarshalJSONWithoutQuotes
is a global variable.
By setting this value then this can cause problems with any other code that does not expect this value to be changed.
Instead using a custom encoder to ensure that the marshalling behaviour is as expected without changing the global value ensure that this will not cause compatibility issues with other projects.
This code is covered both by existing tests, and an additional one in this PR.
(if the custom encode switch case is not added, but the global variables are, then the tests fail). -
2528: Bug fix for unwrapping a privileged db
-
2524: Adding
@@max_binlog_size
system variable
https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#sysvar_max_binlog_size -
2523: Added additional analyzer hooks for integrators
-
2522: More INSERT short-circuits
Only run an "on update" code block when expressions are non-nil. Directly compare sql mode default string, rather than lowercasing every time. -
2520: Default sql mode for common path
Bit strange & verbose, but has a noticeable effect for small queries.
perf here: dolthub/dolt#7915 -
2519: IndexedTableAccess gets indexing fast path
-
2518: Short circuit for update/delete
Simple updates and deletes skip most of analysis.
perf here: dolthub/dolt#7907 -
2517: Improve correctness and error messages for JSON functions.
MySQL doesn't do this and neither should we.
MySQL:mysql> select JSON_INSERT("null", "$.a", 1); +-------------------------------+ | JSON_INSERT("null", "$.a", 1) | +-------------------------------+ | null | +-------------------------------+ 1 row in set (0.00 sec) mysql> select JSON_INSERT("null", "$.a", 1) is null; +---------------------------------------+ | JSON_INSERT("null", "$.a", 1) is null | +---------------------------------------+ | 0 | +---------------------------------------+
The only time we should be coercing a JSON-null document into SQL-null is for JSON_EXTRACT (for paths other than "$") and JSON_VALUE (for all paths). But these are already handled separately.
-
2515: Zachmu/schemas2 merge
-
2513: Added workflows for checking integrators
This adds a new workflow that runs a subset of tests in Dolt and DoltgreSQL to check for any major integration errors. The workflows do not fail if errors are encountered. Instead, they'll create a comment stating which projects had failures. If no failures were found, then no comment is made. -
2512: Spooling shortcut for one/zero return schemas
Nodes that return zero or one row don't need a beefy channel/wait group setup to execute. They just need to grab the first row and close the iterator. There are several nodes that incorrectly reported their schemas previously, which I've updated to be more accurate. There are some nodes that optionally return rows, which I've simplified to return an empty schema that can be differentiated from the nil schema. We could make the distinction more explicit, also.
bump with perf here: dolthub/dolt#7894 -
2511: Adding mapping to error code 1049 for
ErrDatabaseNotFound
errors
When a database doesn't exist, MySQL returns error code 1049. This change adds a mapping to error code 1049 forErrDatabaseNotFound
errors, and updates our handler so thatComInitDB
messages will map errors to MySQL error codes.
This is needed because tooling (e.g. Pomelo EntityFramework MySQL library) can rely on this error code in application logic.
Related to dolthub/dolt#7890 -
2510: Fix race errors with memory tables
We use this library for running our tests. These are run with the-race
flag - and we are seeing some errors related to concurrency and updating of the tables map.
I've added async.Mutex
to all the places where this map is updated - our tests are now passing :) -
2504: Added InjectedStatement as an AST node
This is the same asInjectedExpr
, except for statements instead of expressions. -
2502: Use Uint32 for SEQ_IN_INDEX in 'SHOW INDEXES' queries.
This is seemingly the correct type for this field.
MySQL Connector/NET expects this for servers >8.0.1: https://github.com/mysql/mysql-connector-net/blob/8.4.0/MySQL.Data/src/SchemaProvider.cs#L298-L300
Fixes #2501 -
2499: fix
LIKE NULL
edge case
This PR fixes an edge case whereSELECT <str> LIKE NULL
should returnNULL
instead offalse
. -
2498: New interfaces for resolving table names for databases with schemas
This is a proof of concept to get schema resolution working quickly, and I'm not super happy with the separation of concerns. A better solution would implement table name resolution in the Catalog directly, rather than in the integrator. That effort is significantly hindered by the Catalog being a concrete analyzer implementation with many analyzer-specific details that can't be easily substituted for another implementation. The longer term plan is to perform the extensive refactoring necessary to make the relevant parts of the Catalog swappable, rather than (effectively) having to swap only DatabaseProvider and friends. -
2497: trim whitespace when converting strings to numbers
fixes dolthub/dolt#7854 -
2495: fix panic in
VALUES
constructor
When the number of rows in a... VALUES ROW(...), ROW(...)
statement were not equal, we would throw a panic.
This PR also unskips some tests that are now fixed.
Companion PR: dolthub/dolt#6849
fixes: dolthub/dolt#6849 -
2494: Replace count star also matches single column pk
-
2493: Implement status variables for
Slow_queries
,Max_used_connections
,Com_select
, andConnections
Adds support for four new status variables:- Slow_queries
- Max_used_connections
- Com_select
- Connections
Note thatConnections
currently only reports the successful connection attempts, but MySQL includes all connection attempts in that status variable. To capture the failed attempts, we'll need to expose that information from the Vitess layer.
Also removes a mutex that was covering the whole scope over all status variables. Now that each individual status variable has a value that uses anatomic
instance, we don't need to synchronize at a larger scope.
Related to dolthub/dolt#7646
-
2492: skip source values analyze when it only contains simple types
-
2491: ValidateInsertColumns avoids allocating hash map
-
2490: Avoid escaping HTML when Marshalling JSON
Due to a misconfiguration, HTML characters were being escaped when marshaling JSON. This is unnecessary, and since we now potentially display marshalled JSON to the user, we shouldn't be doing this. -
2488: System Variables: Add
log_bin
and change the default ofperformance_schema
Thelog_bin
system variable controls whether a MySQL server logs to the binary log or not.
Theperformance_schema
system variable was previously defaulted to1
, to match MySQL's default, but this can cause tools (e.g. Datadog) to believe that theperformance_schema
system tables are available, and then error out when trying to query them. Since we don't provide aperformance_schema
database, the new default for theperformance_schema
system variable is0
. -
2487: Expand literals in comparisons when safe
-
2486: add parser interface in engine
This PR createssql.Parser
interface. This interface is defined in the engine and it should be used rather than using mysql parser directly.
AddedGlobalParser
variable to expose Doltgres parser for parsing view definition for now. It can also be used in places that needs doltgres-specific syntax parsing. -
2485: Have LazyJSONDocument implement fmt.Stringer and driver.Valuer, in order to interoperate with other go SQL libraries.
-
2482: don't check schema compatibility for
select into
fixes: dolthub/dolt#7781
-
2481: aliasExpr inputExpression case-insensitive check
This PR adds case-insensitive check foraliasExpr.InputExpression
andexpr.String()
to determine the expr is referencable. -
2473: remove re-parsing select stmt in create view stmt
Depends on dolthub/vitess#339 -
2470: Add
LazyJSONDocument
, which wraps a JSON string and only deserializes it if needed.
This is the GMS side of dolthub/dolt#7749
This is a newJSONWrapper
implementation. It isn't used by the GMS in-memory storage, but it will be used in Dolt to speed upSELECT
queries that don't care about the structure of the JSON.
A big difference between this andJSONDocument
is that even after it de-serializes the JSON into a go value, it continues to keep the string in memory. This is good in cases where we would want to re-serialize the JSON later without changing it. (So statements likeSELECT json FROM table WHERE json->>"$.key" = "foo";
will still be faster.) But with the downside of using more memory thanJSONDocument
) -
2469: refactor index validation and prevent indexes over json columns
This PR consolidates the logic to validate if an index.
Additionally, it fixes a bug wherecreate table t (i int, index (i, i));
was allowed.
fixes: dolthub/dolt#6064 -
2468: Remove session mutex, status variables atomic inc
Avoid concurrent ctx use in doQuery. I standardized the session status variables to uint64 to make it easier to have them be atomics. The return types for these variables in MySQL seem to be strings, so switching them from int640->uint64 seems OK. We could always change the presentation layer later. Removing the session mutex appears to be safe after these changes. -
2467: improve
DESCRIBE
columns and auto_increment bugs
This PR makes it so our DESCRIBE statement is more correct when there are unique keys.
There's an edge case we miss:create table t (i int not null, j int not null, unique key (j), unique key (i)); describe t;
In MySQL,
UNIQUE KEY j
is created beforeUNIQUE KEY i
, so describe makesj
asPRI
.
In Dolt, we store indexes in a map and return them in order of index name, so we marki
asPRI
.
There are skipped tests for this
Additionally, this adds theNOT NULL
constraint to columns that are markedAUTO_INCREMENT
.
partially addresses: dolthub/dolt#2289 -
2466: Schema-qualified table names
This PR also fixes a couple unrelated issues:- IMDB query plans are brought up to date (this is most of the change lines)
- Fixed bugs in certain show statements (information_schema tests)
-
2465: projection schema finds default values
fixes: dolthub/dolt#6016 -
2464: skipping auto_increment on error tests
I was wrong, this is very broken in dolt.
Could not find a quick fix, so skipping tests to unblock auto-bumps.
reopens: dolthub/dolt#3157 -
2463: Update GMS to detect INSERT statements with row alias and return error.
We parse these statements but don't yet support them. So for now we return a helpful error. -
2461: tests for
auto_inc
with error
Hard to tell which PR or combination or PRs fixed this, but this issue no longer reproduces in GMS or Dolt.
This PR just adds an explicit test case for it.
fixes dolthub/dolt#3157 -
2460: implement
last_days()
mysql docs:
https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_last-day -
2458: add table comments to
information_schema.tables
fixes dolthub/dolt#2389 -
2456: support
to_days()
andfrom_days()
This PR implements theTO_DAYS
andFROM_DAYS
MySQL functions.
Initially, I attempted to use the existing logic from Add/Sub Date along with the 0 year, but there were overflow issues.
So, there's a skipped test for Intervals.
to_days
docs:
https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_to-days
from_days
docs:
https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_from-days -
2455: Feature:
gtid_subtract()
function
Adds support for MySQL'sgtid_subtract()
built-in function:
https://dev.mysql.com/doc/refman/8.0/en/gtid-functions.html#function_gtid-subtract -
2452: Adding support for the
SHOW BINARY LOG STATUS
statement
Adds support for handling MySQL'sSHOW BINARY LOG STATUS
statement.
As with the other replication commands, we test the privilege checks in GMS and test the actual integration for data flowing through this statement in Dolt. -
2451: Bump google.golang.org/protobuf from 1.28.1 to 1.33.0
Bumps google.golang.org/protobuf from 1.28.1 to 1.33.0.
You can trigger a rebase of this PR by commenting@dependabot rebase
.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/dolthub/go-mysql-server/network/alerts). -
2449: Bug fix:
SHOW VARIABLES LIKE
should be case-insensitive
Bug fix to makeSHOW VARIABLES LIKE '...'
match MySQL's behavior by matching patterns case-insensitively.
Also includes changes to add the missingbinlog_format
system variable that some replication clients need to query, and initializes theserver_uuid
system variable. -
2448: use SubStatementStr field in DDL for getting sub statement in CREATE VIEW
This PR allows using defined string for sub statement instead of slicing the original query. If this field is empty, then slice the original query to get the sub statement. -
2446: support
DECLARE
inBEGIN...END BLOCK
inTRIGGER
This PR allows us to useDECLARE
statements inTRIGGERS
.
The analyzer ruleapplyTriggers
was altered to initializeProcedureReference
forTriggerBeginEndBlock
.
The important part was ensuring that all relevant nodes (TriggerBeingEndBlock
,BeginEndBlock
,ProcedureParam
) all had the sameProcedureReference
and to search in all the nodes.
Additionally,nil
guards are added to all receiver methods forProcedureReference
to prevent panics.
It seems like events might have this issue, but there's a banaid fix for it. Not sure if I want to touch that code.
fixes: dolthub/dolt#7720 -
2445: updating declare in trigger error to be more descriptive
-
2443: docs: very minor grammar fixes in README.md
Hi, I just wanted to fix a couple of minor grammatical errors in the README.md file. -
2442: prevent panic on triggers with declare statements
We're reusing a code from stored procedures to handle begin end blocks, but we're missing a lot of set up that prevents certain variables from being nil. Consequently, we panic in a couple places.
This PR fixes some of those panics, but reveals other problems we have in resolving/executing triggers of this format.
Partially addresses: dolthub/dolt#7720 -
2440: support
ALTER TABLE ... RENAME CONSTRAINT ...
for foreign key constraints
This PR adds support forALTER TABLE ... RENAME CONSTRAINT ...
for foreign key constraints.
This is a feature that is NOT supported in MySQL, but we're adding it to make it easier to resolve merge conflicts resulting from foreign key name collisions.
related: #2438 -
2439: For AutoIncrement lock modes other than "interleaved", hold the lock for the duration of the insert iter.
This is the GMS side of dolthub/dolt#7634
This PR changes the engine to make it acquire a lock (provided by the storage layer) wheninnodb_autoinc_lock_mode
is set to the "consecutive" or "traditional" values. More specifically, it calls a new function in theAutoIncrementSetter
interface which optionally acquires a lock and returns a callback for releasing the lock.
The in-memory DB doesn't have multithreading support, so when just using GMS, this is a no-op. A followup PR in Dolt will update its implementation ofAutoIncrementSetter
to handle the locking. -
2438: have generated index and foreign key names match mysql
changes:- auto-generated secondary index names don't match (after a conflict)
- we started at 0, while mysql starts at 2
- auto-generate foreign key names in a way that matches mysql
- in-memory used to just have empty string as the constraint name
we used to error when generating foreign keys over the same sets of columns, but not anymore
related: dolthub/dolt#7650
-
2429: server trace time includes parsing
vitess
- 390: Minor bug fixes for
caching_sha2_password
auth logic
For accounts without passwords, we need to account for the client sending the null byte when the server re-requests the client auth data, and then skip theAuthMoreDataPacket
, andCachingSha2FastAuth
packets. Otherwise themysql
client errors with "Malformed packet".
handleConnectionError
is used to report stats about failed connection attempts, but wasn't being called in the correct spot. The previous spot was over counting failed connection attempts, since it was called as part of the auth renegotiation flow. It has been moved to be called whenever we return an error or exit the function without a successful connection. - 389: syntax support for custom
explain plan
For debugging purposes, we replaced MySQL'sEXPLAIN
output with our very own.
Unfortunately, it is not causing problems, so we must move our syntax into its own thing. - 387: Add support for serializing/deserializing
caching_sha2_password
auth strings - 386: parse
io_threads
andsql_state
as no-ops
related: dolthub/dolt#8592 - 384: add mariadb table opts and tests
This PR adds support for parsing the mariadb table options listed here:
https://mariadb.com/kb/en/create-table/#table-options
They are all no-ops (much like many of the MySQL table options)
related: dolthub/dolt#8592 - 383: parse
start transaction with consistent
as no-op
related: dolthub/dolt#8592 - 382: Add syntax support to allow
CREATE USER
statements to quote the auth plugin
MySQL allows the auth plugin name to be quoted or unquoted, but our parser previously only supported the unquoted mode for most forms ofCREATE USER
. This change adds support for the following forms:CREATE USER user@localhost IDENTIFIED WITH 'auth_plugin';
CREATE USER user@localhost IDENTIFIED WITH 'auth_plugin' BY random password;
CREATE USER user@localhost IDENTIFIED WITH 'auth_plugin' BY 'password';
- 381: Add support for
show slave status
This syntax is deprecated and will be removed in future MySQL versions (SHOW REPLICA STATUS
should be preferred instead). However, some tools (e.g. Dolphie, MyDumper) still rely on this deprecated syntax, so we're adding support for it to keep compatibility with those tools. - 379: Refactor
AuthServer
to support multiple authentication methods
In preparation of supportingcaching_sha2_password
authentication, this change applies the refactoring from vitessio/vitess#8503 to our fork of Vitess so that we can more easily support multiple authentication methods.
Related to: dolthub/dolt#8496 - 378: support
set names binary
syntax (no quotes)
syntax for: dolthub/dolt#8574 - 377: Fixed CTE issue with auth
Fixes the test introduced in: - 375: support multiple like tables
- 374: USE Revisions use proper auth
- 373: Fixed printing bug in AliasedExpr node (was calling wrong print metho…
…d, which doesn't work with custom formatters) - 372: Embedded Authorization
- 371: support quoted character set values
This PR adds syntax support for quoted character set values.
fix: dolthub/dolt#8455 - 370: Collapse union types
Replace union type with one interface type. Static type access in reducer stack become runt-time interface conversions. The compiler builder loses the ability to do type checking at build time, so type safety has to be checked with testing.
Additional type enforcements are needed for nil-safety. Nil return values have to be typed correctly in the interface variable for casts up the stack to pass. Interface types do not have default nil values, so I've addedtryCastXXX
helper functions to accommodate untyped nils. - 369: Added schema name to show table opts
No way to populate this field from the parser directly, must be set manually when generating the AST via other means. - 367: Adding AST support for setting/dropping column attributes
AST support for altering type and not null constraint, without having to respecify the full column definition. Needed to support Postgres' more modularALTER TABLE
syntax. - 366: allow validate password variables
- 365: parse vector index syntax
Supports the following syntax:create table t (v blob, vector index(v))
create vector index vec on t(v)
alter table t add vector index vec on t(v)
- 363: support syntax for get_format
MySQL Docs: https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_get-format - 362: parse
SET
statements forLOAD INFILE
syntax for dolthub/dolt#8307 - 361: adding
instant
as non-reserved keyword
TheINSTANT
keyword isn't in the MySQL docs, but it is a non reserved keyword.
MySQL Docs: https://dev.mysql.com/doc/refman/8.4/en/keywords.html
fixes: dolthub/dolt#8220 - 360: Bug fix: Preserve sign for integers in prepared statements
Bound integer values for prepared statements are parsed from the wire and packaged intoint64
values that are then passed to the SQL engine to execute with the prepared statement. Forint8
,int16
,int24
, andint32
types those bytes from the wire weren't getting cast to the correct type first, before they were cast toint64
, which meant if the signed bit was set, the value was interpreted incorrectly.
Customer issue: dolthub/dolt#8085 - 359: fix detection of multi-statements in
ComPrepare
Currently, preparing multi-statements is not supported; so we can't prepare a query likeselect ?; select ?;
.
However, the check for this condition just looked for any characters after the first;
, which meant that queries likeselect ?; \n
would incorrectly throw an error.
This was made apparent using the Prisma ORM, which runs the query:The above query ends in a newline character.SELECT TABLE_NAME AS view_name, VIEW_DEFINITION AS view_sql FROM INFORMATION_SCHEMA.VIEWS WHERE TABLE_SCHEMA = ?;
The fix is to useSplitStatementToPieces()
, which trims these white space characters, and check if there's exactly one piece; this was taken from the vitessio repo: https://github.com/vitessio/vitess/blob/main/go/mysql/conn.go#L1204
fixes dolthub/dolt#8157 - 358: Feature: parser support for
PURGE BINARY LOGS
syntax
https://dev.mysql.com/doc/refman/8.4/en/purge-binary-logs.html - 357: Bug fix: Send an error response when the server fails to handle
COM_BINLOG_DUMP_GTID
A MySQL primary needs to be able to send back an error response when handling theCOM_BINLOG_DUMP_GTID
command. Previously, when the integrator returned an error, it was logged in the primary server logs, but it was not being sent back to the replica who sent the command. This change causes an error packet to be sent to the replica, containing the details of the error the integrator returned.
This change is difficult to test in isolation, but I have tests indolt
that will exercise this codepath. - 356: Bug fix: Off-by-one error when parsing multiple statements
An off-by-one error in multistatement parsing prevents us from parsing multistatements without a space between the delimiter and the next statement. For example: "SELECT 1;SELECT 2;" would previously be parsed as "SELECT 1;S" and "ELECT 2;".
Found while testing changes for dolthub/driver#28 - 355: New functions to create PreviousGtids events, and to update event checksum
- 354: support
VALUES
statement
This PR add syntax support forVALUES
statment as an alias forSELECT * FROM ...
.
We are still missingSELECT (VALUES ...)
(support for values as aselect_expression
).
syntax for dolthub/dolt#8012 - 353: allow backticks in system and user variables
This PR allows the use of backticks in system and user variables.
We are more lenient than MySQL when it comes to backticks in set statements.
For example, we allowset @abc.
def= 10
, while MySQL throws an error.
This is because we treat this as a qualified column identifer and automatically strip the backticks.
test bump #2548
fixes dolthub/dolt#8010 - 352: Add support for the
CONSTRAINT
keyword when adding a foreign key without a constraint name
Customer issue: dolthub/dolt#8008 - 351: Add
ConnectionAborted()
callback toHandler
interface
In order to support theAborted_connects
status variable, GMS needs to be notified when a connection attempt is aborted in the Vitess layer. This change adds aConnectionAborted()
callback method to Vitess'Handler
interface and calls it whenever a connection attempt errors out before it's fully established.
Coordinated with #2546 - 350: Refactoring
BinlogStream
type intoBinlogMetadata
Themysql.BinlogStream
type from Vitess was a little awkward to use, and seems to have been mostly intended as test code. This gives it a more descriptive name and makes it a little easier to pass around struct copies without concurrency issues from a shared instance. - 349: Fixed timestamp bindvar formatting to match MySQL string expectation
- 348: Allowing caching plugin to be specified in string quotes
TheCREATE USER ... IDENTIFIED WITH
syntax (MySQL ref) allows the caching plugin to be specified in string quotes, but our parser only supported identifier quotes.
This came up as part of binlog replication testing – MySQL was sending aCREATE USER
statement from the primary to a Dolt replica, but Dolt wasn't able to parse the statement because of the use of string quotes around the caching plugin name. - 347: Added InjectedStatement
This is the same asInjectedExpr
, except for statements instead of expressions. - 346: support
DATE
,TIME
, andTIMESTAMP
literal parsing
The SQL standard has special syntax for parsing date, time, and timestring literals.
https://dev.mysql.com/doc/refman/8.0/en/date-and-time-literals.html
This PR adds support for that.
Code was mostly taken from vitessio.
The types are still left as string types, as type conversion later on handles it just fine. - 345: parse type aliases in cast
add support for statements like:select cast(<str> as character)
select cast(<str> as double precision)
select cast(<str> as read)
- 344: make
row
optional in VALUES constructor and insert statement
This PR adds additional syntax support for VALUE constructor.
fixes dolthub/dolt#6849
fixes dolthub/dolt#7853 - 343: Exposing binary JSON encoding to SQL expression logic for testing
Exposing logic in Vitess that decodes MySQL's internal, binary JSON serialization encoding, so that we can use it to test JSON serialization. - 342: refactor and fully implement
alter table
syntax
MySQL Docs; https://dev.mysql.com/doc/refman/8.0/en/alter-table.html
fixes: #2472 - 340: improving partition parsing
partially addresses: #2472 - 339: capture select expressions that are sub statements within other expressions
- 338: Add a schema qualifier to table names
- 337: Consistently using pointer to AliasedValues in InsertRows interface, never values.
Once again, golang reminds me that a value type implementing an interface forces the pointer type to also implement the interface, and mixing the two messes up our runtime type reflection.
I changed all uses of the AliasedValues in InsertRows interface to be pointers so that we can interact with them consistently on the GMS side. - 336: Adding support for subtracting GTIDSets
Needed to support thegtid_subtract()
built-in function. Also exposes theParseMysql56GTIDSet
function so that we can parse GTIDSet strings from GMS.
Related GMS PR: #2455 - 334: Parser support for
FLUSH TABLES
statement
Adds parser support for theFLUSH TABLES <tablename_list> [WITH READ LOCK]
statement.
Also addsSHOW MASTER STATUS
as an alias forSHOW BINARY LOG STATUS
for compatibility with clients that use the older syntax (e.g. Debezium). - 333: Add support for parsing row and column aliases on insert statements.
Related to dolthub/dolt#7638
This allows us to parse statements of the form:
INSERT INTO t VALUES (?, ?, ?) AS new ON DUPLICATE KEY v = new.v;
- 332: add SubStatementStr for DDL struct
- 331: add check option to create view
This PR adds optionalWITH CHECK OPTION
support forCREATE VIEW
- 330: support rename constraint syntax
This PR adds syntax support forALTER TABLE ... RENAME CONSTRAINT [FOREIGN KEY / CHECK]...
for foreign key constraints. - 329: Changes to binlog event creation functions
- Exposing the
Length()
function in the BinlogEvent interface so calling code can access the event size present in a binlog event's header. Needed for calculating binlog file position. - Renaming
FakeBinlogStream
→BinlogStream
so calling code can use it when serializing binlog events.
- Exposing the
Closed Issues
- 2740: Anomaly of account replication from MySQL instance
- 2701: lowercase sql mode values are not supported
- 2671: setting two foreign keys to associate with the same field, deleting data is normal but does not take effect
- 2666: STR_TO_DATE cannot parse "%Y%m%d" format
- 2501: Problems with MySQL Connector/NET (Mysql.Data) and go-mysql-server
- 2503: "ON UPDATE CURRENT_TIMESTAMP" not come into effect
- 2472: Parser support for
PARTITION BY
- 2447: Any Benchmarks available
- 2364: Panic when executing sql statements from external command line tool after startup