-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Open
Description
Overview of the Issue
There's quite a few queries where we join vdiff
and vdiff_tables
using a condition like vdiff.id = vdiff_tablet.vdiff_id
.
Unfortunately, the data types between these two columns does not match (taken from release-21.0
, but this is similar on main
):
CREATE TABLE IF NOT EXISTS vdiff
(
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`vdiff_uuid` varchar(64) NOT NULL,
`workflow` varbinary(1024) DEFAULT NULL,
`keyspace` varbinary(256) DEFAULT NULL,
`shard` varchar(255) NOT NULL,
`db_name` varbinary(1024) DEFAULT NULL,
`state` varbinary(64) DEFAULT NULL,
`options` json DEFAULT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`started_at` timestamp NULL DEFAULT NULL,
`liveness_timestamp` timestamp NULL DEFAULT NULL,
`completed_at` timestamp NULL DEFAULT NULL,
`last_error` varbinary(1024) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `uuid_idx` (`vdiff_uuid`),
KEY `state` (`state`),
KEY `ks_wf_idx` (`keyspace`(64), `workflow`(64))
) ENGINE = InnoDB CHARSET = utf8mb4
CREATE TABLE IF NOT EXISTS vdiff_table
(
`vdiff_id` varchar(64) NOT NULL,
`table_name` varbinary(128) NOT NULL,
`state` varbinary(64) DEFAULT NULL,
`lastpk` varbinary(2000) DEFAULT NULL,
`table_rows` bigint(20) NOT NULL DEFAULT '0',
`rows_compared` bigint(20) NOT NULL DEFAULT '0',
`mismatch` tinyint(1) NOT NULL DEFAULT '0',
`report` json DEFAULT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`vdiff_id`, `table_name`)
) ENGINE = InnoDB CHARSET = utf8mb4
This means we end up joining a bigint
column against a varchar
column, which prevents MySQL from using an index to do the join, and we end up having to scan the full vdiff_table
contents to perform this join.
This causes performance issues, and some of these queries can even fail with errors like Out of sort memory, consider increasing server sort buffer size (errno 1038)
if there's additional ORDER BY
clauses on the queries.
Reproduction Steps
n/a
Binary Version
n/a
Operating System and Environment details
n/a
Log Fragments
n/a