Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changing ENUM value corrupts row using changed valued #920

Open
bolt-leandro opened this issue Jan 28, 2021 · 6 comments
Open

Changing ENUM value corrupts row using changed valued #920

bolt-leandro opened this issue Jan 28, 2021 · 6 comments

Comments

@bolt-leandro
Copy link

Hello!

Attempting to change ENUM values results in wrong data for the value changed. Test case is below. Feel free to ask for any more information you might need.

master [localhost] {msandbox} (test) > SHOW VARIABLES LIKE '%version%';
+-------------------------+----------------------------------------------------+
| Variable_name           | Value                                              |
+-------------------------+----------------------------------------------------+
| innodb_version          | 5.7.32-35                                          |
| protocol_version        | 10                                                 |
| slave_type_conversions  |                                                    |
| tls_version             | TLSv1,TLSv1.1,TLSv1.2                              |
| version                 | 5.7.32-35-log                                      |
| version_comment         | Percona Server (GPL), Release 35, Revision 5688520 |
| version_compile_machine | x86_64                                             |
| version_compile_os      | Linux                                              |
| version_suffix          | -log                                               |
+-------------------------+----------------------------------------------------+

master [localhost] {msandbox} (test) > SHOW CREATE TABLE grades\G
*************************** 1. row ***************************
       Table: grades
Create Table: CREATE TABLE `grades` (
  `id` int(11) NOT NULL,
  `Name` varchar(255) NOT NULL,
  `Result` enum('Pass','Fail') NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

master [localhost] {msandbox} (test) > INSERT INTO grades(id, name, result) VALUES(101,'Aarav','Pass');
Query OK, 1 row affected (0.01 sec)

master [localhost] {msandbox} (test) > INSERT INTO grades(id, name, result) VALUES(102,'Yashraj','Fail');
Query OK, 1 row affected (0.01 sec)

master [localhost] {msandbox} (test) > SELECT * FROM grades;
+-----+---------+--------+
| id  | Name    | Result |
+-----+---------+--------+
| 101 | Aarav   | Pass   |
| 102 | Yashraj | Fail   |
+-----+---------+--------+
2 rows in set (0.00 sec)

HERE WE RUN GH-OST (more verbose logs below)

--alter="MODIFY COLUMN Result enum('Pass','MostExcellent') NOT NULL" \

NOW WE CHECK RESULTS AFTER CUTOVER AND SEE SOMETHING IS WRONG WITH OUR ENUM

master [localhost] {msandbox} (test) > SELECT * FROM grades;
+-----+---------+--------+
| id  | Name    | Result |
+-----+---------+--------+
| 101 | Aarav   | Pass   |
| 102 | Yashraj |        |
+-----+---------+--------+
2 rows in set (0.00 sec)

master [localhost] {msandbox} (test) > SELECT * FROM _grades_del;
+-----+---------+--------+
| id  | Name    | Result |
+-----+---------+--------+
| 101 | Aarav   | Pass   |
| 102 | Yashraj | Fail   |
+-----+---------+--------+
2 rows in set (0.00 sec)

GH-OST COMMAND DETAILS (ALSO TRIED 1.1.0 AND GOT SAME RESULTS)

root@ubuntu-bionic:~# gh-ost --version
1.0.49

root@ubuntu-bionic:~# cat grades.sh
gh-ost \
--max-load=Threads_running=75 \
--critical-load=Threads_running=1000 \
--critical-load-interval-millis=10000 \
--critical-load-hibernate-seconds=300 \
--chunk-size=1000 \
--max-lag-millis=1000 \
--user="msandbox" \
--password="msandbox" \
--host=127.0.0.1 \
--port=23209 \
--database="test" \
--table="grades" \
--verbose \
--alter="MODIFY COLUMN Result enum('Pass','MostExcellent') NOT NULL" \
--cut-over=default \
--default-retries=120 \
--panic-flag-file=/tmp/ghost.panic.flag \
--initially-drop-ghost-table \
--initially-drop-old-table \
--exact-rowcount \
--cut-over-lock-timeout-seconds=10 \
--hooks-path=/root/ghost_hooks/ \
--execute

root@ubuntu-bionic:~# sh -x grades.sh
+ gh-ost --max-load=Threads_running=75 --critical-load=Threads_running=1000 --critical-load-interval-millis=10000 --critical-load-hibernate-seconds=300 --chunk-size=1000 --max-lag-millis=1000 --user=msandbox --password=msandbox --host=127.0.0.1 --port=23209 --database=test --table=grades --verbose --alter=MODIFY COLUMN Result enum('Pass','MostExcellent') NOT NULL --cut-over=default --default-retries=120 --panic-flag-file=/tmp/ghost.panic.flag --initially-drop-ghost-table --initially-drop-old-table --exact-rowcount --cut-over-lock-timeout-seconds=10 --hooks-path=/root/ghost_hooks/ --execute
2021-01-28 12:11:20 INFO starting gh-ost 1.0.49
2021-01-28 12:11:20 INFO Migrating `test`.`grades`
2021-01-28 12:11:20 INFO connection validated on 127.0.0.1:23209
2021-01-28 12:11:20 INFO User has ALL privileges
2021-01-28 12:11:20 INFO binary logs validated on 127.0.0.1:23209
2021-01-28 12:11:20 INFO Restarting replication on 127.0.0.1:23209 to make sure binlog settings apply to replication thread
2021-01-28 12:11:20 INFO Inspector initiated on ubuntu-bionic:23209, version 5.7.32-35-log
2021-01-28 12:11:20 INFO Table found. Engine=InnoDB
2021-01-28 12:11:20 INFO Estimated number of rows via EXPLAIN: 2
2021-01-28 12:11:20 INFO Recursively searching for replication master
2021-01-28 12:11:20 INFO Master found to be 127.0.0.1:23207
2021-01-28 12:11:20 INFO log_slave_updates validated on 127.0.0.1:23209
2021-01-28 12:11:20 INFO connection validated on 127.0.0.1:23209
2021-01-28 12:11:20 INFO Connecting binlog streamer at mysql-bin.000001:1535583
[2021/01/28 12:11:20] [info] binlogsyncer.go:133 create BinlogSyncer with config {99999 mysql 127.0.0.1 23209 msandbox    false false <nil> false UTC true 0 0s 0s 0 false}
[2021/01/28 12:11:20] [info] binlogsyncer.go:354 begin to sync binlog from position (mysql-bin.000001, 1535583)
[2021/01/28 12:11:20] [info] binlogsyncer.go:203 register slave for master server 127.0.0.1:23209
2021-01-28 12:11:20 INFO rotate to next log from mysql-bin.000001:0 to mysql-bin.000001
[2021/01/28 12:11:20] [info] binlogsyncer.go:723 rotate to (mysql-bin.000001, 1535583)
2021-01-28 12:11:20 INFO connection validated on 127.0.0.1:23207
2021-01-28 12:11:20 INFO connection validated on 127.0.0.1:23207
2021-01-28 12:11:20 INFO will use time_zone='SYSTEM' on applier
2021-01-28 12:11:20 INFO Examining table structure on applier
2021-01-28 12:11:20 INFO Applier initiated on ubuntu-bionic:23207, version 5.7.32-35-log
2021-01-28 12:11:20 INFO Dropping table `test`.`_grades_gho`
2021-01-28 12:11:20 INFO Table dropped
2021-01-28 12:11:20 INFO Dropping table `test`.`_grades_del`
2021-01-28 12:11:20 INFO Table dropped
2021-01-28 12:11:20 INFO Dropping table `test`.`_grades_ghc`
2021-01-28 12:11:20 INFO Table dropped
2021-01-28 12:11:20 INFO Creating changelog table `test`.`_grades_ghc`
2021-01-28 12:11:20 INFO Changelog table created
2021-01-28 12:11:20 INFO Creating ghost table `test`.`_grades_gho`
2021-01-28 12:11:21 INFO Ghost table created
2021-01-28 12:11:21 INFO Altering ghost table `test`.`_grades_gho`
2021-01-28 12:11:21 INFO Ghost table altered
2021-01-28 12:11:21 INFO Waiting for ghost table to be migrated. Current lag is 0s
2021-01-28 12:11:22 INFO Intercepted changelog state GhostTableMigrated
2021-01-28 12:11:22 INFO Handled changelog state GhostTableMigrated
2021-01-28 12:11:22 INFO Chosen shared unique key is PRIMARY
2021-01-28 12:11:22 INFO Shared columns are id,Name,Result
2021-01-28 12:11:22 INFO Listening on unix socket file: /tmp/gh-ost.test.grades.sock
2021-01-28 12:11:22 INFO As instructed, counting rows in the background; meanwhile I will use an estimated count, and will update it later on
2021-01-28 12:11:22 INFO As instructed, I'm issuing a SELECT COUNT(*) on the table. This may take a while
2021-01-28 12:11:22 INFO Migration min values: [101]
2021-01-28 12:11:22 INFO Exact number of rows via COUNT: 2
2021-01-28 12:11:22 INFO Migration max values: [102]
2021-01-28 12:11:22 INFO Waiting for first throttle metrics to be collected
2021-01-28 12:11:22 ERROR parsing time "" as "2006-01-02T15:04:05.999999999Z07:00": cannot parse "" as "2006"
2021-01-28 12:11:22 INFO First throttle metrics collected
# Migrating `test`.`grades`; Ghost table is `test`.`_grades_gho`
# Migrating ubuntu-bionic:23207; inspecting ubuntu-bionic:23209; executing on ubuntu-bionic
# Migration started at Thu Jan 28 12:11:20 +0000 2021
# chunk-size: 1000; max-lag-millis: 1000ms; dml-batch-size: 10; max-load: Threads_running=75; critical-load: Threads_running=1000; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/gh-ost.test.grades.sock
Copy: 0/2 0.0%; Applied: 0; Backlog: 0/1000; Time: 2s(total), 0s(copy); streamer: mysql-bin.000001:1537706; Lag: 0.00s, State: migrating; ETA: N/A
2021-01-28 12:11:23 INFO Row copy complete
Copy: 0/2 0.0%; Applied: 0; Backlog: 0/1000; Time: 3s(total), 1s(copy); streamer: mysql-bin.000001:1542129; Lag: 0.09s, State: migrating; ETA: N/A
Copy: 2/2 100.0%; Applied: 0; Backlog: 0/1000; Time: 3s(total), 1s(copy); streamer: mysql-bin.000001:1542129; Lag: 0.09s, State: migrating; ETA: due
2021-01-28 12:11:23 INFO Grabbing voluntary lock: gh-ost.64.lock
2021-01-28 12:11:23 INFO Setting LOCK timeout as 20 seconds
2021-01-28 12:11:23 INFO Looking for magic cut-over table
2021-01-28 12:11:23 INFO Creating magic cut-over table `test`.`_grades_del`
2021-01-28 12:11:23 INFO Magic cut-over table created
2021-01-28 12:11:23 INFO Locking `test`.`grades`, `test`.`_grades_del`
2021-01-28 12:11:23 INFO Tables locked
2021-01-28 12:11:23 INFO Session locking original & magic tables is 64
2021-01-28 12:11:23 INFO Writing changelog state: AllEventsUpToLockProcessed:1611835883403923151
2021-01-28 12:11:23 INFO Waiting for events up to lock
2021-01-28 12:11:23 INFO Intercepted changelog state AllEventsUpToLockProcessed
2021-01-28 12:11:23 INFO Handled changelog state AllEventsUpToLockProcessed
2021-01-28 12:11:24 INFO Waiting for events up to lock: got AllEventsUpToLockProcessed:1611835883403923151
2021-01-28 12:11:24 INFO Done waiting for events up to lock; duration=732.651539ms
# Migrating `test`.`grades`; Ghost table is `test`.`_grades_gho`
# Migrating ubuntu-bionic:23207; inspecting ubuntu-bionic:23209; executing on ubuntu-bionic
# Migration started at Thu Jan 28 12:11:20 +0000 2021
# chunk-size: 1000; max-lag-millis: 1000ms; dml-batch-size: 10; max-load: Threads_running=75; critical-load: Threads_running=1000; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/gh-ost.test.grades.sock
Copy: 2/2 100.0%; Applied: 0; Backlog: 1/1000; Time: 4s(total), 1s(copy); streamer: mysql-bin.000001:1547960; Lag: 0.08s, State: migrating; ETA: due
Copy: 2/2 100.0%; Applied: 0; Backlog: 0/1000; Time: 4s(total), 1s(copy); streamer: mysql-bin.000001:1547960; Lag: 0.08s, State: migrating; ETA: due
2021-01-28 12:11:24 INFO Setting RENAME timeout as 10 seconds
2021-01-28 12:11:24 INFO Session renaming tables is 63
2021-01-28 12:11:24 INFO Issuing and expecting this to block: rename /* gh-ost */ table `test`.`grades` to `test`.`_grades_del`, `test`.`_grades_gho` to `test`.`grades`
2021-01-28 12:11:24 INFO Found atomic RENAME to be blocking, as expected. Double checking the lock is still in place (though I don't strictly have to)
2021-01-28 12:11:24 INFO Checking session lock: gh-ost.64.lock
2021-01-28 12:11:24 INFO Connection holding lock on original table still exists
2021-01-28 12:11:24 INFO Will now proceed to drop magic table and unlock tables
2021-01-28 12:11:24 INFO Dropping magic cut-over table
2021-01-28 12:11:24 INFO Releasing lock from `test`.`grades`, `test`.`_grades_del`
2021-01-28 12:11:24 INFO Tables unlocked
2021-01-28 12:11:24 INFO Tables renamed
2021-01-28 12:11:24 INFO Lock & rename duration: 894.479131ms. During this time, queries on `grades` were blocked
2021-01-28 12:11:24 INFO Looking for magic cut-over table
[2021/01/28 12:11:24] [info] binlogsyncer.go:164 syncer is closing...
[2021/01/28 12:11:24] [error] binlogstreamer.go:77 close sync with err: sync is been closing...
2021-01-28 12:11:24 INFO Closed streamer connection. err=<nil>
2021-01-28 12:11:24 INFO Dropping table `test`.`_grades_ghc`
[2021/01/28 12:11:24] [info] binlogsyncer.go:179 syncer is closed
2021-01-28 12:11:24 ERROR Error 1146: Table 'test._grades_ghc' doesn't exist
2021-01-28 12:11:24 INFO Table dropped
2021-01-28 12:11:24 INFO Am not dropping old table because I want this operation to be as live as possible. If you insist I should do it, please add `--ok-to-drop-table` next time. But I prefer you do not. To drop the old table, issue:
2021-01-28 12:11:24 INFO -- drop table `test`.`_grades_del`
2021-01-28 12:11:24 INFO Done migrating `test`.`grades`
2021-01-28 12:11:24 INFO Removing socket file: /tmp/gh-ost.test.grades.sock
2021-01-28 12:11:24 INFO Tearing down inspector
2021-01-28 12:11:24 INFO Tearing down applier
2021-01-28 12:11:24 INFO Tearing down streamer
2021-01-28 12:11:24 INFO Tearing down throttler
# Done
@timvaillancourt
Copy link
Collaborator

@bolt-leandro this should be resolved by #963, which is currently on master

Can you test with master and confirm? Thanks

@bolt-leandro
Copy link
Author

@timvaillancourt I tested again with version 1.1.4 (sorry about late reply, missed notification) and still getting the same behaviour. Changing the ENUM values results in empty enum value both on 5.7.32-35 and 8.0.22-13 (attached similar results)
gh-ost_920_1.1.4.txt

@timvaillancourt
Copy link
Collaborator

timvaillancourt commented Sep 7, 2022

@bolt-leandro thanks, I've started to create a localtest that reproduces this in #1177:

Testing: enum-modified......ERROR enum-modified: checksum mismatch
---
2c2
< 2	8	Fail
---
> 2	8	
diff /tmp/gh-ost-test.orig.content.csv /tmp/gh-ost-test.ghost.content.csv

I'll take a stab at fixing this whenever I can find more time 👍

I just realized that one of the reasons this may not be reported already is changing existing enum columns could break a live-running application INSERT/UPDATE queries that rely on enum values by "name" (vs numeric index)

I'm curious how your app will handle that in this use case? In your example above, I'd expect writes that set Fail would suddenly throw errors post-cutover

@disser4
Copy link

disser4 commented Oct 19, 2022

We got the same error as well.

@disser4
Copy link

disser4 commented Oct 19, 2022

2022-10-13 04:47:38 FATAL Error 1048: Column 'payment_type' cannot be null; query=
update /* gh-ost customcheckout._purchase_gho */
customcheckout.`_purchase_gho

@bolt-leandro
Copy link
Author

Hi @timvaillancourt

I'm curious how your app will handle that in this use case? In your example above, I'd expect writes that set Fail would suddenly throw errors post-cutover

The app needs to stop issuing writes that set Fail and only use Pass and MostExcellent.

Some insight on the problem. If doing a native ALTER, it will refuse to change the tbale if it needs to truncate existing Fail values.
I suppose gh-ost could do a similar thing, if it find Fails during copy or in replication, it would bail out. It's better than silent data corruption.

mysql> SELECT * FROM grades;
+-----+---------+--------+
| id  | Name    | Result |
+-----+---------+--------+
| 101 | Aarav   | Pass   |
| 102 | Yashraj | Fail   |
+-----+---------+--------+
2 rows in set (0.00 sec)

mysql> ALTER TABLE grades MODIFY COLUMN Result enum('Pass','MostExcellent') NOT NULL;
ERROR 1265 (01000): Data truncated for column 'Result' at row 2

In MySQL, removing the Fail enum is possible if there are no rows with such column:

mysql> ALTER TABLE grades MODIFY COLUMN Result enum('Pass','Fail','MostExcellent') NOT NULL;
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM grades;
+-----+---------+--------+
| id  | Name    | Result |
+-----+---------+--------+
| 101 | Aarav   | Pass   |
| 102 | Yashraj | Fail   |
+-----+---------+--------+
2 rows in set (0.00 sec)

mysql> ALTER TABLE grades MODIFY COLUMN Result enum('Pass','MostExcellent') NOT NULL;
ERROR 1265 (01000): Data truncated for column 'Result' at row 2
mysql> update grades set Result='MostExcellent' WHERE Result='Fail';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> SELECT * FROM grades;
+-----+---------+---------------+
| id  | Name    | Result        |
+-----+---------+---------------+
| 101 | Aarav   | Pass          |
| 102 | Yashraj | MostExcellent |
+-----+---------+---------------+
2 rows in set (0.00 sec)

mysql> ALTER TABLE grades MODIFY COLUMN Result enum('Pass','MostExcellent') NOT NULL;
Query OK, 2 rows affected (0.02 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM grades;
+-----+---------+---------------+
| id  | Name    | Result        |
+-----+---------+---------------+
| 101 | Aarav   | Pass          |
| 102 | Yashraj | MostExcellent |
+-----+---------+---------------+
2 rows in set (0.00 sec)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants