Exception in upgrade from Drupal if locales tables contain special chars in blob columns #7059

@indigoxela

Description of the bug

The upgrade code that converts the table columns locales_source.source and locales_target.translation assumes that the utf8mb4 conversion has already happened.

In Drupal 7, translations could contain any character, because these columns are BLOB. Backdrop converts them to TEXT, though. When that conversion runs during an upgrade, the database might not yet be able to store such characters, which can lead to database exceptions.

Steps To Reproduce

  1. Prepare a multilingual D7 site for upgrade.
  2. Prepare an uninstalled Backdrop site for import.
  3. In Drupal, translate something using a character that utf8mb3 cannot store. In my testing I translated "Home" to "Home 😄", which works just fine in D7.
  4. Get a full dump of that D7 site.
  5. Import the D7 database dump into the Backdrop database.
  6. Run core/update.php.
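
To check in advance whether a site's translations contain such characters, something like the following could help. This is only a minimal sketch: the function names `needs_utf8mb4` and `find_problem_strings` are made up for illustration, and it assumes the strings have already been exported from the D7 tables by some other means. It relies on the fact that utf8mb3 stores at most 3 bytes per character, so any code point above U+FFFF will not fit.

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if text contains a character that takes 4 bytes in UTF-8."""
    # Code points above U+FFFF (outside the Basic Multilingual Plane) encode
    # to 4 bytes in UTF-8, which MySQL's utf8mb3 character set cannot store.
    return any(ord(ch) > 0xFFFF for ch in text)


def find_problem_strings(translations):
    """Return the strings from an iterable that would need utf8mb4."""
    return [t for t in translations if needs_utf8mb4(t)]


if __name__ == "__main__":
    # Hypothetical sample data; "\U0001F604" is the smiley from the steps above.
    samples = ["Home", "Startseite", "Home \U0001F604"]
    for s in find_problem_strings(samples):
        print("needs utf8mb4:", s)
```

Accented Latin characters and most CJK text fit into 3 bytes, so they pass this check; emoji and other supplementary-plane characters do not.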

Actual behavior

The updates seem to run, but one of the "messages" is:

Failed: PDOException: SQLSTATE[22007]: Invalid datetime format: 1366 Incorrect string value: '\xF0\x9F\x98\x81' for column `bd_testing`.`locales_target`.`translation` at row 19: ALTER TABLE {locales_target} CHANGE `translation` `translation` TEXT NOT NULL COMMENT 'Translation string value in this language.'; Array ( ) in db_change_field() (Zeile 3344 von /var/www/testing/html/core/includes/database/database.inc).

That's the smiley.
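
For anyone who wants to verify what the escaped bytes in such a message stand for, here is a small sketch. The helper name `decode_escaped_bytes` is hypothetical, and it assumes the quoted value consists only of `\xNN` escapes, as in the message above.

```python
def decode_escaped_bytes(escaped: str) -> str:
    r"""Turn a string like '\xF0\x9F\x98\x81' into the text it encodes."""
    # Strip the "\x" prefixes so only the hex digits remain, then decode
    # the resulting bytes as UTF-8.
    raw = bytes.fromhex(escaped.replace("\\x", ""))
    return raw.decode("utf-8")


# The value quoted in the PDOException above.
char = decode_escaped_bytes(r"\xF0\x9F\x98\x81")
print(f"U+{ord(char):04X}", len(char.encode("utf-8")), "bytes")
# prints "U+1F601 4 bytes"
```

The code point lies above U+FFFF and takes 4 bytes in UTF-8, which is exactly what a utf8mb3 column rejects.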

As a result, the update hook locale_update_1005() fails partway through. Other updates continue, though.
To confirm that locale "isn't done" yet, log in after the upgrade and visit the status page:
"locale (1004 → 1005)" shows as pending.

Running the pending update at this point still throws the same exception.

If you now run the utf8mb4 conversion, and run the pending update after that, things will work just fine - the character doesn't cause an exception, the update hook succeeds.

Expected behavior

The upgrade runs smoothly without multiple attempts.

Additional information

  • Backdrop CMS version: problem very likely always existed, but only affects specific sites
  • Web server and its version: any
  • PHP version: any
  • Database server (MySQL or MariaDB?) and its version: apparently MariaDB and MySQL

This problem was discovered with i18n_string, where it broke the upgrade entirely. A fix for that module is prepared, but the core problem needs a fix, too.

Although a smiley is quite unlikely to appear in a translation, other affected characters are (obviously) more likely to be used in translations (or in translation sources in custom code).

I only tested the problem with a D7 database collation of utf8mb3_general_ci (note the "3"). I'm assuming the problem strikes a lot harder with other (older) default collations.
