MySQL – mysqldump dumps different data with and without --no-create-info

Tags: character-set, encoding, mysql, mysqldump, utf-8

mysqldump dumps different representations of data when called with/without --no-create-info.

Test case

First, create the test table and populate it with interesting data.

CREATE TABLE `test` (
  `value` longtext
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- insert 'crème'
insert into test values(UNHEX('6372C3A86D65'));

Then dump the data both with and without --no-create-info:

mysqldump test_db test > dump_test.sql
mysqldump --no-create-info test_db test > dump_test2.sql

These files encode crème differently:

  • dump_test.sql encodes it as 0x6372C3A86D65 (è is 0xC3A8)
  • dump_test2.sql encodes it as 0x63725CE86D65 (è is 0x5CE8)
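
To make the byte difference concrete, here is a small sketch that prints the bytes of crème in UTF-8 and in latin1. It only uses printf, od and tr; the variable names are made up for illustration, and latin1 is an assumed point of comparison, not something the dumps state.

```shell
# 'crème' in UTF-8: è is the two bytes 0xC3 0xA8 (octal 303 250).
utf8_hex=$(printf 'cr\303\250me' | od -An -tx1 | tr -d ' \n')

# 'crème' in latin1 (assumed for comparison): è is the single byte 0xE8 (octal 350).
latin1_hex=$(printf 'cr\350me' | od -An -tx1 | tr -d ' \n')

echo "utf8:   $utf8_hex"
echo "latin1: $latin1_hex"
```

0x5C is an ASCII backslash and 0xE8 is è in latin1, so the second dump looks as if è was written as a single escaped byte rather than as UTF-8. The files themselves do not say why.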

Restoring the dump from dump_test.sql works correctly; the complete row is restored:

mysql test_db < dump_test.sql

Restoring the dump from dump_test2.sql produces an incorrect result: the data is truncated to just cr (0x6372). Note that the structure of the table is the same:

mysql test_db < dump_test2.sql

Neither an error nor a warning is given. Apart from the encoding of crème, the two files differ only by:

-- Table structure for table `test`
--

DROP TABLE IF EXISTS `test`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!50503 SET character_set_client = utf8mb4 */;
CREATE TABLE `test` (
  `value` longtext
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
/*!40101 SET character_set_client = @saved_cs_client */;

--

Question

Why does this happen? Is it documented anywhere? How can I correctly restore a dump created with --no-create-info?

Additional info

# mysqldump used to dump the database
$ mysqldump -V
mysqldump  Ver 8.0.22-0ubuntu0.20.04.3 for Linux on x86_64 ((Ubuntu))
# Server
version_component: Percona Server (GPL), Release '35', Revision '5688520'
version: 5.7.32-35

Best Answer

Try dumping with --hex-blob:

mysqldump --hex-blob test_db test > dump_test.sql
mysqldump --hex-blob --no-create-info test_db test > dump_test2.sql

Please let us know if the hex value remains the same between the two dumps.
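
One way to check this without reading hex by eye is to compare only the INSERT statements of the two files. This is a rough sketch: same_inserts is a hypothetical helper name, and it assumes a grep that supports -a (GNU and BSD grep do).

```shell
# Compare only the INSERT statements of two dump files.
# Returns 0 (success) when those lines are byte-for-byte identical.
same_inserts() {
  # -a forces text mode so bytes that are invalid in the current locale
  # do not make grep report "Binary file matches".
  a=$(grep -a '^INSERT' "$1" | od -An -tx1 | tr -d ' \n')
  b=$(grep -a '^INSERT' "$2" | od -An -tx1 | tr -d ' \n')
  [ "$a" = "$b" ]
}
```

For example, same_inserts dump_test.sql dump_test2.sql && echo identical || echo different.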

UPDATE 2021-01-08 14:15 EST

Try changing the character set.

The MySQL docs say that mysqldump uses utf8 if no character set is specified:

Internationalization Options

The following options change how the mysqldump command represents character data with national language settings.

--character-sets-dir=dir_name

The directory where character sets are installed. See Section 10.15, “Character Set Configuration”.

--default-character-set=charset_name

Use charset_name as the default character set. See Section 10.15, “Character Set Configuration”. If no character set is specified, mysqldump uses utf8.

--no-set-names, -N

Turns off the --set-charset setting, the same as specifying --skip-set-charset.

--set-charset

Write SET NAMES default_character_set to the output. This option is enabled by default. To suppress the SET NAMES statement, use --skip-set-charset.

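Just add -N to leave the SET NAMES statement out of the dump, or use --default-character-set to explicitly name the character set. With --set-charset (on by default), that name is written into the dump as SET NAMES and so applies when the dump is loaded.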
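
As a sketch of pinning the character set on both sides, the following wraps the dump and restore commands in two small functions. --default-character-set is the option that takes a charset name in the documentation excerpt above, and the mysql client accepts the same option. The function names are hypothetical, utf8mb4 is an assumed choice based on the table definition, and I have not verified that this resolves the truncation in the question's setup.

```shell
# Assumed charset; taken from the table's DEFAULT CHARSET in the question.
CHARSET=utf8mb4

# Dump data only (no CREATE TABLE) with an explicit character set.
# $1 = database, $2 = table, $3 = output file
dump_data_only() {
  mysqldump --default-character-set="$CHARSET" --no-create-info "$1" "$2" > "$3"
}

# Restore a dump with the client forced to the same character set.
# $1 = database, $2 = dump file
restore_dump() {
  mysql --default-character-set="$CHARSET" "$1" < "$2"
}
```

With the names from the question, that would be dump_data_only test_db test dump_test2.sql followed by restore_dump test_db dump_test2.sql.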