If you do a lot of exporting and importing to different database servers, you’ll be familiar with the frustration of encountering MySQL import errors. Every so often when importing a WordPress dump file into a client’s database, I will encounter an Unknown collation error like the following:
Unknown collation: 'utf8mb4_unicode_ci'
Sometimes it will come up as:
Unknown collation: 'utf8mb4_unicode_520_ci'
This is caused by a difference in encoding types between the source and destination databases. It usually happens when you export from a newer MySQL database (MySQL 5.5.3 and above) which uses
utf8mb4, then attempt to import into an older version using
utf8. If you are importing from a dump file generated from a MySQL 5.6 database, you may get the utf8mb4_unicode_520_ci message. The 520 refers to MySQL’s use of Unicode Collation Algorithm 5.2.0. Unknown collation errors may also happen if you are trying to import a MariaDB database into MySQL. I tend to get unknown collation errors with my Rackspace Cloud accounts after Rackspace started offering MariaDB as a database option.
Ideally one would upgrade the older destination database but this isn’t always a realistic option. There are a number of discussion threads on the WordPress forum about what to do. Fortunately, many web hosting accounts have a phpMyAdmin interface which provides an easy work-around for the problem.
Format-specific options during a phpMyAdmin database export
- Log in to your database server using phpMyAdmin
- Make sure you select your database and go to the “Export” tab
- Select the “Custom” radio button
- Go the section “Format-specific options” and in the setting for “Database system or older MySQL server to maximize output compatibility with:” select MYSQL40.
- Scroll to the bottom and click GO.
Other possible solutions include:
- doing a search and replace on the database dump file;
- converting the character set to
- converting the database for full Unicode support.
Since many of my WordPress database migrations are under my migration service, I don’t always have control over the client’s platform. The phpMyAdmin export format method is often the simplest solution.
Side-effects of a character encoding downgrade
You might be wondering about the purpose of encoding types and if there will be any side-effects of downgrading. Character encoding allows support for a set of characters, such as the Western alphabet, Asian scripts and non-alphanumeric symbols. Older
utf8 databases support a smaller set of characters whereas
utf8mb4 includes emojis, musical notation and Chinese Han characters. If you’ve ever exported a website from one CMS to another and found random characters scattered throughout the copy, it’s because of an incompatible character encoding.
Solving the unknown collation error as described here could mean you’ll end up with unsupported characters after your site migration. However, as with many of my Drupal to WordPress migration clients, in all likelihood you’ll be migrating from an older
utf8 Drupal database to a newer
utf8mb4-supported WordPress database. In this case, your old content will not have characters that will cause a problem after an encoding downgrade.