About this user

Filip Hajny http://www.textdrive.com/


Convert a db to UTF8 after upgrading to MySQL 4.1

If you've ever run a UTF8 application on a pre-4.1 MySQL server, or simply never cared about encodings on a 4.1 setup, you may have a non-UTF8 (typically latin1) database that actually contains UTF8 data. This doesn't bother most applications (e.g. PHP weblogs), but it's not clean, and sorting won't work properly for non-Western characters. This procedure will fix it:

mysqldump --user=username --password=password --default-character-set=latin1 --skip-set-charset dbname > dump.sql
chgrep latin1 utf8 dump.sql
mysql --user=username --password=password --execute="DROP DATABASE dbname; CREATE DATABASE dbname CHARACTER SET utf8 COLLATE utf8_general_ci;"
mysql --user=username --password=password --default-character-set=utf8 dbname < dump.sql
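
Since the second step drops the database, it may be worth checking first that the dump really is clean UTF8. The following is only a sketch: the helper name is_valid_utf8 is made up for illustration, and it assumes an iconv binary is available on your system. A dump containing binary (BLOB) columns may fail this check even when the text data is fine.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original procedure):
# returns success only if the file decodes cleanly as UTF-8.
is_valid_utf8() {
    # iconv exits non-zero when it hits an invalid byte sequence
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}
```

You could run it between the dump and the DROP DATABASE step, e.g. `is_valid_utf8 dump.sql || echo "dump.sql is not clean UTF8, stop here"`.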


The chgrep step is important because the table definitions in your dump will most likely still say "latin1". If you don't have chgrep, any editor with search-and-replace will do, as long as it opens and saves UTF8 correctly. Edit: Instead of 'chgrep', you can use 'sed' (this is the BSD form; GNU sed takes -i without the empty "" argument), e.g.:

sed -i "" 's/latin1/utf8/g' dump.sql
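
Note that a blanket replace also rewrites the word "latin1" wherever it happens to occur in your row data. If that worries you, a narrower variant could touch only the charset declarations. This is a sketch under assumptions: the helper name convert_charset_decls is hypothetical, and the two patterns are a guess at how the declarations look in your dump, so check with `grep -n latin1 dump.sql` first and adjust as needed.

```shell
#!/bin/sh
# Hypothetical helper: rewrite only charset declarations, leave other text alone.
# Reads the dump named in $1 and writes the converted dump to stdout.
convert_charset_decls() {
    # Assumed patterns: table option "CHARSET=latin1" and
    # column/database clause "CHARACTER SET latin1"
    sed -e 's/CHARSET=latin1/CHARSET=utf8/g' \
        -e 's/CHARACTER SET latin1/CHARACTER SET utf8/g' \
        "$1"
}
```

Usage would be `convert_charset_decls dump.sql > dump.utf8.sql`, then import dump.utf8.sql instead of dump.sql. Writing to a new file also sidesteps the sed -i difference between BSD and GNU.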


Alternatively, you may add "--skip-create-options" to the mysqldump command, but that could drop options you need (e.g. PACK_KEYS=1 etc.).

You may change the utf8_general_ci collation to whatever you need, e.g. utf8_czech_ci for my purposes.
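
If you repeat this for several databases with different collations, the DROP/CREATE statement can be generated rather than typed. A minimal sketch; the function name recreate_db_sql is hypothetical, and it simply prints the same SQL as the second step above with the collation made a parameter (defaulting to utf8_general_ci).

```shell
#!/bin/sh
# Hypothetical helper: print the DROP/CREATE statements for a database,
# using the collation given as $2 (utf8_general_ci when omitted).
recreate_db_sql() {
    db=$1
    collation=${2:-utf8_general_ci}
    printf 'DROP DATABASE %s; CREATE DATABASE %s CHARACTER SET utf8 COLLATE %s;\n' \
        "$db" "$db" "$collation"
}
```

For example: `mysql --user=username --password=password --execute="$(recreate_db_sql dbname utf8_czech_ci)"`.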

Edit: Fixed the typo as per the comment below.