Archive for category MySQL
Convert a MySQL DB from latin1 to UTF8
For work, I have to convert my very old database from latin1 based( with big5 in it ) to UTF8.
My Enviornment:
- OS: Gentoo
- Python: Python 2.4.3
- MySQL: MySQL 5.0.70-log Gentoo Linux mysql-5.0.70-r1
- Old encoding: latin1(big5)
- Try to convert to: UTF8
And here’s my solution, and memo.
Assume your private parameters as following:
- Your old db name: your_old_db
- Your new db name: your_new_db
- Your mysql account: your_db_acc
- Your mysql server’s ip or dn: your_db_host
mysqldump -u wwwuser -p –default-character-set=latin1 your_old_db > output.sql
# — fix origin output.sql’s big5 problem here ( do some regex replacement, it depends ) —
# convert the "real storage encoding" from "big5" to utf8, latin1 is just a tag note in db
piconv -f big5 -t utf8 output.sql > utf8.sql
# replace all charset and collate in sql file, and delete the line which include "SET NAMES latin1"
# for matching, the longer the better ex: match from "type=Mxxx….", you could do this from the beginning of each single line
sed -e ‘s/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/g’ -e ‘s/CHARSET=gbk/CHARSET=utf8/g’ -e ‘s/ collate gbk_bin//g’ -e ‘s/ COLLATE=gbk_bin//g’ -e ‘/SET NAMES latin1/d’ utf8.sql > utf8_after_sed.sql
# echo SET NAMES utf8 to a new file named tmp.sql
echo "SET NAMES utf8;" > tmp.sql
# merge the two files
cat utf8_after_sed.sql >> tmp.sql
# rename the file to utf_final.sql, just a clarify
mv tmp.sql utf8_final.sql
# import to a new unicode based db, we’re done here
mysql -h your_db_host -u your_db_acc -p –default_character_set utf8 your_new_db < utf8_ultimate.sql
Thanks sooooooo a lot for Tib‘s kindly provide his consultation.
Recent Comments