MySQL: Set UTF-8 Throughout

Submitted by nigel on Saturday 7th November 2009

The MySQL manual states By default, MySQL uses the latin1 (cp1252 West European) character set and the latin1_swedish_ci collation that sorts according to Swedish/Finnish rules. These defaults are suitable for the United States and most of Western Europe. That's certainly true but only half the story. If you have ever developed PHP or Perl applications, sooner or later you are bound to notice inexplicable hieroglyphics in your website output. Closer inspection of these characters and you'll probably realise it is an apostrophe or an accented letter being rendered incorrectly.

The trick to avoiding this is to ensure you are consistent throughout your development stack with your choice of character sets. To aid this I change the MySQL defaults to UTF-8 General which fits all circumstances I tend to need, and will match the PHP defaults.

The MySQL my.cnf configuration file is where you'll need to look. I use openSUSE and the config file hangs out in /etc. Lets check to be sure.

# whereis my.cnf
Fine, so open this file up with your favourite editor. Find the [client] block and add a default-character-set value as below
socket = /var/lib/mysql/mysql.sock
port = 3306
default-character-set = utf8
Your socket and port values may be different - keep them as they are. Next, add / replace as applicable the following keys to the [mysqld] block. This block will contain many values so for the sake of brevity, only the pertinent ones are listed.
default-character-set = utf8
default-collation = utf8_general_ci
character-set-server = utf8
collation-server = utf8_general_ci
init_connect = 'SET NAMES utf8'
That should do it, but don't forget to restart the mysqld daemon with (on openSUSE)
# /etc/init.d/mysql restart
and check the log with
# tail /var/lib/mysql/mysqld.log
Everything should be ok. If not, retrace your steps and make sure you have followed the instructions correctly.
blog terms