mysqlに郵便番号データをつっこむ(文字化け解消)

mysql> create database zip;
Query OK, 1 row affected (0.02 sec)

mysql> 
mysql> use zip
Database changed
mysql> 
mysql> 
mysql> CREATE TABLE zip (
    ->     publicCode      int          NOT NULL ,
    ->     OldZipcode      varchar(5)   NOT NULL ,
    ->     ZipCode         varchar(7)   NOT NULL ,
    ->     PrefectureKana  varchar(30)  NOT NULL ,
    ->     CityKana        varchar(300) NOT NULL ,
    ->     TownKana        varchar(300) NOT NULL ,
    ->     Prefecture      varchar(30)  NOT NULL ,
    ->     City            varchar(300) NOT NULL ,
    ->     Town            varchar(300) NOT NULL ,
    ->     TownDivideFlg   int          NOT NULL ,
    ->     KoazaBranchFlg  int          NOT NULL ,
    ->     ChomeFlg        int          NOT NULL ,
    ->     HasSomeTownsFlg int          NOT NULL ,
    ->     UpdateStatus    int          NOT NULL ,
    ->     UpdateReaso     int          NOT NULL 
    -> )
    -> ;
Query OK, 0 rows affected (0.05 sec)

mysql> select * from zip;
Empty set (0.01 sec)

mysql> 


つっぱしってみるw

mysql> 
mysql> use zip
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> 
mysql> 
mysql> show tables;
+---------------+
| Tables_in_zip |
+---------------+
| zip           | 
+---------------+
1 row in set (0.00 sec)

mysql> 
mysql> LOAD DATA LOCAL INFILE './ken_all.utf8.csv'
    -> INTO TABLE `zip`
    -> FIELDS TERMINATED BY ','
    -> OPTIONALLY ENCLOSED BY '"'
    -> ;
Query OK, 123314 rows affected (1.51 sec)
Records: 123314  Deleted: 0  Skipped: 0  Warnings: 0

mysql> select count(*) from zip;
+----------+
| count(*) |
+----------+
|   123314 | 
+----------+
1 row in set (0.00 sec)

mysql> select city,town from zip where zipcode like '2030013';
+-------+------+
| city  | town |
+-------+------+
| ????? | ???  | 
+-------+------+
1 row in set (0.23 sec)

mysql> 

ありゃ、闇くもにつっぱしったら、文字ばけた。

あれあれ

[hirasawa@cent5-64b-40 etc]$ pwd
/etc
[hirasawa@cent5-64b-40 etc]$ 
[hirasawa@cent5-64b-40 etc]$ ls -ltr | grep my
-rw-r--r--  1 root   root        441  5月 10  2011 my.cnf.orig
-rw-r--r--  1 root   root        469 10月  3  2011 my.cnf
[hirasawa@cent5-64b-40 etc]$ 
[hirasawa@cent5-64b-40 etc]$ cat my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Default to using old password format for compatibility with mysql 3.x
# clients (those using the mysqlclient10 compatibility package).
old_passwords=1

# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
# symbolic-links=0
character-set-server = utf8

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid


[hirasawa@cent5-64b-40 etc]$ diff ./my.cnf.orig my.cnf
11a12
> character-set-server = utf8
[hirasawa@cent5-64b-40 etc]$

なにがわるい?つーわけでぐぐってみてここを発見
http://linuxserver.jp/%E3%82%B5%E3%83%BC%E3%83%90%E6%A7%8B%E7%AF%89/DB/MySQL/%E6%96%87%E5%AD%97%E5%8C%96%E3%81%91%E8%A7%A3%E6%B6%88.php
MySQLの文字化けを解消する

my.cnfを次にようにかきかえる

[root@cent6-64-39 etc]# cat my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Default to using old password format for compatibility with mysql 3.x
# clients (those using the mysqlclient10 compatibility package).
old_passwords=1

# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
# symbolic-links=0

default-character-set = utf8
skip-character-set-client-handshake
character-set-server = utf8
collation-server = utf8_general_ci
init-connect = SET NAMES utf8

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
[root@cent6-64-39 etc]# 
mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     | 
| character_set_connection | latin1                     | 
| character_set_database   | utf8                       | 
| character_set_filesystem | binary                     | 
| character_set_results    | latin1                     | 
| character_set_server     | utf8                       | 
| character_set_system     | utf8                       | 
| character_sets_dir       | /usr/share/mysql/charsets/ | 
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

mysql> 

latin1あたりをutf8に変更してmysql再起動

[root@cent5-64b-40 ~]# diff /etc/my.cnf.orig /etc/my.cnf
12a13,18
> default-character-set = utf8
> skip-character-set-client-handshake
> character-set-server = utf8
> collation-server = utf8_general_ci
> init-connect = SET NAMES utf8
> 
[root@cent5-64b-40 ~]#

[root@cent5-64b-40 ~]# /etc/init.d/mysqld restart
MySQL を停止中:                                            [  OK  ]
MySQL を起動中:                                            [  OK  ]
[root@cent5-64b-40 ~]# 

mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       | 
| character_set_connection | utf8                       | 
| character_set_database   | utf8                       | 
| character_set_filesystem | binary                     | 
| character_set_results    | utf8                       | 
| character_set_server     | utf8                       | 
| character_set_system     | utf8                       | 
| character_sets_dir       | /usr/share/mysql/charsets/ | 
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

mysql> 


mysql> select city,town from zip where zipcode = '2030013';
+-----------------+-----------+
| city            | town      |
+-----------------+-----------+
| 東久留米市 | 新川町 | 
+-----------------+-----------+
1 row in set (0.18 sec)

mysql> 

ヨッシャー

いかトラブったときに参考になるとおもいます
http://blog.goo.ne.jp/puri69/e/0d7c07c8765e3b83ba479ffd6ec9bce0