数据库编码与词语定序

使用现代编码和定序驱动你的应用程序

如果你安装的数据库版本为 MySQL 8.0,那么其默认的数据库编码为 utf8mb4,默认的 collation(词语定序)为 utf8mb4_0900_ai_ci,使用命令

SHOW VARIABLES WHERE Variable_name LIKE 'character_set_%' OR Variable_name LIKE 'collation%';

可以查看到

+--------------------------+-------------------------------------------------------+
| Variable_name            | Value                                                 |
+--------------------------+-------------------------------------------------------+
| character_set_client     | utf8mb4                                               |
| character_set_connection | utf8mb4                                               |
| character_set_database   | utf8mb4                                               |
| character_set_filesystem | binary                                                |
| character_set_results    | utf8mb4                                               |
| character_set_server     | utf8mb4                                               |
| character_set_system     | utf8mb3                                               |
| character_sets_dir       | /usr/local/mysql-8.0.28-macos11-arm64/share/charsets/ |
| collation_connection     | utf8mb4_0900_ai_ci                                    |
| collation_database       | utf8mb4_0900_ai_ci                                    |
| collation_server         | utf8mb4_0900_ai_ci                                    |
+--------------------------+-------------------------------------------------------+

如果你安装的数据库版本为 MySQL 5.7,那么默认的数据库编码为 utf8,默认的 collation 是 utf8_general_ci,同样也可以用命令查看到。

数据库编码对比:

数据库 collation 对比:

要修改数据库编码,如果系统为 Mac OS 或者 Linux,请修改 my.cnf,如果是 Windows 则修改 my.ini 为如下配置:

[client] 
default-character-set = utf8mb4 
[mysql] 
default-character-set = utf8mb4 
[mysqld] 
character-set-client-handshake = FALSE 
character-set-server = utf8mb4 
collation-server = utf8mb4_0900_ai_ci
init_connect='SET NAMES utf8mb4'

如果 MySQL 版本为 5.7,则 collation-server 可以应该设置为 utf8mb4_unicode_520_ci

然后重启数据库即可。

重启之后使用命令:

SHOW VARIABLES WHERE Variable_name LIKE 'character_set_%' OR Variable_name LIKE 'collation%';

得到下表:

+--------------------------+-------------------------------------------------------+
| Variable_name            | Value                                                 |
+--------------------------+-------------------------------------------------------+
| character_set_client     | utf8mb4                                               |
| character_set_connection | utf8mb4                                               |
| character_set_database   | utf8mb4                                               |
| character_set_filesystem | binary                                                |
| character_set_results    | utf8mb4                                               |
| character_set_server     | utf8mb4                                               |
| character_set_system     | utf8mb3                                               |
| character_sets_dir       | /usr/local/mysql-8.0.28-macos11-arm64/share/charsets/ |
| collation_connection     | utf8mb4_0900_ai_ci                                    |
| collation_database       | utf8mb4_0900_ai_ci                                    |
| collation_server         | utf8mb4_0900_ai_ci                                    |
+--------------------------+-------------------------------------------------------+

我们要确认如下字段的值必须为 utf8mb4

如果你已经拥有了一个数据库,那么他的编码可能不会改变,使用命令修改数据库编码:

ALTER DATABASE database_name CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

修改已有表的编码:

ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

最后,连接数据库时 JDBC url 记得使用这两个参数:

jdbc:mysql://localhost:3306/mydatabase?useUnicode=true&character_set_server=utf8mb4

不要使用 characterEncoding 参数!

最后更新于