MySQL Unicode guide
Normally, a MySQL database created from the command line does not come with Unicode support by default. To create one that does, log in first:

    mysql -u root -p

Enter the password when prompted, then create the database with a Unicode character set:

    CREATE DATABASE `ten_db` CHARACTER SET utf8 COLLATE utf8_general_ci;

After these commands the database exists with Unicode support. Privileges on it should be granted to a user at the same time for security. Note that in MySQL, utf8 is an alias for the 3-byte utf8mb3 character set, which cannot store 4-byte characters such as emoji.

MySQL / MariaDB offer many collations to choose from when creating a database. Which collation is recommended for software projects that use MySQL / MariaDB? It is utf8mb4_unicode_ci.

Creating a database with the utf8mb4_unicode_ci collation in MySQL / MariaDB

To create the database, first connect to the MySQL / MariaDB server:

    sudo mysql -u root -p

Then use a CREATE DATABASE statement. For example:

    MariaDB [(none)]> CREATE DATABASE vinasupport CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Checking the collation of a database

After creating the database, check its collation with the following SQL statements:

    MariaDB [(none)]> use vinasupport;
    MariaDB [vinasupport]> SELECT @@character_set_database, @@collation_database;

The result should show utf8mb4 as the character set and utf8mb4_unicode_ci as the collation.

Why use utf8mb4_unicode_ci to create a database in MySQL / MariaDB?

There are many reasons; the advantages are as follows:
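The creation, privilege, and verification steps described in this walkthrough can be sketched together as follows. This is a minimal sketch rather than the original guide's exact commands: the user name app_user, the host localhost, and the password are hypothetical placeholders, and only the database name vinasupport comes from the example.

```sql
-- Create the database with full Unicode support (name taken from the example).
CREATE DATABASE IF NOT EXISTS vinasupport
  CHARACTER SET utf8mb4
  COLLATE utf8mb4_unicode_ci;

-- Hypothetical application user; replace the name, host, and password.
CREATE USER IF NOT EXISTS 'app_user'@'localhost' IDENTIFIED BY 'change_me';

-- Grant privileges on this one database only, not on *.*.
GRANT ALL PRIVILEGES ON vinasupport.* TO 'app_user'@'localhost';

-- Confirm the character set and collation recorded for the database.
SELECT SCHEMA_NAME, DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM INFORMATION_SCHEMA.SCHEMATA
WHERE SCHEMA_NAME = 'vinasupport';
```

Scoping the grant to vinasupport.* keeps the application account from touching other schemas on the same server.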
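Source: vinasupport.com

10.10.1 Unicode Character Sets

This section describes the collations available for Unicode character sets and their differentiating properties. For general information about Unicode, see Section 10.9, “Unicode Support”.

MySQL supports multiple Unicode character sets:

    utf8mb4: a UTF-8 encoding using one to four bytes per character
    utf8mb3: a UTF-8 encoding using one to three bytes per character
    utf8: an alias for utf8mb3
    ucs2: the UCS-2 encoding, using two bytes per character
    utf16: the UTF-16 encoding; like ucs2 but with an extension for supplementary characters
    utf16le: UTF-16LE, which is like utf16 but little-endian rather than big-endian
    utf32: the UTF-32 encoding, using four bytes per character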
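Note: The utf8mb3 character set is deprecated; use utf8mb4 instead. To avoid ambiguity about the meaning of utf8, consider specifying utf8mb4 explicitly for character set references.

Most Unicode character sets have a general collation (indicated by _general in the name or by the absence of a language specifier). Most character sets have a single binary collation. Collation support for utf16le is limited: the only collations available are utf16le_general_ci and utf16le_bin.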
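Unicode Collation Algorithm (UCA) Versions

MySQL implements the xxx_unicode_ci collations according to the Unicode Collation Algorithm (UCA), using the version 4.0.0 UCA weight keys. Unicode collations based on UCA versions higher than 4.0.0 include the version in the collation name. Examples:

    utf8mb4_unicode_520_ci is based on UCA 5.2.0 weight keys.
    utf8mb4_0900_ai_ci is based on UCA 9.0.0 weight keys.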
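To see which versioned collations a particular server actually offers, the collation catalog can be queried. The following is a minimal sketch that assumes the utf8mb4 character set and matches on the version fragments 520 and 0900 in the collation name; on MariaDB or on MySQL versions before 8.0 the 0900 names may be absent.

```sql
-- List utf8mb4 collations whose names carry a UCA version marker.
SELECT COLLATION_NAME
FROM INFORMATION_SCHEMA.COLLATIONS
WHERE CHARACTER_SET_NAME = 'utf8mb4'
  AND (COLLATION_NAME LIKE '%520%' OR COLLATION_NAME LIKE '%0900%')
ORDER BY COLLATION_NAME;
```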
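Collation Pad Attributes

Collations based on UCA 9.0.0 and higher are faster than collations based on UCA versions prior to 9.0.0. They also have a pad attribute of NO PAD, in contrast to PAD SPACE as used in collations based on UCA versions prior to 9.0.0. For comparison of nonbinary strings, NO PAD collations treat spaces at the end of strings like any other character.

To determine the pad attribute for a collation, use the INFORMATION_SCHEMA COLLATIONS table, which has a PAD_ATTRIBUTE column.

Comparison of nonbinary string values (CHAR, VARCHAR, and TEXT) that have a NO PAD collation differs from other collations with respect to trailing spaces. For example, 'a' and 'a ' compare as different strings, not the same string.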
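The pad attribute and its effect on trailing spaces can be checked directly. This is a minimal sketch assuming MySQL 8.0, where the PAD_ATTRIBUTE column and the utf8mb4_0900_ai_ci collation exist; the three collation names are chosen only for illustration.

```sql
-- Look up the pad attribute of a few collations.
SELECT COLLATION_NAME, PAD_ATTRIBUTE
FROM INFORMATION_SCHEMA.COLLATIONS
WHERE COLLATION_NAME IN
  ('utf8mb4_0900_ai_ci', 'utf8mb4_unicode_ci', 'utf8mb4_general_ci');

-- Make the string literals utf8mb4 so the COLLATE clauses below are valid.
SET NAMES utf8mb4;

-- Compare a string with and without a trailing space under each pad attribute.
SELECT 'a' = 'a ' COLLATE utf8mb4_0900_ai_ci AS no_pad_equal,    -- expected 0 (NO PAD)
       'a' = 'a ' COLLATE utf8mb4_unicode_ci AS pad_space_equal; -- expected 1 (PAD SPACE)
```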
Language-Specific Collations

MySQL implements language-specific Unicode collations if the ordering based only on the Unicode Collation Algorithm (UCA) does not work well for a language. Language-specific collations are UCA-based, with additional language tailoring rules. Examples of such rules appear later in this section. For questions about particular language orderings, unicode.org provides Common Locale Data Repository (CLDR) collation charts at http://www.unicode.org/cldr/charts/30/collation/index.html.

For example, the nonlanguage-specific utf8mb4_0900_ai_ci and the language-specific utf8mb4_LOCALE_0900_ai_ci collations are both based on UCA 9.0.0 and CLDR v30, and both are accent-insensitive and case-insensitive.
A collation name that includes a locale code or language name shown in the following table is a language-specific collation. Unicode character sets may include collations for one or more of these languages.

Table 10.3 Unicode Collation Language Specifiers
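MySQL 8.0.30 and later provides the Bulgarian collations utf8mb4_bg_0900_ai_ci and utf8mb4_bg_0900_as_cs.

Croatian collations are tailored for these Croatian letters: Č, Ć, Dž, Đ, Lj, Nj, Š, Ž.

MySQL 8.0.30 and later provides the utf8mb4_sr_latn_0900_ai_ci and utf8mb4_sr_latn_0900_as_cs collations for Serbian and the utf8mb4_bs_0900_ai_ci and utf8mb4_bs_0900_as_cs collations for Bosnian, when these languages are written with the Latin alphabet.

Beginning with MySQL 8.0.30, MySQL provides collations for both major varieties of Norwegian: for Bokmål, you can use utf8mb4_nb_0900_ai_ci and utf8mb4_nb_0900_as_cs; for Nynorsk, utf8mb4_nn_0900_ai_ci and utf8mb4_nn_0900_as_cs.

For Japanese, the utf8mb4 character set includes the utf8mb4_ja_0900_as_cs and utf8mb4_ja_0900_as_cs_ks collations.

For Classical Latin collations that are accent-insensitive, I and J compare as equal, and U and V compare as equal.

MySQL 8.0.30 and later provides collations for the Mongolian language when written with Cyrillic characters, utf8mb4_mn_cyrl_0900_ai_ci and utf8mb4_mn_cyrl_0900_as_cs.

Spanish collations are available for modern and traditional Spanish. For both, ñ (n-tilde) is a separate letter between n and o. In addition, for traditional Spanish, ch is a separate letter between c and d, and ll is a separate letter between l and m. Traditional Spanish collations may also be used for Asturian and Galician. Beginning with MySQL 8.0.30, MySQL also provides utf8mb4_gl_0900_ai_ci and utf8mb4_gl_0900_as_cs collations for Galician.

Swedish collations include Swedish rules. For example, in Swedish, the following relationship holds, which is not something expected by a German or French speaker:

    Ü = Y < Ö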
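_general_ci Versus _unicode_ci Collations

For any Unicode character set, operations performed using the xxx_general_ci collation are faster than those for the xxx_unicode_ci collation. For example, comparisons for the utf8mb4_general_ci collation are faster, but slightly less correct, than comparisons for utf8mb4_unicode_ci. The reason is that utf8mb4_unicode_ci supports mappings such as expansions, where one character compares as equal to a combination of other characters. For example, ß is equal to ss in German and some other languages. utf8mb4_unicode_ci also supports contractions and ignorable characters. utf8mb4_general_ci is a legacy collation that supports none of these; it can make only one-to-one comparisons between characters.

To further illustrate, the following equalities hold in both utf8mb4_general_ci and utf8mb4_unicode_ci:

    Ä = A
    Ö = O
    Ü = U

A difference between the collations is that this is true for utf8mb4_general_ci:

    ß = s

Whereas this is true for utf8mb4_unicode_ci, which supports the German DIN-1 ordering (also known as dictionary order):

    ß = ss

MySQL implements language-specific Unicode collations if the ordering with utf8mb4_unicode_ci does not work well for a language. For example, utf8mb4_unicode_ci works fine for German dictionary order and French, so there is no need to create special utf8mb4 collations for them. utf8mb4_general_ci is also satisfactory for both German and French, except that ß is equal to s, and not to ss. If that is acceptable for your application, use utf8mb4_general_ci because it is faster. If it is not acceptable (for example, if you require German dictionary order), use utf8mb4_unicode_ci because it is more accurate.

If you require German DIN-2 (phone book) ordering, use the utf8mb4_german2_ci collation, which compares the following sets of characters as equal:

    Ä = Æ = AE
    Ö = Œ = OE
    Ü = UE
    ß = ss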
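The difference between these collations can be observed with a few literal comparisons. This is a minimal sketch assuming a client that sends UTF-8 text and a server that provides the three utf8mb4 collations named here; the column aliases are arbitrary.

```sql
-- Make the string literals utf8mb4 so the COLLATE clauses below are valid.
SET NAMES utf8mb4;

SELECT
  'ß' = 's'  COLLATE utf8mb4_general_ci AS general_s,    -- expected 1: one-to-one mapping
  'ß' = 'ss' COLLATE utf8mb4_general_ci AS general_ss,   -- expected 0: no expansions
  'ß' = 's'  COLLATE utf8mb4_unicode_ci AS unicode_s,    -- expected 0
  'ß' = 'ss' COLLATE utf8mb4_unicode_ci AS unicode_ss,   -- expected 1: expansion supported
  'Ü' = 'UE' COLLATE utf8mb4_german2_ci AS german2_ue;   -- expected 1: DIN-2 phone book rule
```

If the general_ss result matters to the application, for example when searching German names, that is a reason to prefer utf8mb4_unicode_ci over utf8mb4_general_ci despite the speed difference.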
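Character Collating Weights

A character's collating weight is determined as follows:

    For all Unicode collations except the _bin (binary) collations, MySQL performs a table lookup to find a character's collating weight.
    For _bin collations except utf8mb4_0900_bin, the weight is based on the code point, possibly with leading zero bytes added.
    For utf8mb4_0900_bin, the weight is the utf8mb4 encoding bytes. The sort order is the same as for utf8mb4_bin, but much faster.

Collating weights can be displayed using the WEIGHT_STRING() function.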
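As an illustration of inspecting weights, the sketch below prints the weight strings of a lowercase and an uppercase letter in hexadecimal. It assumes the utf8mb4_unicode_ci and utf8mb4_bin collations are available; because utf8mb4_unicode_ci is case-insensitive, the first two columns should be identical, while the binary collation yields a weight derived from the code point.

```sql
-- Make the string literals utf8mb4 so the COLLATE clauses below are valid.
SET NAMES utf8mb4;

SELECT
  HEX(WEIGHT_STRING('a' COLLATE utf8mb4_unicode_ci)) AS weight_lower,
  HEX(WEIGHT_STRING('A' COLLATE utf8mb4_unicode_ci)) AS weight_upper,
  HEX(WEIGHT_STRING('a' COLLATE utf8mb4_bin))        AS weight_binary;
```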
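There is a difference between “ordering by the character's code value” and “ordering by the character's binary representation,” a difference that appears only with utf16_bin, because of surrogates.

Suppose that utf16_bin (the binary collation for utf16) were a binary comparison “byte by byte” rather than “character by character.” If that were so, the order of characters in utf16_bin would differ from the order in utf8mb4_bin. For example, the following chart shows two rare characters. The first character is in the range E000-FFFF, so it is greater than a surrogate but less than a supplementary character. The second character is a supplementary character.

    Code point  Character                    utf8mb4      utf16
    ----------  ---------                    -------      -----
    0FF9D       HALFWIDTH KATAKANA LETTER N  EF BE 9D     FF 9D
    10384       UGARITIC LETTER DELTA        F0 90 8E 84  D8 00 DF 84

The two characters in the chart are in order by code point value because 0xff9d < 0x10384. They are also in order by utf8mb4 value because 0xef < 0xf0. But they are not in order by utf16 value if one uses 8-bit comparisons, because 0xff > 0xd8.

So MySQL's utf16_bin collation is not “byte by byte.” It is “by code point.” When MySQL sees a supplementary-character encoding in utf16, it converts it to the character's code point value and then compares. Therefore, utf8mb4_bin and utf16_bin have the same ordering.

If the character set is ucs2, comparison is byte by byte, but ucs2 strings should not contain surrogates anyway.

Miscellaneous Information

The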