Character Set and Sort Order

Your products are globalized and support Unicode. IBM strongly recommends choosing a Unicode encoding for your database and the most appropriate sort order for your environment. A database character set determines which languages a database can represent. Database sort order determines collation and comparison behavior.
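To make the effect of sort order concrete, the following minimal Python sketch contrasts a binary (code point) ordering with a case-insensitive one. It only imitates the two behaviors in plain Python. The sample words are made up for illustration, and no database is involved.

```python
# Illustrative only: imitates two kinds of sort order in plain Python.
words = ["apple", "Banana", "cherry"]

# Binary ordering compares code points, so uppercase "B" (66) sorts before "a" (97).
binary_order = sorted(words)

# Case-insensitive ordering compares the case-folded form of each word.
case_insensitive_order = sorted(words, key=str.casefold)

print(binary_order)            # ['Banana', 'apple', 'cherry']
print(case_insensitive_order)  # ['apple', 'Banana', 'cherry']
```

The same data therefore sorts and compares differently depending on the sort order that is configured for the database.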

The sections below list the most appropriate Unicode character encoding and sort order for each RDBMS that your products support. If you want to use a different character set or sort order than recommended below, consult your database administrator and your RDBMS vendor's documentation so you can carefully choose a database character set that supports the languages your data is in.

If you use the Database Component Configurator to create your database components, you can check whether the selected RDBMS is configured for the Unicode character set. If the RDBMS does not support Unicode, the configurator lists the character set the RDBMS does support.

Important: You must set the character set and sort order before you create storage.

DB2

Database schemas for DB2 use character data types. DB2 supports UTF-8 for character data types and UTF-16 for graphic data types.

The table below lists the character sets and sort order recommended by IBM.

For                     IBM Recommendation
Character set           CCSID 1208 (UTF-8)
                        IBM My webMethods Server requires this character set.
Graphic character set   UTF-16
Sort order              IDENTITY_16BIT
                        This sort order ensures the same sorting result for both character and graphic data types.

You can check the database configuration using the GET DATABASE CONFIGURATION command.
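As a rough sketch of how that check could be automated, the Python below parses text captured from GET DATABASE CONFIGURATION and compares three settings with the recommendations above. The line labels ("Database code page", "Database code set", "Database collating sequence") are an assumption about the command's output format, and the sample text is a hypothetical excerpt, not real output from your system.

```python
# Recommended values taken from the DB2 table in this section.
# The setting labels are assumed to match the command's output.
RECOMMENDED = {
    "Database code page": "1208",
    "Database code set": "UTF-8",
    "Database collating sequence": "IDENTITY_16BIT",
}


def parse_db_cfg(output: str) -> dict:
    """Collect 'name = value' lines from captured configuration output."""
    settings = {}
    for line in output.splitlines():
        name, separator, value = line.partition("=")
        if separator:
            settings[name.strip()] = value.strip()
    return settings


def find_mismatches(settings: dict) -> dict:
    """Return {name: (expected, actual)} for settings that differ."""
    return {
        name: (expected, settings.get(name))
        for name, expected in RECOMMENDED.items()
        if settings.get(name) != expected
    }


# Hypothetical excerpt for illustration; real output has many more lines.
SAMPLE_OUTPUT = """
 Database territory                                      = US
 Database code page                                      = 1208
 Database code set                                       = UTF-8
 Database collating sequence                             = IDENTITY
"""

print(find_mismatches(parse_db_cfg(SAMPLE_OUTPUT)))
```

In this made-up sample only the collating sequence is reported, because it is IDENTITY rather than the recommended IDENTITY_16BIT.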

MySQL Community Edition and Enterprise Edition

The server character set and collation are used as default values if the database character set and collation are not specified in CREATE DATABASE statements. They have no other purpose.

You can determine the current server character set and collation settings from the values of the character_set_server and collation_server system variables. You can change these variables at runtime.

The table below lists the character set and sort order recommended by IBM.

For             IBM Recommendation
Character set   UTF-8
Collation       utf8_general_ci

You can check the database configuration using the SHOW VARIABLES LIKE command.
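Because the server settings are only defaults, one option is to state the character set and collation explicitly when the database is created. The Python sketch below only builds the SQL text and does not connect to a server. The database name wmdb and the helper name are hypothetical, and utf8 is assumed to be the MySQL character set name that corresponds to the recommended utf8_general_ci collation.

```python
import re

# Statements for inspecting the server-level defaults.
CHECK_STATEMENTS = [
    "SHOW VARIABLES LIKE 'character_set_server';",
    "SHOW VARIABLES LIKE 'collation_server';",
]


def create_database_sql(name: str,
                        charset: str = "utf8",
                        collation: str = "utf8_general_ci") -> str:
    """Build a CREATE DATABASE statement with an explicit character set and collation."""
    # Accept only simple names so that nothing unexpected ends up in the SQL text.
    for part in (name, charset, collation):
        if not re.fullmatch(r"[A-Za-z0-9_]+", part):
            raise ValueError(f"unexpected characters in {part!r}")
    return f"CREATE DATABASE `{name}` CHARACTER SET {charset} COLLATE {collation};"


# "wmdb" is a made-up database name used only for illustration.
print(create_database_sql("wmdb"))
for statement in CHECK_STATEMENTS:
    print(statement)
```

Stating both values in the CREATE DATABASE statement means the result does not depend on whatever the server defaults happen to be.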

Oracle

Database schemas for Oracle use character data types. For character data types, Oracle supports the UTF8 and AL32UTF8 Unicode encodings. While UTF8 is CESU-8 compliant and supports the Unicode 3.0 UTF-8 Universal character set, AL32UTF8 conforms to the Unicode 3.1 or higher UTF-8 Universal character set. For nchar data types, Oracle supports the AL32UTF8 and AL16UTF16 Unicode encodings. The supported Unicode version for AL32UTF8 depends on the Oracle database version. Oracle database schemas for your products do not have linguistic indexes.
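The difference between the two encodings matters mainly for supplementary characters (code points above U+FFFF). The Python sketch below is an illustration, not Oracle code: it emulates CESU-8 by encoding each UTF-16 code unit separately and compares the result with standard UTF-8. The function name cesu8_encode is hypothetical.

```python
import struct


def cesu8_encode(text: str) -> bytes:
    """Emulate CESU-8: encode every UTF-16 code unit as its own UTF-8 sequence."""
    utf16 = text.encode("utf-16-be")
    units = struct.unpack(f">{len(utf16) // 2}H", utf16)
    # "surrogatepass" lets Python encode lone surrogates, which CESU-8 requires.
    return b"".join(chr(unit).encode("utf-8", "surrogatepass") for unit in units)


bmp_char = "\u20ac"             # EURO SIGN, inside the Basic Multilingual Plane
supplementary = "\U0001F600"    # an emoji, outside the Basic Multilingual Plane

# Characters in the Basic Multilingual Plane are encoded identically.
print(len(bmp_char.encode("utf-8")), len(cesu8_encode(bmp_char)))            # 3 3

# A supplementary character takes 4 bytes in UTF-8 but 6 bytes in CESU-8.
print(len(supplementary.encode("utf-8")), len(cesu8_encode(supplementary)))  # 4 6
```

Data that contains supplementary characters is therefore stored differently, and can need more bytes, under a CESU-8 style encoding than under standard UTF-8.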

The table below lists the character sets and sort order recommended by IBM.

For                   IBM Recommendation
Character set         AL32UTF8
Nchar character set   AL16UTF16
Sort order            Binary

You can check the database configuration and session settings by querying the SYS.NLS_DATABASE_PARAMETERS or V$NLS_PARAMETERS view.
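As a minimal sketch of such a check, the Python below holds a query against NLS_DATABASE_PARAMETERS and compares its rows with the recommendations in the table above. It does not connect to Oracle; the rows are passed in as (parameter, value) pairs, and the sample rows and the function name are hypothetical.

```python
# Query text to run with the client of your choice; it is not executed here.
NLS_QUERY = (
    "SELECT parameter, value FROM nls_database_parameters "
    "WHERE parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET', 'NLS_SORT')"
)

# Recommended values taken from the Oracle table in this section.
RECOMMENDED_NLS = {
    "NLS_CHARACTERSET": "AL32UTF8",
    "NLS_NCHAR_CHARACTERSET": "AL16UTF16",
    "NLS_SORT": "BINARY",
}


def check_nls(rows) -> list:
    """Return a message for each recommended parameter that is missing or different."""
    actual = {parameter.upper(): value.upper() for parameter, value in rows}
    problems = []
    for parameter, expected in RECOMMENDED_NLS.items():
        found = actual.get(parameter)
        if found != expected:
            problems.append(f"{parameter}: expected {expected}, found {found}")
    return problems


# Hypothetical rows, as the query might return them.
sample_rows = [
    ("NLS_CHARACTERSET", "WE8MSWIN1252"),
    ("NLS_NCHAR_CHARACTERSET", "AL16UTF16"),
    ("NLS_SORT", "BINARY"),
]
print(check_nls(sample_rows))
```

An empty list means that all three settings match the recommendation.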

PostgreSQL

PostgreSQL uses UTF-8 encoding by default.

SQL Server

Database schemas for SQL Server use nchar data types. SQL Server provides support for UTF-16 through its nchar data types. Since nchar data types are always in UTF-16, you do not have to perform any special database configuration.

Some products, such as Process Engine, require a double-byte character set (DBCS). Choose the most appropriate code page for your environment as a database character set.

The table below lists the character sets and sort order recommended by IBM.

For                   IBM Recommendation
Character set         The appropriate encoding for the languages your data is in.
Nchar character set   UTF-16
Sort order            Any case-insensitive collation type.
                      If you do not choose a case-insensitive sort order, you will not be able to create some database components in SQL Server.

You can check the database configuration using the sp_helpdb system stored procedure.
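Once you know the collation name, a quick heuristic can tell whether it is case-insensitive. The Python sketch below relies on the SQL Server naming convention in which case-insensitive collations carry a CI token (for example, Latin1_General_CI_AS) and case-sensitive ones carry CS or a binary suffix. It is a naming heuristic with a hypothetical function name, not an official check.

```python
def is_case_insensitive(collation: str) -> bool:
    """Heuristic: a collation name with a 'CI' token is treated as case-insensitive."""
    return "CI" in collation.upper().split("_")


# Collation names used only as examples.
for name in ("Latin1_General_CI_AS", "Latin1_General_CS_AS", "Latin1_General_BIN2"):
    print(name, is_case_insensitive(name))
```

Only the first of the three example names passes the check, so only that one would meet the sort order recommendation above.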