Differences between Unicode and EBCDIC sorting sequences

In Unicode, numeric characters are sorted before alphabetic characters. In EBCDIC, alphabetic characters are sorted before numeric characters.

Because the DB2® catalog is stored in Unicode, any queries that you issue against Unicode tables in the catalog use the Unicode sorting sequence.

Also, consider any SQL statements that include syntax that requires that the data be sorted. Examples of such syntax include the GROUP BY clause, range predicates such as BETWEEN, and functions such as MIN and MAX. These statements might return different results when they are issued on Unicode data than on EBCDIC data.

The following table shows some example encoding differences to consider when specifying these clauses, predicates, and functions in your SQL statements.

Table 1. Example encoding differences
EBCDIC Unicode and ASCII
Characters Hexadecimal value Characters Hexadecimal value
space X'40' space X'20'
lowercase characters
X'81' - X'89'
X'91' - X'99'
X'A1' - X'A9'
numerals X'30' - X'39'
uppercase characters
X'C1' - X'C9'
X'D1' - X'D9'
X'E1' - X'E9'
uppercase characters
X'40' - X'4F'
X'50' - X5A'
numerals X'F0' - X'F9' lowercase characters
X'61' - X'6F'
X'70' - X7A'
Equal predicates are not affected by the different sorting sequences.

Examples

Begin general-use programming interface information.

For the following examples, assume that a table called MYTABLES has a NAME column that is type VARCHAR(128). This column contains the following values: TEST1, TEST2, TEST3, TESTA, TESTB, and TESTC.

Example query with ORDER BY: Suppose that you issue the following SQL query:
SELECT NAME FROM MYTABLES
ORDER BY NAME

If MYTABLES is encoded in Unicode, DB2 returns the following result:

TEST1
TEST2
TEST3
TESTA
TESTB
TESTC

If MYTABLES is encoded in EBCDIC, DB2 returns the following result:

TESTA
TESTB
TESTC
TEST1
TEST2
TEST3
Example of query with ORDER BY and BETWEEN predicate: Assume that you issue the following SQL query:
SELECT * FROM MYTABLES
WHERE NAME BETWEEN 'TEST2' AND 'TESTB'
ORDER BY NAME

If MYTABLES is encoded in Unicode, DB2 returns the following result:

TEST3
TESTA

If MYTABLES is encoded in EBCDIC, DB2 returns the following result:

TESTC
TEST1

To simulate the behavior of the ORDER BY clause on EBCDIC data, use the CAST function and the ORDER BY clause when you query the DB2 catalog or other Unicode data.

Example of simulating the EBCDIC sorting sequence: Suppose that MYTABLES is encoded in Unicode. You can modify the preceding query as follows to return the Unicode data in the same order that you would expect for EBCDIC data:
SELECT CAST(NAME AS VARCHAR(128) CCSID EBCDIC) AS E_NAME
FROM MYTABLES
ORDER BY E_NAME

DB2 returns the following result:

TESTA
TESTB
TESTC
TEST1
TEST2
TEST3

However, be aware that, in this situation, DB2 cannot use an index to support the ORDER BY clause. DB2 must sort the data.

You can also apply this same technique to a catalog table in UTF-8, as shown in the following example
SELECT CAST(NAME AS VARCHAR(128) CCSID EBCDIC) AS E_NAME
FROM SYSIBM.SYSTABLES
ORDER BY E_NAME
If the NAMES column of SYSIBM.SYSTABLES contains the values TEST1, TEST2, TEST3, TESTA, TESTB, and TESTC, DB2 returns the following result:
TESTA
TESTB
TESTC
TEST1
TEST2
TEST3
End general-use programming interface information.