How collating sequences determine sort orders
A collating sequence determines the sort order of the characters in a coded character set.
A character set is the aggregate of characters that are used in a computer system or programming language. In a coded character set, each character is assigned a different number within the range of 0 to 255 (or the hexadecimal equivalent thereof). These numbers are called code points; the assignments of numbers to characters in a set are collectively called a code page.
In addition to being assigned to a character, a code point can be mapped to the character's position in a sort order. In technical terms, then, a collating sequence is the collective mapping of a character set's code points to the sort order positions of the set's characters. A character's position is represented by a number; this number is called the weight of the character. In the simplest collating sequence, called an identity sequence, the weights are identical to the code points.
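The weight-equals-code-point idea behind an identity sequence can be sketched in a few lines of Python. This is an illustration only, not part of any database product:

```python
# In an identity sequence, each character's weight is its code point.
def identity_weight(ch: str) -> int:
    return ord(ch)  # the code point doubles as the sort weight

# Sorting by identity weights is simply sorting by code points:
print(sorted("bB1a", key=identity_weight))  # ['1', 'B', 'a', 'b']
```

Here the digit sorts first and the uppercase letter before the lowercase ones, because that is the order of their code points in this character set.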
Example: Database ALPHA uses the default collating sequence of the EBCDIC code page. Database BETA uses the default collating sequence of the ASCII code page. Sort orders for character strings at these two databases would differ:
SELECT.....
ORDER BY COL2
EBCDIC-Based Sort    ASCII-Based Sort
COL2                 COL2
----                 ----
V1G                  7AB
Y2W                  V1G
7AB                  Y2W
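Under an identity collating sequence, this difference can be reproduced by sorting on each code page's raw byte values. A minimal Python sketch, using the standard-library cp037 codec as a stand-in EBCDIC code page (the exact EBCDIC variant a given database uses may differ):

```python
rows = ["V1G", "Y2W", "7AB"]

# Bytes compare lexicographically, so encoding to a code page and
# sorting the raw bytes applies that code page's identity sequence.
ebcdic_order = sorted(rows, key=lambda s: s.encode("cp037"))  # EBCDIC (US)
ascii_order = sorted(rows, key=lambda s: s.encode("ascii"))

print(ebcdic_order)  # ['V1G', 'Y2W', '7AB'] -- digits sort after letters
print(ascii_order)   # ['7AB', 'V1G', 'Y2W'] -- digits sort before letters
```

In EBCDIC the digits occupy higher code points than the letters, so '7AB' sorts last; in ASCII the digits occupy lower code points, so '7AB' sorts first.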
Example: Similarly, character comparisons in a database depend on the collating sequence defined for that database. Database ALPHA uses the default collating sequence of the EBCDIC code page. Database BETA uses the default collating sequence of the ASCII code page. Character comparisons at these two databases would yield different results:
SELECT.....
WHERE COL2 > 'TT3'
EBCDIC-Based Results    ASCII-Based Results
COL2                    COL2
----                    ----
TW4                     TW4
X82                     X82
39G
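The same mechanics explain the comparison results. A sketch under the same assumptions as above (cp037 as the EBCDIC code page, and a hypothetical table holding just the three values shown):

```python
rows = ["TW4", "X82", "39G"]
threshold = "TT3"

def greater_than(value: str, bound: str, codepage: str) -> bool:
    # Compare the raw encoded bytes, i.e. the code-page weights.
    return value.encode(codepage) > bound.encode(codepage)

ebcdic_hits = [r for r in rows if greater_than(r, threshold, "cp037")]
ascii_hits = [r for r in rows if greater_than(r, threshold, "ascii")]

print(ebcdic_hits)  # ['TW4', 'X82', '39G'] -- '3' outweighs 'T' in EBCDIC
print(ascii_hits)   # ['TW4', 'X82']        -- '3' precedes 'T' in ASCII
```

'39G' satisfies the predicate only at the EBCDIC-based database, because the digit '3' has a higher weight than the letter 'T' in EBCDIC but a lower one in ASCII.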