Sort sequence tables
A sort sequence table is an object that contains the weight of each single-byte graphic character within a specified coded character set identifier (CCSID). The system-recognized identifier for the sort sequence table object type is *TBL.
Depending on your requirements, you can define a table to have either a unique weight for each graphic character or shared weights for some graphic characters. If you define a table that contains unique weights for each character within the character set, your table is known as a unique-weight table. If you define a table that contains some graphic characters that share the same weight, your table is known as a shared-weight table. For example, if you want to sort the graphic character capital letter A and the graphic character small letter a together, you can define a shared-weight table. If you want to sort these graphic characters separately, you can define a unique-weight table.
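The difference between the two kinds of table can be sketched in a few lines of Python. This is a minimal illustration, not how the system implements sort sequences: the two weight dictionaries, the four-character set, and the helper `sort_key` are all hypothetical and are not taken from any table that is included with the system.

```python
# Hypothetical weight tables for a tiny four-character set.
# Real sort sequence tables cover every single-byte character in a CCSID.
UNIQUE_WEIGHTS = {"a": 1, "A": 2, "b": 3, "B": 4}   # every character differs
SHARED_WEIGHTS = {"a": 1, "A": 1, "b": 2, "B": 2}   # case pairs share a weight


def sort_key(text, weights):
    """Translate each character of text to its weight in the given table."""
    return [weights[ch] for ch in text]


words = ["Ba", "ab", "AB", "aB", "ba"]

# Unique weights: "ab", "aB", and "AB" are all distinct and have a fixed order.
unique_sorted = sorted(words, key=lambda w: sort_key(w, UNIQUE_WEIGHTS))

# Shared weights: "ab", "AB", and "aB" compare as equal, so they sort together
# (Python's stable sort keeps them in their original input order).
shared_sorted = sorted(words, key=lambda w: sort_key(w, SHARED_WEIGHTS))

print(unique_sorted)
print(shared_sorted)
```

With the shared-weight dictionary, strings that differ only in case produce identical keys, which is the sense in which capital letter A and small letter a are sorted together.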
A set of sort sequence tables is included with the system. This set of tables defines both unique-weight and shared-weight sort sequences for all single-byte character set (SBCS) languages.
Sort sequence table implementation notes
Sort sequence support does not take the following into account:
- Special cases of single characters that should be handled as multiple characters (such as the German sharp s, ß).
- Sequences of characters that should be treated as a single character (such as the Danish aa, Hungarian ly, Serbian lj, and Spanish ll).
- Nonalphanumeric characters that should be ignored because they are embedded in alphanumeric strings (such as the hyphen in co-op).
- Prefixes that should be ignored (such as Van der in the name Van der Pool).
- Program-described files.
- DBCS code pages.
If a sort sequence table has a weight other than hexadecimal 40 assigned to the blank character, unpredictable results can occur when strings of unequal lengths are compared.
Sort sequence tables included with the system
You can use the Work with Tables (WRKTBL) command to view the contents of the sort sequence tables that are included with the IBM i operating system. The tables are located in the QSYS library.
When looking at these tables, consider the following information:
- Several tables included with the system represent a single sort sequence, each encoded with a different coded character set identifier (CCSID) value. Not all of the characters in a given sort sequence exist in every CCSID in which the sort sequence is encoded.
- Use the language identifier (LANGID) parameter and the sort sequence (SRTSEQ) parameter to access the unique-weight tables (*LANGIDUNQ) or the shared-weight tables (*LANGIDSHR).
- The relative weights shown in these tables are examples only. They differ from the actual weights that are stored in the sort sequence table on the system and applied when the sort sequence is used.
- The relative unique weight of a character is shown by the order of the characters in the sort sequence table. The relative unique weight is determined by assigning a weight of 1 to the first character in the sort sequence table and incrementing by 1 for each of the following characters until the end of the table is reached.
- GCGID is the graphic character global identifier.
For example, the Arabic sort sequence table shows the relative sort sequence weights for characters that are sorted using the Arabic sort sequence table.
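The rule for relative unique weights described in the list above (weight 1 for the first character, incremented by 1 for each following character) can be sketched as follows. The function name `relative_unique_weights` and the four-character excerpt are hypothetical and serve only as an illustration.

```python
def relative_unique_weights(ordered_chars):
    """Assign weight 1 to the first character and add 1 for each one after it."""
    return {ch: weight for weight, ch in enumerate(ordered_chars, start=1)}


# Hypothetical excerpt: characters listed in the order they appear in a table.
table_order = ["-", ",", ";", ":"]

weights = relative_unique_weights(table_order)
for ch, weight in weights.items():
    print(weight, ch)
```

As noted in the list, these relative weights only describe the ordering; they are not the actual weights stored in the table on the system.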
How to build sort sequence tables
To create a user-defined sort sequence table, copy an existing sort sequence table using the Create Table (CRTTBL) command, and then modify the copy of the table. Table functions allow you to perform the following tasks:
- Use a definition stored in a source member.
- Create a table based on another sort sequence table using an interactive interface.
You can create a sort sequence table (MYTEST) from a copy of an existing table using the following CRTTBL command:
CRTTBL TBL(MYTEST) SRCFILE(*PROMPT) TBLTYPE(*SRTSEQ) BASESRTSEQ(QSYS/QLA10025S) CCSID(037)
This command displays a sort sequence table that you can modify. You create your table by pressing a function key on this display. The resulting table has a coded character set identifier (CCSID) value of 00037. The table is named MYTEST and is stored in the current library.
The following table shows one way in which the resulting characters may be shown on the first display of the MYTEST sort sequence table. The actual panel shows characters instead of text descriptions. For example, the character shown for sequence 0100 is a question mark (?), and the character shown for sequence 0070 is a colon (:).
Sequence | Character |
---|---|
0010 | Equal sign |
0020 | Overline |
0030 | (SHY) |
0040 | Hyphen |
0050 | Comma |
0060 | Semicolon |
0070 | Colon |
0080 | Exclamation mark |
0090 | Inverted exclamation mark |
0100 | Question mark |
0110 | Inverted question mark |
0120 | Slash |
0130 | Period |
0140 | Acute accent mark |
0150 | Grave accent mark |
0160 | Caret |
0170 | Right square bracket |
0180 | Tilde |
0190 | Small multiply dot |
0200 | Comma |
You can change the tables to move the characters in each code page to the preferred position for the national language sort sequence table. Sequence values increase in increments of 10: the first value is 10, the next is 20, and so on. Characters that share a weight are assigned the same sequence value.
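The numbering scheme can be sketched in Python as follows. This is only an illustration of the increments-of-10 rule and of shared sequence values; the function `assign_sequences` and the sample groups are hypothetical and do not reproduce the contents of any table on the system.

```python
def assign_sequences(groups, step=10):
    """Give each group of characters the next sequence value (10, 20, ...).

    Characters in the same group share a weight, so they receive the same
    sequence value.
    """
    sequences = {}
    for position, group in enumerate(groups, start=1):
        for ch in group:
            sequences[ch] = position * step
    return sequences


# Hypothetical ordering: two single characters, then two shared-weight pairs.
groups = [["="], ["-"], ["a", "A"], ["b", "B"]]
sequences = assign_sequences(groups)

# Print each value as four digits, similar to the Sequence column above.
for ch, value in sequences.items():
    print(f"{value:04d}", ch)
```

In this sketch, a and A both receive 0030, while = and - keep their own values of 0010 and 0020.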