Simulation of EBCDIC Binary Collation

With Db2®, you can collate character strings according to a user-defined collating sequence. You can use this feature to simulate EBCDIC binary collation.

As an example of how to simulate EBCDIC collation, suppose you want to create an ASCII database with code page 850, but you also want the character strings to be collated as if the data actually resides in an EBCDIC database with code page 500. See figures below for the definitions of code page 500 and code page 850.

Consider the relative collation of four characters in a EBCDIC code page 500 database, when they are collated in binary:
Character
Code Page 500 Code Point
'a'
X'81'
'b'
X'82'
'A'
X'C1'
'B'
X'C2'
The code page 500 binary collation sequence (the desired sequence) is:
   'a' < 'b' < 'A' < 'B'
If you create the database with ASCII code page 850, binary collation would yield:
Character
Code Page 850 Code Point
'a'
X'61'
'b'
X'62'
'A'
X'41'
'B'
X'42'
The code page 850 binary collation (which is not the desired sequence) is:
   'A' < 'B' < 'a' < 'b'

To achieve the desired collation, you need to create your database with a user-defined collating sequence. A sample collating sequence for just this purpose is supplied with Db2 in the sqle850a.h include file. The content of sqle850a.h is shown in the following.

Figure 1. User-Defined Collating Sequence - sqle_850_500
#ifndef SQL_H_SQLE850A
#define SQL_H_SQLE850A
 
#ifdef __cplusplus
extern "C" {
#endif
 
unsigned char sqle_850_500[256] = {
0x00,0x01,0x02,0x03,0x37,0x2d,0x2e,0x2f,0x16,0x05,0x25,0x0b,0x0c,0x0d,0x0e,0x0f,
0x10,0x11,0x12,0x13,0x3c,0x3d,0x32,0x26,0x18,0x19,0x3f,0x27,0x1c,0x1d,0x1e,0x1f,
0x40,0x4f,0x7f,0x7b,0x5b,0x6c,0x50,0x7d,0x4d,0x5d,0x5c,0x4e,0x6b,0x60,0x4b,0x61,
0xf0,0xf1,0xf2,0xf3,0xf4,0xf5,0xf6,0xf7,0xf8,0xf9,0x7a,0x5e,0x4c,0x7e,0x6e,0x6f,
0x7c,0xc1,0xc2,0xc3,0xc4,0xc5,0xc6,0xc7,0xc8,0xc9,0xd1,0xd2,0xd3,0xd4,0xd5,0xd6,
0xd7,0xd8,0xd9,0xe2,0xe3,0xe4,0xe5,0xe6,0xe7,0xe8,0xe9,0x4a,0xe0,0x5a,0x5f,0x6d,
0x79,0x81,0x82,0x83,0x84,0x85,0x86,0x87,0x88,0x89,0x91,0x92,0x93,0x94,0x95,0x96,
0x97,0x98,0x99,0xa2,0xa3,0xa4,0xa5,0xa6,0xa7,0xa8,0xa9,0xc0,0xbb,0xd0,0xa1,0x07,
0x68,0xdc,0x51,0x42,0x43,0x44,0x47,0x48,0x52,0x53,0x54,0x57,0x56,0x58,0x63,0x67,
0x71,0x9c,0x9e,0xcb,0xcc,0xcd,0xdb,0xdd,0xdf,0xec,0xfc,0x70,0xb1,0x80,0xbf,0xff,
0x45,0x55,0xce,0xde,0x49,0x69,0x9a,0x9b,0xab,0xaf,0xba,0xb8,0xb7,0xaa,0x8a,0x8b,
0x2b,0x2c,0x09,0x21,0x28,0x65,0x62,0x64,0xb4,0x38,0x31,0x34,0x33,0xb0,0xb2,0x24,
0x22,0x17,0x29,0x06,0x20,0x2a,0x46,0x66,0x1a,0x35,0x08,0x39,0x36,0x30,0x3a,0x9f,
0x8c,0xac,0x72,0x73,0x74,0x0a,0x75,0x76,0x77,0x23,0x15,0x14,0x04,0x6a,0x78,0x3b,
0xee,0x59,0xeb,0xed,0xcf,0xef,0xa0,0x8e,0xae,0xfe,0xfb,0xfd,0x8d,0xad,0xbc,0xbe,
0xca,0x8f,0x1b,0xb9,0xb6,0xb5,0xe1,0x9d,0x90,0xbd,0xb3,0xda,0xfa,0xea,0x3e,0x41
};
#ifdef __cplusplus
}
#endif
 
#endif /* SQL_H_SQLE850A */

To see how to achieve code page 500 binary collation on code page 850 characters, examine the sample collating sequence in sqle_850_500. For each code page 850 character, its weight in the collating sequence is simply its corresponding code point in code page 500.

For example, consider the letter ‘a’. This letter is code point X'61' for code page 850. In the array sqle_850_500, letter ‘a’ is assigned a weight of X'81' (that is, the 98th element in the array sqle_850_500).

Consider how the four characters collate when the database is created with the above sample user-defined collating sequence:
Character
Code Page 850 Code Point / Weight (from sqle_850_500)
'a'
X'61' / X'81'
'b'
X'62' / X'82'
'A'
X'41' / X'C1'
'B'
X'42' / X'C2'
The code page 850 user-defined collation by weight (the desired collation) is:
   'a' < 'b' < 'A' < 'B'

In this example, you achieve the desired collation by specifying the correct weights to simulate the desired behavior.

Closely observing the actual collating sequence, notice that the sequence itself is merely a conversion table, where the source code page is the code page of the database (850) and the target code page is the desired binary collating code page (500). Other sample collating sequences supplied by Db2 enable different conversions. If a conversion table that you require is not supplied with Db2, additional conversion tables can be obtained from the IBM® publication, Character Data Representation Architecture, Reference and Registry, SC09-2190. You will find the additional conversion tables in a CD enclosed with that publication.

Figure 2. Code Page 500
CP00500
Figure 3. Code Page 850
CP00850