DRDA and CDRA support

A distributed relational database might span not only different types of computers, but also computers that are located in different countries or regions.

Identical systems can encode data differently depending on the language used on the system. Different systems encode data differently. For instance, a z/OS® product, an IBM® i product, and a Windows system that are running the DB2® for Linux, UNIX, and Windows licensed program encode numeric data in their own unique formats. In addition, a z/OS and an IBM i product use the EBCDIC encoding scheme to encode character data, while a Windows system that is running DB2 LUW uses an ASCII encoding scheme.

For numeric data, these differences do not matter. Systems of different types that provide Distributed Relational Database Architecture™ (DRDA) support automatically convert any differences between the way a number is represented in one computer system and the way it is represented in another. For example, if an IBM i application program reads numeric data from a Db2® for z/OS database, Db2 for z/OS sends the numeric data in the z/OS format, and the IBM i database management system converts it to IBM i numeric format.
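Byte order is one common way in which numeric representations differ between platforms. The following minimal sketch is not the DRDA implementation; it only illustrates the kind of conversion a receiving system performs. The function name convert_be_to_le is hypothetical, and the example assumes a 4-byte signed integer.

```python
import struct

value = 1000  # 0x000003E8

# The same integer laid out in two byte orders.
big_endian = struct.pack(">i", value)     # most significant byte first
little_endian = struct.pack("<i", value)  # least significant byte first

print(big_endian.hex())     # 000003e8
print(little_endian.hex())  # e8030000


def convert_be_to_le(raw: bytes) -> bytes:
    """Reinterpret a big-endian 4-byte integer in little-endian layout.

    Hypothetical helper: the receiver decodes using the sender's format,
    then re-encodes in its own format. The numeric value is unchanged.
    """
    (number,) = struct.unpack(">i", raw)
    return struct.pack("<i", number)
```

The application on either side sees the value 1000; only the stored bytes differ.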

The handling of character data is more complex, but this too can be managed within a distributed relational database.

Character conversion with CDRA

Not only can there be differences in encoding schemes, such as Extended Binary Coded Decimal Interchange Code (EBCDIC) versus American Standard Code for Information Interchange (ASCII), but there can also be differences related to language.

Systems configured for different languages can assign different characters to the same code, or different codes to the same character. For example, a system configured for U.S. English can assign the same code to the character } that a system configured for the Danish language assigns to å. Conversely, those two systems can assign different codes to the same character, such as $.
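The two cases can be sketched as follows. The U.S. English side uses Python's built-in cp037 codec. The Danish side is a tiny hand-written table, because this example does not rely on a Danish EBCDIC codec being available; its two code points are assumed values for the Danish EBCDIC code page and should be checked against the published code page tables. All function names are hypothetical.

```python
# Partial, illustrative table for Danish EBCDIC (assumed code points).
DANISH_EBCDIC = {0xD0: "å", 0x67: "$"}


def us_char(code: int) -> str:
    """Character that U.S. English EBCDIC (cp037) assigns to a code."""
    return bytes([code]).decode("cp037")


def danish_char(code: int) -> str:
    """Character that the illustrative Danish table assigns to a code."""
    return DANISH_EBCDIC[code]


def us_code(char: str) -> int:
    """Code that U.S. English EBCDIC (cp037) assigns to a character."""
    return char.encode("cp037")[0]


def danish_code(char: str) -> int:
    """Code that the illustrative Danish table assigns to a character."""
    for code, mapped in DANISH_EBCDIC.items():
        if mapped == char:
            return code
    raise KeyError(char)


# Same code, different characters.
print(us_char(0xD0), danish_char(0xD0))

# Same character, different codes.
print(hex(us_code("$")), hex(danish_code("$")))
```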

If data is to be shared across different systems, character data needs to be seen by users and applications the same way. In other words, a Windows user in New York and an IBM i user in Copenhagen both need to see a $ as a $, even though $ might be encoded differently in each system. Furthermore, the user in Copenhagen needs to see a }, if that is the character that was stored in New York, even though the code might be the same as a Danish å. For this to happen, the $ must be converted to the proper character encoding for a Windows system (that is, U.S. English character set, ASCII) when it goes from Copenhagen to New York, and converted back to Danish encoding (that is, Danish character set, EBCDIC) when it goes from New York to Copenhagen. This sort of character conversion is provided by IBM i as well as the other IBM distributed relational database managers. The conversion is done in a consistent way in accordance with the Character Data Representation Architecture (CDRA).
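The principle, preserve the character and change the code, can be sketched with Python's standard codecs. This is an illustration only, not how the database managers implement conversion. U.S. English EBCDIC (cp037) stands in for the EBCDIC side because that codec ships with Python, and the function name convert is hypothetical.

```python
def convert(data: bytes, source: str, target: str) -> bytes:
    """Re-encode character data so that the characters are preserved.

    The bytes are first interpreted with the sender's encoding, then
    written out with the receiver's encoding.
    """
    text = data.decode(source)
    return text.encode(target)


# "$ }" as stored on an EBCDIC system (cp037).
ebcdic_bytes = "$ }".encode("cp037")

# Converted for an ASCII system: different bytes, same characters.
ascii_bytes = convert(ebcdic_bytes, "cp037", "ascii")

# Converted back again for the EBCDIC system.
round_trip = convert(ascii_bytes, "ascii", "cp037")

print(ebcdic_bytes.hex(), ascii_bytes.hex(), round_trip == ebcdic_bytes)
```

Sending the raw bytes without conversion would show the ASCII user different characters, which is the problem that the conversion avoids.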

CDRA specifies the way to identify the attributes of character data so that the data can be understood across systems, even if the systems use different character sets and encoding schemes. For conversion to happen across systems, each system must understand the attributes of the character data it is receiving from the other system. CDRA specifies that these attributes be identified through a coded character set identifier (CCSID). All character data in DB2 for z/OS, DB2 for VM, and the IBM i database management systems has a CCSID, which indicates a specific combination of encoding scheme, character set, and code page. All character data in an Extended Services environment has only a code page (but the other database managers treat that code page identification as a CCSID). A code page is a specific set of assignments between characters and internal codes.

For example, CCSID 37 means encoding scheme 4352 (EBCDIC), character set 697 (Latin, single-byte characters), and code page 37 (USA/Canada country extended code page). CCSID 5026 means encoding scheme 4865 (extended EBCDIC), character set 1172 with code page 290 (single-byte character set for Katakana/Kanji), and character set 370 with code page 300 (double-byte character set for Katakana/Kanji).
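The two CCSIDs above can be modeled as a small data structure. This is a sketch for illustration; the class name CCSID, its field names, and the helper code_pages are hypothetical and are not part of CDRA or of any IBM API. The numeric values are the ones given in the preceding paragraph.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CCSID:
    """One CCSID: an encoding scheme plus (character set, code page) pairs."""
    ccsid: int
    encoding_scheme: int
    components: Tuple[Tuple[int, int], ...]  # (character set, code page)


# Single-byte EBCDIC, USA/Canada.
CCSID_37 = CCSID(ccsid=37, encoding_scheme=4352, components=((697, 37),))

# Mixed single-byte and double-byte extended EBCDIC, Katakana/Kanji.
CCSID_5026 = CCSID(
    ccsid=5026,
    encoding_scheme=4865,
    components=((1172, 290), (370, 300)),
)


def code_pages(entry: CCSID) -> List[int]:
    """Return the code pages that a CCSID combines."""
    return [code_page for _, code_page in entry.components]


print(code_pages(CCSID_37))    # [37]
print(code_pages(CCSID_5026))  # [290, 300]
```

A mixed CCSID such as 5026 carries more than one character set and code page pair, which is why the pairs are stored as a sequence rather than as single fields.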

DRDA-enabled systems include mechanisms to convert character data between a wide range of CCSID-to-CCSID pairs and CCSID-to-code page pairs. Character conversion for many CCSIDs and code pages is already built into these products. For more information about CCSIDs supported by IBM i, see the IBM i globalization topic.