Unicode CCSIDs
DB2® for z/OS® uses CCSIDs 367, 1200, and 1208 for Unicode data.
- 367
- DB2 uses ASCII CCSID 367
for single-byte character data (SBCS) because the first 128 code points
in Unicode UTF-8 are the same as those in ASCII CCSID 367.
Therefore, DB2 uses CCSID 367 for CHAR, VARCHAR, and CLOB columns that are defined with FOR SBCS DATA in Unicode tables.
- 1208
- DB2 uses CCSID 1208 for
Unicode UTF-8 data, which DB2 always
considers to be mixed data. This CCSID is the default CCSID value
for Unicode tables.
Therefore, DB2 uses CCSID 1208 for CHAR, VARCHAR, and CLOB columns that are defined with FOR MIXED DATA in Unicode tables. FOR MIXED DATA is the default subtype specification.
- 1200
- DB2 uses CCSID 1200 for
Unicode UTF-16 data, which DB2 treats as double-byte data (DBCS).
Therefore, DB2 uses CCSID 1200 for GRAPHIC, VARGRAPHIC, and DBCLOB columns in Unicode tables.
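The relationship among the three CCSIDs can be illustrated outside DB2. The following Python sketch is only an approximation: it assumes that the Python codecs `ascii`, `utf-8`, and `utf-16-be` are reasonable stand-ins for CCSIDs 367, 1208, and 1200, and the `encoded_lengths` helper is a hypothetical name. It checks that the first 128 code points encode identically in ASCII and UTF-8, and shows how many bytes the same string needs under each encoding.

```python
# Python codec names used as stand-ins for the DB2 CCSIDs. This mapping is an
# assumption for illustration; DB2 performs its own conversions.
CODEC_FOR_CCSID = {
    367: "ascii",       # SBCS: 7-bit ASCII
    1208: "utf-8",      # MIXED: one to four bytes per character
    1200: "utf-16-be",  # DBCS: two-byte code units, big-endian assumed
}


def encoded_lengths(text):
    """Return {ccsid: byte length}, with None where the text cannot be encoded."""
    lengths = {}
    for ccsid, codec in CODEC_FOR_CCSID.items():
        try:
            lengths[ccsid] = len(text.encode(codec))
        except UnicodeEncodeError:
            lengths[ccsid] = None
    return lengths


# The first 128 code points encode to identical bytes in ASCII and UTF-8.
all_ascii = "".join(chr(cp) for cp in range(128))
ascii_matches_utf8 = all_ascii.encode("ascii") == all_ascii.encode("utf-8")

print(ascii_matches_utf8)
print(encoded_lengths("DB2"))
print(encoded_lengths("K\u00f6ln"))  # contains one non-ASCII character
```

A string that contains only the first 128 code points occupies the same number of bytes under the `ascii` and `utf-8` codecs, which is why an SBCS subtype is sufficient for such data. A string with any other character cannot be encoded as ASCII at all in this sketch.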
CCSIDs usually refer to a code page at a particular point in time. However, the Unicode CCSIDs that DB2 for z/OS uses are an exception. They can expand to include more characters as the Unicode standard grows. For example, CCSID 1200 can include the characters from the Unicode standard code pages 13488 (Unicode 2.0) and 17584 (Unicode 3.0). You can determine the CCSID for each Unicode standard code page by looking at the list of registered CCSIDs.
Because DB2 uses this architecture for CCSIDs, you can migrate to a new version of the Unicode standard by updating your conversion image. The disadvantage of this architecture is that the CCSID value alone does not tell you which characters are supported. To check which version of the Unicode standard is currently supported for a particular conversion, issue the system DISPLAY UNI command. The command returns output similar to the following example:
CUN3000I 09.33.37 UNI DISPLAY 754
ENVIRONMENT: CREATED 01/25/2010 AT 00.20.12
             MODIFIED 01/25/2010 AT 00.25.10
             IMAGE CREATED --/--/---- AT --.--.--
SERVICE: CHARACTER CASE NORMALIZATION COLLATION
         STRINGPREP BIDI CONVERSION INF
STORAGE: ACTIVE 1995 PAGES
         FIXED 0 PAGES
         LIMIT 524288 PAGES
CASECONV: ENABLED
CASE VER: UNI300 NORMAL SPECIAL LOCALE
NORMALIZE: DISABLED
NORM VER: NONE
COLLATE: DISABLED
COLL RULES: NONE
STRPROFILES: NONE
CONVERSION: 00367-05123-R 00437-00819-R
            00273-01208-R 01140-01252-E
            01140-01252-R 00437-00850-E
            00437-00850-R 01200(17584)-01140-E
            01200(17584)-01141-E 01200(17584)-01142-E
            01200(17584)-01144-E 00273-01252-E
            00273-01252-R 01200(17584)-01148-E
            00367-05210-R 00850-01200(13488)-R
            01142-00367-E 00836-00367-E
            01386-00836-R 01148-01200(17584)-R
            01386-00935-RE 00437-01140-E
            00437-01140-R 00437-01148-E
            00437-01148-R 00437-01208-R
            ...
The CONVERSION section of this output lists all
of the CCSID conversions that are defined. For example, the entry 01200(17584)-01141-E defines
the conversion from CCSID 1200 to CCSID 1141. DB2 uses CCSID 1200 for Unicode UTF-16 data.
The number in parentheses after CCSID 1200, 17584, means that in this
conversion, CCSID 1200 uses Unicode standard 3.0. In the entry 00850-01200(13488)-R,
CCSID 1200 is followed by a different number, 13488. For this conversion,
CCSID 1200 uses Unicode standard 2.0. The letters E and R at the end of each entry represent
the type of conversion. E means that the conversion is an enforced
subset conversion. R means that the conversion is a round-trip conversion.
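If you capture the DISPLAY UNI output in a file, the entries in the CONVERSION section can be decoded programmatically. The following Python sketch is one possible approach, not part of DB2 or z/OS. The `parse_conversions` function and both lookup tables are hypothetical names. The sketch assumes that the first CCSID in an entry is the source and the second is the target, and that an entry with several letters, such as RE, lists more than one conversion type. Only the two parenthesized values that are described above are mapped to a Unicode standard version.

```python
import re

# One entry, such as "01200(17584)-01141-E": a five-digit CCSID with an
# optional parenthesized value, a second CCSID of the same form, and one or
# more conversion-type letters.
ENTRY = re.compile(
    r"(?P<src>\d{5})(?:\((?P<src_sub>\d{5})\))?"
    r"-(?P<dst>\d{5})(?:\((?P<dst_sub>\d{5})\))?"
    r"-(?P<tech>[A-Z]+)"
)

# Only the values that the text explains; anything else maps to None.
UNICODE_VERSION = {13488: "2.0", 17584: "3.0"}

# Only the letters that the text explains; other letters are kept as-is.
CONVERSION_TYPE = {"E": "enforced subset", "R": "round-trip"}


def parse_conversions(text):
    """Return one dict per conversion entry found in the captured output."""
    entries = []
    for m in ENTRY.finditer(text):
        src_sub = int(m["src_sub"]) if m["src_sub"] else None
        dst_sub = int(m["dst_sub"]) if m["dst_sub"] else None
        entries.append({
            "from": int(m["src"]),
            "from_unicode": UNICODE_VERSION.get(src_sub),
            "to": int(m["dst"]),
            "to_unicode": UNICODE_VERSION.get(dst_sub),
            "types": [CONVERSION_TYPE.get(c, c) for c in m["tech"]],
        })
    return entries


sample = """CONVERSION: 00367-05123-R 00437-00819-R
01200(17584)-01141-E 00850-01200(13488)-R
01386-00935-RE"""

for entry in parse_conversions(sample):
    print(entry)
```

With this sketch, every entry that involves CCSID 1200 reports the Unicode standard version in `from_unicode` or `to_unicode`, which makes it easier to spot conversions that still use an older version of the standard.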