Storage of character data

Use the table below to compare alphanumeric (DISPLAY), DBCS (DISPLAY-1), and Unicode (NATIONAL) encoding and to plan storage usage.

Table 1. Encoding and size of alphanumeric, DBCS, and national data
Characteristic                         DISPLAY                            DISPLAY-1                       NATIONAL
Character encoding unit                1 byte                             2 bytes                         2 bytes
Code page                              ASCII, EUC, UTF-8, or EBCDIC (3)   ASCII DBCS or EBCDIC DBCS (3)   UTF-16LE (1)
Encoding units per graphic character   1                                  1                               1 or 2 (2)
Bytes per graphic character            1 byte                             2 bytes                         2 or 4 bytes
  1. National literals in your source program are converted to UTF-16 for use at run time.
  2. Most characters are represented in UTF-16 using one encoding unit. In particular, the following characters are represented using a single UTF-16 encoding unit per character:
    • COBOL characters A-Z, a-z, 0-9, space, + - * / = $ , ; . " ( ) > < : '
    • All characters that are converted from an EBCDIC, ASCII, or EUC code page
  3. The code page depends on the locale, the CHAR(NATIVE) or CHAR(EBCDIC) option, and the EBCDIC_CODEPAGE environment variable setting.
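The sizes in the table and notes can be checked outside COBOL. The following Python sketch is illustrative only and is not part of the compiler or its runtime: the helper names `utf16_units` and `storage_bytes` are hypothetical, and `storage_bytes` assumes that a data item's length is counted in encoding units, using the unit sizes from Table 1.

```python
# Bytes per encoding unit, taken from Table 1.
UNIT_BYTES = {"DISPLAY": 1, "DISPLAY-1": 2, "NATIONAL": 2}


def utf16_units(ch: str) -> int:
    """Return how many 2-byte UTF-16 encoding units one character needs."""
    return len(ch.encode("utf-16-le")) // 2


def storage_bytes(usage: str, length: int) -> int:
    """Estimate storage for an item of `length` encoding units (assumption)."""
    return UNIT_BYTES[usage] * length


if __name__ == "__main__":
    # Basic Latin, a digit, an accented letter, a CJK ideograph, and a
    # supplementary-plane character that needs a surrogate pair.
    for ch in ["A", "9", "\u00e9", "\u65e5", "\U00020000"]:
        units = utf16_units(ch)
        print(f"U+{ord(ch):04X}: {units} encoding unit(s), {units * 2} bytes")

    for usage in UNIT_BYTES:
        print(f"{usage}: 10 encoding units occupy {storage_bytes(usage, 10)} bytes")
```

Only the last sample character, which lies outside the Basic Multilingual Plane, takes two encoding units (4 bytes); the others take one unit (2 bytes), matching note 2.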

Related references  
CHAR