Storage of character data

Use the table below to compare alphanumeric (DISPLAY), DBCS (DISPLAY-1), and Unicode (NATIONAL) encoding and to plan storage usage.

Table 1. Encoding and size of alphanumeric, DBCS, and national data
Characteristic                          DISPLAY   DISPLAY-1     NATIONAL
Character encoding unit                 1 byte    2 bytes       2 bytes
Code page                               EBCDIC    EBCDIC DBCS   UTF-16BE¹
Encoding units per graphic character    1         1             1 or 2²
Bytes per graphic character             1 byte    2 bytes       2 or 4 bytes
  1. Use the CODEPAGE compiler option to specify the EBCDIC code page that is applicable to alphanumeric or DBCS data.
  2. Most characters are represented in UTF-16 using one encoding unit. In particular, the following characters are represented using a single UTF-16 encoding unit per character:
    • COBOL characters A-Z, a-z, 0-9, space, + - * / = $ , ; . " ( ) > < : '
    • All characters that are converted from an EBCDIC or ASCII code page
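
The sizes in Table 1 can be checked with a short sketch. This is an illustration in Python, not COBOL, and it rests on assumptions: the helper names `storage_bytes` and `utf16_units` are hypothetical, `cp037` is used only as one example of an EBCDIC code page (the code page actually in effect is whatever CODEPAGE specifies), and the DISPLAY-1 figure is computed from the 2-bytes-per-character rule in the table rather than from a real DBCS conversion.

```python
def utf16_units(ch: str) -> int:
    """Number of 2-byte UTF-16 encoding units needed for one character."""
    return len(ch.encode("utf-16-be")) // 2


def storage_bytes(text: str, usage: str) -> int:
    """Bytes needed to hold `text` under the given usage, per Table 1."""
    if usage == "DISPLAY":
        # Assumption: cp037 stands in for the EBCDIC code page in effect.
        return len(text.encode("cp037"))
    if usage == "DISPLAY-1":
        # Taken from the table (2 bytes per DBCS character), not a real
        # EBCDIC DBCS conversion.
        return 2 * len(text)
    if usage == "NATIONAL":
        # UTF-16BE: 2 bytes per encoding unit, 1 or 2 units per character.
        return len(text.encode("utf-16-be"))
    raise ValueError(f"unknown usage: {usage}")


if __name__ == "__main__":
    word = "COBOL"
    for usage in ("DISPLAY", "DISPLAY-1", "NATIONAL"):
        print(usage, storage_bytes(word, usage))

    # U+1D11E (musical symbol G clef) lies outside the Basic Multilingual
    # Plane, so UTF-16 represents it with two encoding units (4 bytes).
    clef = "\U0001D11E"
    print(utf16_units("A"), utf16_units(clef))
```

For the five-character word in the sketch, the NATIONAL size is twice the DISPLAY size because every character fits in one UTF-16 encoding unit. Only characters that need two encoding units, such as the one in the last lines, take 4 bytes each.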