CODEPAGE

Use CODEPAGE to specify the coded character set identifier (CCSID) for an EBCDIC code page for processing compile-time and runtime COBOL operations that are sensitive to character encoding.

CODEPAGE option syntax

CODEPAGE(ccsid)

Default is: CODEPAGE(1140)

Abbreviations are: CP(ccsid)

ccsid must be an integer that represents a valid CCSID for an EBCDIC code page.

The default CCSID 1140 is the equivalent of CCSID 37 (COM EUROPE EBCDIC), but additionally includes the euro symbol.
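The relationship between CCSID 37 and CCSID 1140 can be checked outside COBOL. The following is a minimal sketch in Python, assuming that Python's standard cp037 and cp1140 codecs are faithful stand-ins for CCSID 37 and CCSID 1140. It is an illustration only and not part of the compiler.

```python
# Assumption: Python's "cp037" and "cp1140" codecs model CCSID 37 and CCSID 1140.
# Collect every single-byte value that the two code pages decode differently.
differing = [
    byte
    for byte in range(256)
    if bytes([byte]).decode("cp037") != bytes([byte]).decode("cp1140")
]

for byte in differing:
    in_37 = bytes([byte]).decode("cp037")
    in_1140 = bytes([byte]).decode("cp1140")
    print(f"X'{byte:02X}': CCSID 37 -> {in_37!r}, CCSID 1140 -> {in_1140!r}")
```

Under that assumption, the only code point reported is X'9F', which holds the international currency sign in CCSID 37 and the euro symbol in CCSID 1140.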

ccsid specifies these encodings:
  • The encoding for alphanumeric, national, and DBCS literals in a COBOL source program
  • The default encoding of the content of alphanumeric and DBCS data items at run time
  • The encoding for DBCS user-defined words when processed by an XML GENERATE statement to create XML element and attribute names
  • The default encoding of an XML document created by an XML GENERATE statement if the receiving data item for the document is alphanumeric
  • The default encoding assumed for an XML document in an alphanumeric data item when the document is processed by an XML PARSE statement
The CODEPAGE ccsid is used when code-page-sensitive operations are performed at compile time or run time and no explicit CCSID is specified to override the default code page. Such operations include:
  • Conversion of literal values to Unicode
  • Conversion of alphanumeric data to and from national (Unicode) data as part of move operations, comparison, or the intrinsic functions DISPLAY-OF and NATIONAL-OF
  • Object-oriented language constructs such as INVOKE statements, class definitions, and method definitions
  • XML parsing
  • XML generation
  • Processing of DBCS names as part of XML generation at run time
  • Processing of SQL string host variables if the SQLCCSID option is in effect
  • Processing of source code for EXEC SQL statements
  • Processing of source code for EXEC SQLIMS statements
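The conversion of alphanumeric data to and from national data listed above can be pictured as a decode from the EBCDIC code page followed by an encode to UTF-16 big-endian, and the reverse. The following Python sketch is an analogy only. The function names national_of and display_of are hypothetical helpers named after the COBOL intrinsic functions, and the cp1140 and utf-16-be codecs are assumed stand-ins for CODEPAGE(1140) and CCSID 1200.

```python
EBCDIC_CODEC = "cp1140"       # assumed stand-in for CODEPAGE(1140)
NATIONAL_CODEC = "utf-16-be"  # assumed stand-in for CCSID 1200

def national_of(alnum: bytes, codec: str = EBCDIC_CODEC) -> bytes:
    """Hypothetical analogue of NATIONAL-OF: EBCDIC bytes to UTF-16BE bytes."""
    return alnum.decode(codec).encode(NATIONAL_CODEC)

def display_of(national: bytes, codec: str = EBCDIC_CODEC) -> bytes:
    """Hypothetical analogue of DISPLAY-OF: UTF-16BE bytes to EBCDIC bytes."""
    return national.decode(NATIONAL_CODEC).encode(codec)

alnum = bytes.fromhex("C1C2C3")    # the characters ABC in EBCDIC
national = national_of(alnum)      # the same characters in UTF-16BE
round_trip = display_of(national)  # converted back to EBCDIC

print(national.hex().upper())
print(round_trip.hex().upper())
```

The default codec argument plays the role of the CODEPAGE ccsid: it applies whenever the caller does not name a code page.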

However, the encoding of the following items in a COBOL source program is not affected by the CODEPAGE compiler option:

  • Data items that have USAGE NATIONAL

    These items are always encoded in UTF-16 in big-endian format, CCSID 1200.

  • Characters from the basic COBOL character set (see the table of these characters in the Characters related reference below)

    Although the encoding of some basic COBOL characters (the default currency sign ($), the quotation mark ("), and the lowercase Latin letters) varies among EBCDIC code pages, the compiler always interprets these characters using the EBCDIC code page 1140 encoding. In particular, the default currency sign is always the character with value X'5B' (unless changed by the CURRENCY compiler option or the CURRENCY SIGN clause in the SPECIAL-NAMES paragraph), and the quotation mark is always the character with value X'7F'.
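The fixed code points named above can be confirmed with a short Python sketch, assuming that Python's cp1140 codec matches EBCDIC code page 1140. The labels in the dictionary are illustrative names only.

```python
# Assumption: Python's "cp1140" codec models EBCDIC code page 1140.
basic_characters = {
    "default currency sign": "$",
    "quotation mark": '"',
}

# Map each label to the hexadecimal value of the character in code page 1140.
code_points = {
    label: char.encode("cp1140").hex().upper()
    for label, char in basic_characters.items()
}

for label, value in code_points.items():
    print(f"{label}: X'{value}'")
```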

Some COBOL operations can override the CODEPAGE ccsid by using an explicit encoding specification, for example:
  • DISPLAY-OF and NATIONAL-OF intrinsic functions that specify a code page as the second argument
  • XML PARSE statements that specify the WITH ENCODING phrase
  • XML GENERATE statements that specify the WITH ENCODING phrase
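The effect of an explicit encoding specification can be sketched as an optional argument that takes precedence over a default. The following Python example is an analogy under stated assumptions: to_text is a hypothetical helper, DEFAULT_CODEC stands in for the CODEPAGE ccsid in effect, and the cp1140 and cp037 codecs stand in for CCSIDs 1140 and 37.

```python
from typing import Optional

DEFAULT_CODEC = "cp1140"  # assumed stand-in for the CODEPAGE ccsid in effect

def to_text(data: bytes, explicit_codec: Optional[str] = None) -> str:
    """Decode EBCDIC bytes, preferring an explicit code page over the default."""
    return data.decode(explicit_codec or DEFAULT_CODEC)

data = bytes.fromhex("F1F09F")       # digits 1 and 0 followed by X'9F'

by_default = to_text(data)           # interpreted with the default code page
overridden = to_text(data, "cp037")  # interpreted with an explicit code page

print(repr(by_default), repr(overridden))
```

The same three bytes yield a euro symbol under the default and an international currency sign under the explicit code page, which is why an operation that names its own encoding can produce different results from one that relies on the CODEPAGE ccsid.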
Additionally, you can use the CURRENCY compiler option or the CURRENCY SIGN clause in the SPECIAL-NAMES paragraph to override:
  • The default currency symbol used in the PICTURE character-strings for numeric-edited data items in your source program
  • The currency sign value used in the content of numeric-edited data items at run time

DBCS code pages:

Compile your COBOL program using the CODEPAGE option with the ccsid set to one of the EBCDIC multibyte character set (MBCS) CCSIDs shown in the table below if the program contains any of the following items:
  • User-defined words formed with DBCS characters
  • DBCS (USAGE DISPLAY-1) data items
  • DBCS literals

All of the CCSIDs in the table below identify mixed code pages that refer to a combination of SBCS and DBCS coded character sets. These are also the CCSIDs that are supported for mixed data by DB2®.

Table 1. EBCDIC multibyte coded character set identifiers

National language                     MBCS CCSID   SBCS CCSID component   DBCS CCSID component
Japanese (Katakana-Kanji)             930          290                    300
Japanese (Katakana-Kanji with euro)   1390         8482                   16684
Japanese (Katakana-Kanji)             5026         290                    4396
Japanese (Latin-Kanji)                939          1027                   300
Japanese (Latin-Kanji with euro)      1399         5123                   16684
Japanese (Latin-Kanji)                5035         1027                   4396
Korean                                933          833                    834
Korean                                1364         13121                  4930
Simplified Chinese                    935          836                    837
Simplified Chinese                    1388         13124                  4933
Traditional Chinese                   937          28709                  835

Note: If you specify the TEST option, you must set the CODEPAGE option to the CCSID that is used for the COBOL source program. In particular, programs that use Japanese characters in DBCS literals or DBCS user-defined words must be compiled with the CODEPAGE option set to a Japanese code page CCSID.

related concepts  
COBOL and DB2 CCSID determination

related references  
CURRENCY
SQLCCSID
TEST
The encoding of XML documents
Characters (Enterprise COBOL for z/OS® Language Reference)