Specifying the encoding

You can choose how to specify the encoding for parsing an XML document that is in an alphanumeric data item.

About this task

The preferred way is to omit the encoding declaration from the document and to specify the encoding using one of the following means:

  • If XMLPARSE(XMLSS) is in effect: the ENCODING phrase of the XML PARSE statement, or the CODEPAGE compiler option
  • If XMLPARSE(COMPAT) is in effect: the CODEPAGE compiler option

Omitting the encoding declaration makes it easier to transmit an XML document between heterogeneous systems. (If you included an encoding declaration, you would need to update it to reflect any code-page translation imposed by the transmission process.)
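
The effect can be illustrated outside COBOL with a short Python sketch. It assumes a transfer step that translates the document from CCSID 1140 to UTF-8, modeled here with Python's cp1140 codec. The helper name transcode and both sample documents are hypothetical and exist only for this illustration.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Model a transfer step that translates a document between code pages."""
    return data.decode(source).encode(target)


# Two made-up documents as they might sit in an EBCDIC (CCSID 1140) data item.
with_decl = '<?xml version="1.0" encoding="ibm-1140"?><a>x</a>'.encode("cp1140")
without_decl = '<?xml version="1.0"?><a>x</a>'.encode("cp1140")

# After the transfer both documents are UTF-8 bytes.
moved_with = transcode(with_decl, "cp1140", "utf-8")
moved_without = transcode(without_decl, "cp1140", "utf-8")

# The declaration text travels unchanged, so it still names the old code page
# and would have to be edited by hand to match the new encoding.
declaration_is_stale = b'encoding="ibm-1140"' in moved_with

# The document without a declaration carries nothing that can go stale.
has_nothing_to_update = b"encoding" not in moved_without
```

In the second case, the receiving program supplies the encoding externally, which is the role that the ENCODING phrase or the CODEPAGE compiler option plays in the COBOL setting described above.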

For XMLPARSE(COMPAT):

You can instead specify an encoding declaration in the XML declaration with which most XML documents begin. For example:


<?xml version="1.0" encoding="ibm-1140"?>

Note that the XML parser generates an exception if it encounters an XML declaration that does not begin in the first byte of an XML document.

If you specify an encoding declaration, do so in one of the following ways:
  • Specify the CCSID number (with or without any number of leading zeros) prefixed by one of the following strings in any mixture of uppercase and lowercase letters:
    • IBM-
    • IBM_
    • CCSID-
    • CCSID_
  • Use one of the aliases listed in the following table. You can code the aliases in any mixture of uppercase and lowercase letters.

Table 1. Aliases for XML encoding declarations

CCSID   Supported aliases
037     EBCDIC-CP-US, EBCDIC-CP-CA, EBCDIC-CP-WT, EBCDIC-CP-NL
500     EBCDIC-CP-BE, EBCDIC-CP-CH
1200    UTF-16
1208    UTF-8
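
The naming rules above can be sketched as a small lookup routine. This Python sketch is illustrative only and is not part of the compiler or the XML parser. The function name ccsid_from_encoding_name is hypothetical, and the sketch does not check whether the resulting CCSID is one that is supported for XML parsing.

```python
import re

# Aliases from Table 1, stored in uppercase for case-insensitive lookup.
ALIASES = {
    "EBCDIC-CP-US": 37,
    "EBCDIC-CP-CA": 37,
    "EBCDIC-CP-WT": 37,
    "EBCDIC-CP-NL": 37,
    "EBCDIC-CP-BE": 500,
    "EBCDIC-CP-CH": 500,
    "UTF-16": 1200,
    "UTF-8": 1208,
}

# One of the prefixes IBM-, IBM_, CCSID-, CCSID_ in any letter case,
# followed by the CCSID number with any number of leading zeros.
PREFIXED = re.compile(r"(?:IBM|CCSID)[-_]([0-9]+)", re.IGNORECASE)


def ccsid_from_encoding_name(name):
    """Return the CCSID that an encoding declaration names, or None."""
    match = PREFIXED.fullmatch(name)
    if match:
        return int(match.group(1))  # int() discards leading zeros
    return ALIASES.get(name.upper())
```

For example, under these assumptions ccsid_from_encoding_name("ibm-1140") and ccsid_from_encoding_name("CCSID_001140") both yield 1140, and a name that follows neither rule, such as "latin1", yields None.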

For more information about the CCSIDs that are supported for XML parsing, see the related reference about the encoding of XML documents.

Related concepts
XML input document encoding

Related references
The encoding of XML documents