The encoding of XML documents

XML documents must be encoded in a supported code page.

XML documents generated in or parsed from national data items must be encoded in Unicode UTF-16 in big-endian format, CCSID 1200.

For XML GENERATE statements, documents generated in alphanumeric data items must be encoded in Unicode UTF-8 (CCSID 1208) or one of the single-byte EBCDIC encodings listed in the table below. You can use any CCSID from that table in the ENCODING phrase of the XML GENERATE statement.
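As a minimal sketch of the ENCODING phrase on XML GENERATE, the following program generates a UTF-8 document into an alphanumeric data item. The program name, the CUSTOMER record, and the receiver size are hypothetical and chosen only for illustration; any CCSID from the table could be used in place of 1208.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GENUTF8.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical source record; names are illustrative only.
       01  CUSTOMER.
           05  CUST-NAME   PIC X(20) VALUE 'Example'.
           05  CUST-ID     PIC 9(5)  VALUE 42.
      * Alphanumeric receiver: UTF-8 or a listed EBCDIC CCSID.
       01  XML-DOC         PIC X(500).
       01  XML-LEN         PIC 9(9) COMP.
       PROCEDURE DIVISION.
      * Request UTF-8 (CCSID 1208) for the generated document.
           XML GENERATE XML-DOC FROM CUSTOMER
               COUNT IN XML-LEN
               WITH ENCODING 1208
               ON EXCEPTION
                   DISPLAY 'XML GENERATE failed, XML-CODE=' XML-CODE
               NOT ON EXCEPTION
                   DISPLAY 'Generated length: ' XML-LEN
           END-XML
           GOBACK.
```

If the receiver were a national data item (PIC N) instead, the document would have to be UTF-16 (CCSID 1200), as described above.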

For XML PARSE statements, documents in alphanumeric data items must be encoded as follows:

  • If XMLPARSE(XMLSS) is in effect:
    • If the RETURNING NATIONAL phrase is specified in the XML PARSE statement, in any EBCDIC or ASCII encoding that is supported by z/OS® Unicode Services for conversion to UTF-16
    • If the RETURNING NATIONAL phrase is not specified in the XML PARSE statement, in UTF-8 (CCSID 1208) or one of the single-byte EBCDIC encodings listed in the table below
  • If XMLPARSE(COMPAT) is in effect: in one of the single-byte EBCDIC encodings listed in the table below

If XMLPARSE(XMLSS) is in effect, you can use any supported CCSID (as described above for XML PARSE) in the ENCODING phrase of the XML PARSE statement.
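The following sketch shows one way the ENCODING phrase might be used on XML PARSE when XMLPARSE(XMLSS) is in effect. The program name, the sample document, and the HANDLE-EVENT paragraph are hypothetical; CCSID 1140 is used on the assumption that the literal is compiled in that code page.

```cobol
       CBL XMLPARSE(XMLSS)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PARSEDOC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical document held in an alphanumeric data item.
       01  XML-DOC         PIC X(100) VALUE
           '<?xml version="1.0"?><msg>hello</msg>'.
       PROCEDURE DIVISION.
      * RETURNING NATIONAL is not specified, so the CCSID must be
      * 1208 or one of the single-byte EBCDIC CCSIDs in the table.
           XML PARSE XML-DOC
               WITH ENCODING 1140
               PROCESSING PROCEDURE HANDLE-EVENT
               ON EXCEPTION
                   DISPLAY 'XML PARSE failed, XML-CODE=' XML-CODE
           END-XML
           GOBACK.
      * Processing procedure: receives control for each XML event.
       HANDLE-EVENT.
           DISPLAY 'Event: ' XML-EVENT.
```

Adding RETURNING NATIONAL to the statement would widen the choice of CCSID to any EBCDIC or ASCII encoding that z/OS Unicode Services can convert to UTF-16.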

Table 1. Coded character sets for XML documents

CCSID       Description
1208        UTF-8 (see note 1)
1047        Latin 1 / Open Systems
1140, 37    USA, Canada, . . . Euro Country Extended Code Page (ECECP), Country Extended Code Page (CECP)
1141, 273   Austria, Germany ECECP, CECP
1142, 277   Denmark, Norway ECECP, CECP
1143, 278   Finland, Sweden ECECP, CECP
1144, 280   Italy ECECP, CECP
1145, 284   Spain, Latin America (Spanish) ECECP, CECP
1146, 285   UK ECECP, CECP
1147, 297   France ECECP, CECP
1148, 500   International ECECP, CECP
1149, 871   Iceland ECECP, CECP

Note:
  1. Supported for the XML PARSE statement in the ENCODING phrase if XMLPARSE(XMLSS) is in effect

related concepts
XML input document encoding

related references
CODEPAGE
XMLPARSE (compiler option)