Troubleshooting
Problem
This document provides information about why COBOL XML PARSE does not support UTF-8.
Resolving The Problem
Why does COBOL XML PARSE not support UTF-8?
Symptom: When trying to parse a file in UTF-8, the user receives error 318 stating that UTF-8 is not supported.
The XML parser used by COBOL on IBM iSeries is the same parser used by COBOL on IBM zSeries, and this parser does not support UTF-8. However, most XML documents that declare UTF-8 encoding in fact use only the 7-bit ASCII subset of UTF-8, which is byte-for-byte identical to ASCII CCSID 819 (ISO 8859-1), and this CCSID is supported by the XML parser. The XML parser also supports Unicode UCS-2 (CCSID 13488), but this is needed only if the XML contains characters that are not available in CCSID 819.
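Before transferring a document, it can help to check which of these cases it falls into. The following Python sketch is not part of the IBM tooling; the function name classify_utf8_xml and its return labels are hypothetical, and Python's latin-1 codec is assumed to be an adequate stand-in for CCSID 819.

```python
def classify_utf8_xml(data: bytes) -> str:
    """Classify a UTF-8 XML document by the smallest character set it needs.

    Returns one of three hypothetical labels:
      "ascii"   - only 7-bit ASCII; the bytes are already valid CCSID 819
      "latin-1" - fits in CCSID 819, but the bytes must be transcoded first
      "ucs-2"   - contains characters outside CCSID 819
    """
    # "utf-8-sig" decodes UTF-8 and drops a leading byte order mark if present.
    text = data.decode("utf-8-sig")
    if text.isascii():
        return "ascii"
    try:
        # Assumption: Python's latin-1 (ISO 8859-1) codec approximates CCSID 819.
        text.encode("latin-1")
        return "latin-1"
    except UnicodeEncodeError:
        return "ucs-2"


if __name__ == "__main__":
    sample = '<?xml version="1.0" encoding="UTF-8"?><a>x</a>'.encode("utf-8")
    print(classify_utf8_xml(sample))
```

A result of "ucs-2" suggests the document needs the UCS-2 (CCSID 13488) route rather than a single-byte CCSID.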
The following steps can be used to send a UTF-8 XML document to iSeries for use with COBOL's XML PARSE (assuming that the CCSID of the job the COBOL program is running in is 37):
1. FTP the XML document to your iSeries Integrated File System directory.
2. On the operating system command line, type the following command and press the Enter key:
   CPY OBJ('mydir/myxml.xml') TOOBJ('mydir/myxml37.xml') TOCCSID(37) DTAFMT(*TEXT)
3. Edit myxml37.xml and change the encoding at the top of the document to encoding="ibm-37".
Depending on the contents of the XML document and how the XML document has been formatted, it might also be necessary to remove characters preceding the '<' at the start of each XML record. It might also be necessary to ensure that each line in the Integrated File System XML document ends only with a CR (Carriage Return) and not with a LF (line feed).
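For readers who prefer to prepare the document before it reaches the system, the conversion and the encoding change can be approximated off-platform. The Python sketch below is an illustration only, not the CPY command's behavior: the function name utf8_xml_to_ccsid37 is hypothetical, Python's cp037 codec is assumed to match CCSID 37 closely enough for the document's characters, and only characters preceding the first '<' of the document (such as a byte order mark) are removed. Line endings are left untouched.

```python
import re


def utf8_xml_to_ccsid37(data: bytes) -> bytes:
    """Convert a UTF-8 XML document to EBCDIC and relabel its encoding."""
    # "utf-8-sig" decodes UTF-8 and drops a leading byte order mark if present.
    text = data.decode("utf-8-sig")

    # Drop any stray characters that precede the first '<' of the document.
    start = text.find("<")
    if start > 0:
        text = text[start:]

    # Rewrite the encoding value in the XML declaration, keeping the
    # original quote character. Only the first match is replaced.
    text = re.sub(
        r"(<\?xml[^>]*?encoding=)([\"'])[^\"']*\2",
        r"\g<1>\g<2>ibm-37\g<2>",
        text,
        count=1,
    )

    # Assumption: Python's cp037 codec stands in for CCSID 37.
    # Raises UnicodeEncodeError if a character has no cp037 mapping.
    return text.encode("cp037")


if __name__ == "__main__":
    src = '<?xml version="1.0" encoding="UTF-8"?><a>x</a>'.encode("utf-8")
    print(utf8_xml_to_ccsid37(src).decode("cp037"))
```

A file produced this way would need to be transferred in binary mode so that FTP does not convert it a second time.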
ILE COBOL Programmer's Guide
Understanding XML document encoding > Specifying the code page
Understanding XML document encoding > Parsing documents in other code pages
Historical Number
405919982
Document Information
More support for:
IBM i
Component:
Programming ILE Languages, Programming ILE Languages->COBOL
Software version:
All Versions
Operating system(s):
IBM i
Document number:
637751
Modified date:
04 April 2025
UID
nas8N1015124