Parsing XML documents

To parse XML documents, use the XML PARSE statement. Specify the XML document to be parsed and the processing procedure that handles the XML events that occur during parsing, as shown in the following code fragment.


XML PARSE xml-document
    PROCESSING PROCEDURE xml-event-handler
  ON EXCEPTION
     DISPLAY 'XML document error ' XML-CODE
     STOP RUN
  NOT ON EXCEPTION
     DISPLAY 'XML document was successfully parsed.'
END-XML

In the XML PARSE statement, you first identify the parse data item (xml-document in the example above) that contains the XML document character stream. In the DATA DIVISION, define the parse data item according to the encoding of the document. If the document is encoded in Unicode UTF-16, define it as an elementary data item of category national or as a national group item; otherwise, define it as an elementary alphanumeric data item or an alphanumeric group item:
  • If the parse data item is national, the XML document must be encoded in UTF-16, CCSID 1200.
  • If the parse data item is alphanumeric, its content must be encoded in one of the supported code pages described in the related reference about the encoding of XML documents.
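
The two kinds of parse data item described above might be declared as in the following sketch. The name xml-document-n and both lengths are illustrative assumptions, not values taken from this topic.

```cobol
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Alphanumeric parse data item: its content must be in one
      * of the supported code pages. The length is illustrative.
       01  xml-document      PIC X(4096).
      * National parse data item: its content must be UTF-16,
      * CCSID 1200. The name and length are illustrative.
       01  xml-document-n    PIC N(2048) USAGE NATIONAL.
```

Only one of the two items would normally be named in a given XML PARSE statement; which one depends on how the document is encoded when your program receives it.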

Next, specify the name of the processing procedure (xml-event-handler in the example above) that is to handle the XML events that occur during parsing of the document.
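
A processing procedure is typically a paragraph or section that tests the XML-EVENT special register and acts on the event text. The following is a minimal sketch, assuming an alphanumeric parse data item so that the text for each event is available in the XML-TEXT special register; the events handled and the DISPLAY messages are chosen only for illustration.

```cobol
       xml-event-handler.
      * The parser sets XML-EVENT before each call; XML-TEXT
      * holds the associated text, such as an element name.
           EVALUATE XML-EVENT
             WHEN 'START-OF-ELEMENT'
               DISPLAY 'Start element: ' XML-TEXT
             WHEN 'CONTENT-CHARACTERS'
               DISPLAY 'Content: ' XML-TEXT
             WHEN 'END-OF-ELEMENT'
               DISPLAY 'End element: ' XML-TEXT
             WHEN OTHER
      * Ignore the events that this sketch does not handle.
               CONTINUE
           END-EVALUATE.
```

If the parse data item is national, the event text would be read from XML-NTEXT instead of XML-TEXT.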

If the XMLPARSE(XMLSS) compiler option is in effect, you can also use any of these optional phrases of the XML PARSE statement:
  • ENCODING, to specify the CCSID of the document
  • RETURNING NATIONAL, to cause the parser to automatically convert UTF-8 or single-byte characters to national characters for return to the processing procedure
  • VALIDATING, to cause the parser to validate the document against an XML schema
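
The optional phrases can be combined in a single statement, as in the following sketch. The CCSID value 1208 (UTF-8) and the schema name order-schema are assumptions made for illustration; order-schema is a hypothetical name that is assumed to be associated with a preprocessed schema file elsewhere in the program (by an XML-SCHEMA clause in the SPECIAL-NAMES paragraph).

```cobol
      * Requires the XMLPARSE(XMLSS) compiler option.
      * 1208 and order-schema are illustrative assumptions.
           XML PARSE xml-document
               WITH ENCODING 1208
               RETURNING NATIONAL
               VALIDATING WITH FILE order-schema
               PROCESSING PROCEDURE xml-event-handler
             ON EXCEPTION
               DISPLAY 'XML document error ' XML-CODE
             NOT ON EXCEPTION
               DISPLAY 'XML document was successfully parsed.'
           END-XML
```

Because RETURNING NATIONAL is specified, the processing procedure in this sketch would receive event text as national characters rather than in the encoding of the document.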

In addition, you can specify either or both of the following optional phrases (as shown in the fragment above) to indicate the action to be taken after parsing finishes:

  • ON EXCEPTION, to receive control if an unhandled exception occurs during parsing
  • NOT ON EXCEPTION, to receive control otherwise

You can end the XML PARSE statement with the explicit scope terminator END-XML. Use END-XML to nest an XML PARSE statement that uses the ON EXCEPTION or NOT ON EXCEPTION phrase in a conditional statement.
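
The following sketch shows why the scope terminator matters when the statement is nested. The condition-name document-ready is a hypothetical level-88 item introduced only for this illustration.

```cobol
      * document-ready is a hypothetical condition-name.
           IF document-ready
               XML PARSE xml-document
                   PROCESSING PROCEDURE xml-event-handler
                 ON EXCEPTION
                   DISPLAY 'XML document error ' XML-CODE
                 NOT ON EXCEPTION
                   DISPLAY 'XML document was successfully parsed.'
               END-XML
      * END-XML closes the XML PARSE statement, so the next
      * DISPLAY runs whether or not an exception occurred.
               DISPLAY 'Parse attempt finished.'
           END-IF
```

Without END-XML, the final DISPLAY statement would be treated as part of the NOT ON EXCEPTION phrase and would run only after a successful parse.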

The parser passes control to the processing procedure for each XML event. Control returns to the parser at the end of the processing procedure. This exchange of control between the XML parser and the processing procedure continues until one of the following events occurs:

  • The entire XML document was parsed, as indicated by the END-OF-DOCUMENT event.
  • If XMLPARSE(XMLSS) is in effect, either:
    • The parser detects an error in the document and signals an EXCEPTION event (regardless of the kind of exception).
    • The parser signals an END-OF-INPUT event, and the processing procedure returns to the parser with special register XML-CODE still set to zero, which indicates that no further XML data will be provided to the parser.
  • If XMLPARSE(COMPAT) is in effect, either:
    • The parser signals an encoding conflict EXCEPTION event, and the processing procedure does not reset special register XML-CODE to zero or to the correct CCSID before returning to the parser.
    • The parser detects an error in the document and signals an EXCEPTION event (other than an encoding conflict), and the processing procedure does not reset special register XML-CODE to zero before returning to the parser.
  • Your code in the processing procedure terminates parsing deliberately by setting the XML-CODE special register to -1 before returning to the parser.
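
Deliberate termination, the last case in the list above, can be sketched as follows. The element name stop-here is a hypothetical marker chosen for illustration, and the sketch assumes an alphanumeric parse data item so that the element name is available in XML-TEXT.

```cobol
       xml-event-handler.
      * Stop parsing when a particular element is reached.
      * 'stop-here' is a hypothetical element name.
           IF XML-EVENT = 'START-OF-ELEMENT'
              AND XML-TEXT = 'stop-here'
      * Setting XML-CODE to -1 before returning to the parser
      * ends the exchange of control described above.
               MOVE -1 TO XML-CODE
           END-IF.
```

In every other case the paragraph leaves XML-CODE unchanged, so the parser continues with the next event.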

related concepts
XML events
XML-CODE
XML schemas
XML-INFORMATION

related references
XMLPARSE (compiler option)
The encoding of XML documents
XML PARSE exceptions with XMLPARSE(XMLSS) in effect
XML PARSE exceptions with XMLPARSE(COMPAT) in effect
XML PARSE statement (Enterprise COBOL for z/OS Language Reference)