Parsing XML documents with validation

Validating an XML document determines whether the structure and content of the document conform to a set of rules. In Enterprise COBOL, the rules are expressed in an XML schema, which is essentially a blueprint for a class of documents.

To validate XML documents while parsing, use the VALIDATING phrase of the XML PARSE statement. To do so, you must compile your program using the XMLPARSE(XMLSS) compiler option.

You can validate XML documents only against an XML schema.

In Enterprise COBOL, a schema used for XML validation must be in a preprocessed format known as Optimized Schema Representation, or OSR. To generate a schema in OSR format from a text-form schema, use the z/OS® UNIX command xsdosrg, which invokes the OSR generator provided by z/OS XML System Services. (Alternatively, you can call the OSR generator programmatically. For details, see the related reference about z/OS XML System Services.)

For example, to convert the text-form schema in file item.xsd to a schema in preprocessed format in file item.osr, you can use the following z/OS UNIX command:


xsdosrg -v -o /u/HLQ/xml/item.osr /u/HLQ/xml/item.xsd

Use one of two forms of the VALIDATING phrase, depending on the location of the preprocessed schema:
  • In one form, you use the FILE keyword and specify an XML schema name. In this case, the schema must be in an MVS™ data set or a z/OS UNIX file.
  • In the other form, you specify the identifier of a data item that contains the schema.

If you use the FILE keyword and specify an XML schema name, the COBOL runtime library automatically retrieves the schema during execution of the XML PARSE statement. The following code fragment shows this method of specifying validation:


XML PARSE document-item
    VALIDATING WITH FILE schema-name
    PROCESSING PROCEDURE xml-event-handler
  ON EXCEPTION
    DISPLAY 'Document has an error.'
    GOBACK
  NOT ON EXCEPTION
    DISPLAY 'Document is valid.'
END-XML

To associate an XML schema name with the external file that contains the schema, code the XML-SCHEMA clause in the SPECIAL-NAMES paragraph, specifying either a literal or a user-defined word to identify the file.

For example, you can associate the XML schema name schema-name shown in the fragment above with the ddname DDSCHEMA by coding the ddname as a literal in the XML-SCHEMA clause as follows:


ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SPECIAL-NAMES.
    XML-SCHEMA schema-name IS 'DDSCHEMA'.
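
The paragraph above mentions that the file can also be identified by a user-defined word rather than a literal. As a minimal sketch of that form, assuming the same ddname DDSCHEMA and the same XML schema name schema-name as in the fragment above, the clause might be coded as follows (see the related reference about the XML-SCHEMA clause for the exact rules that the word must satisfy):

```cobol
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
      * DDSCHEMA is coded here as a user-defined word, not a literal.
           XML-SCHEMA schema-name IS DDSCHEMA.
```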

To run the program, you can associate ddname DDSCHEMA with the z/OS UNIX file item.osr by coding a DD statement as follows:


//GO.DDSCHEMA DD PATH='/u/HLQ/xml/item.osr'

Or you can use an analogous TSO ALLOCATE command.

Alternatively, DDSCHEMA in the example above could be the name of an environment variable that identifies the external file by means of a DSN option that specifies an MVS data set or a PATH option that specifies a z/OS UNIX file.

If your schema is in an MVS data set, the data set can be any sequential data set (for example, QSAM fixed blocked or variable blocked, or VSAM ESDS).

For further details about how to associate an XML schema name with the external file that contains the schema, see the related reference about the XML-SCHEMA clause.

Restriction: XML validation using the FILE keyword is not supported under CICS®.

Automatic retrieval of the schema with the FILE keyword is convenient. However, if you have several XML documents of the same type to validate, reading the schema into memory once and reusing it for each document performs better than retrieving it automatically for every parse. In this case, use the other form of the VALIDATING phrase, in which you specify an identifier that references an alphanumeric data item that contains the XML schema. For example:


XML PARSE document-item
    VALIDATING WITH xmlschema
    PROCESSING PROCEDURE xml-event-handler
  ON EXCEPTION
    DISPLAY 'Document has an error.'
    GOBACK
  NOT ON EXCEPTION
    DISPLAY 'Document is valid.'
END-XML

Before the XML PARSE statement is executed, read the preprocessed schema into the data item, for example by using ordinary COBOL input statements.

For more information about this form of the VALIDATING phrase, see the related reference about the XML PARSE statement.

During parsing with validation, normal XML events are returned until an exception occurs because of a validation error or a well-formedness error. If an XML document is not valid, the parser signals an XML exception and passes control to the processing procedure. At that point, special register XML-EVENT contains 'EXCEPTION', and special register XML-CODE contains return code 24 in the high-order halfword and a specific validation reason code in the low-order halfword.
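
The following program is a minimal sketch, not a definitive implementation, of how a processing procedure might split XML-CODE into its return code and reason code when an exception is signaled. It reuses the FILE form of the VALIDATING phrase and the ddname DDSCHEMA from the earlier fragments. The program name VALDEMO, the sample document content, and the work fields ws-return-code and ws-reason-code are assumptions made for illustration only. The sketch also assumes that XML-CODE is nonnegative, so that dividing by 65536 yields the high-order halfword as the quotient and the low-order halfword as the remainder.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. VALDEMO.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
           XML-SCHEMA schema-name IS 'DDSCHEMA'.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sample document; substitute real input.
       01  document-item    PIC X(23)
               VALUE '<item><id>1</id></item>'.
      * Hypothetical work fields for the two halfwords of XML-CODE.
       01  ws-return-code   PIC 9(9) BINARY.
       01  ws-reason-code   PIC 9(9) BINARY.
       PROCEDURE DIVISION.
       mainline.
           XML PARSE document-item
               VALIDATING WITH FILE schema-name
               PROCESSING PROCEDURE xml-event-handler
             ON EXCEPTION
               DISPLAY 'Document has an error.'
             NOT ON EXCEPTION
               DISPLAY 'Document is valid.'
           END-XML
           GOBACK.
       xml-event-handler.
      * Ordinary events are ignored in this sketch.
           IF XML-EVENT = 'EXCEPTION'
      *        Return code: high-order halfword of XML-CODE.
      *        Reason code: low-order halfword of XML-CODE.
               DIVIDE XML-CODE BY 65536
                   GIVING ws-return-code
                   REMAINDER ws-reason-code
               IF ws-return-code = 24
                   DISPLAY 'Validation error, reason code '
                       ws-reason-code
               ELSE
                   DISPLAY 'Parse exception, return code '
                       ws-return-code ', reason code '
                       ws-reason-code
               END-IF
           END-IF.
```

The DISPLAY statements show the codes in decimal. Reference material might list reason codes in hexadecimal, in which case the displayed value needs converting before you look it up. The processing procedure only reads XML-CODE and does not change it, so the parse ends and control passes to the ON EXCEPTION phrase in the usual way.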

For information about the return code and reason code for exceptions that might occur when parsing XML documents with validation, see the related reference about exceptions with XMLPARSE(XMLSS) in effect.

Example: parsing XML documents with validation

related concepts  
XML-CODE
XML schemas  

related tasks    
Handling XML PARSE exceptions  

related references
XMLPARSE (compiler option)
XML PARSE exceptions with XMLPARSE(XMLSS) in effect
XML PARSE statement (Enterprise COBOL for z/OS Language Reference)
XML-SCHEMA clause (Enterprise COBOL for z/OS Language Reference)
z/OS XML System Services User's Guide and Reference