Generating XML output

To transform COBOL data to XML, use the XML GENERATE statement as in the example below.

XML GENERATE XML-OUTPUT FROM SOURCE-REC
       COUNT IN XML-CHAR-COUNT
   ON EXCEPTION
       DISPLAY 'XML generation error ' XML-CODE
       STOP RUN
   NOT ON EXCEPTION
       DISPLAY 'XML document was successfully generated.'
END-XML

In the XML GENERATE statement, you first identify the data item (XML-OUTPUT in the example above) that is to receive the XML output. Define the data item to be large enough to contain the generated XML output, typically five to ten times the size of the COBOL source data, depending on the length of its data-name or data-names.

In the DATA DIVISION, you can define the receiving identifier as alphanumeric (either an alphanumeric group item or an elementary item of category alphanumeric) or as national (either a national group item or an elementary item of category national).

Next you identify the source data item that is to be transformed to XML format (SOURCE-REC in the example). The source data item can be an alphanumeric group item, national group item, or elementary data item of class alphanumeric or national.

Some COBOL data items are not transformed to XML, but are ignored. Subordinate data items of an alphanumeric group item or national group item that you transform to XML are ignored if they:

  • Specify the REDEFINES clause, or are subordinate to such a redefining item
  • Specify the RENAMES clause

These items in the source data item are also ignored when you generate XML:

  • Elementary FILLER (or unnamed) data items
  • Slack bytes inserted for SYNCHRONIZED data items

No extra white space (for example, new lines or indentation) is inserted to make the generated XML more readable.

Optionally, you can code the COUNT IN phrase to obtain the number of XML character encoding units that are filled during generation of the XML output. If the receiving identifier has category national, the count is in UTF-16 character encoding units. For all other encodings (including UTF-8), the count is in bytes.

You can use the count field as a reference modification length to obtain only that portion of the receiving data item that contains the generated XML output. For example, XML-OUTPUT(1:XML-CHAR-COUNT) references the first XML-CHAR-COUNT character positions of XML-OUTPUT.
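The following is a minimal sketch of that technique as a complete program, not a definitive implementation. The program name, the CUST-NAME and CUST-CITY fields, and the receiver size of 400 bytes are assumptions made for illustration. Only the XML GENERATE statement with COUNT IN and the reference modification come from the discussion above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLCOUNT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical source record; names and sizes are illustrative.
       01  SOURCE-REC.
           05  CUST-NAME      PIC X(20) VALUE 'Ada'.
           05  CUST-CITY      PIC X(20) VALUE 'London'.
      * Receiver sized at ten times the 40-byte source record.
       01  XML-OUTPUT         PIC X(400).
       01  XML-CHAR-COUNT     PIC 9(9) BINARY.
       PROCEDURE DIVISION.
           XML GENERATE XML-OUTPUT FROM SOURCE-REC
               COUNT IN XML-CHAR-COUNT
             ON EXCEPTION
               DISPLAY 'XML generation error ' XML-CODE
             NOT ON EXCEPTION
      * Reference only the filled portion of the receiver, so the
      * unused positions that follow the document are not shown.
               DISPLAY XML-OUTPUT(1:XML-CHAR-COUNT)
           END-XML
           GOBACK.
```

Without the reference modification, DISPLAY XML-OUTPUT would show all 400 character positions of the receiver, including those beyond the end of the generated document.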

Consider the following program excerpt:

01  doc pic x(512).
01  docSize pic 9(9) binary.
01  G. 
    05  A pic x(3) value "aaa". 
    05  B. 
        10  C pic x(3) value "ccc". 
        10  D pic x(3) value "ddd". 
    05  E pic x(3) value "eee".
    . . .
    XML Generate Doc from G

The code above generates the following XML document, in which A, B, and E are expressed as child elements of element G, and C and D become child elements of element B:

<G><A>aaa</A><B><C>ccc</C><D>ddd</D></B><E>eee</E></G>

Alternatively, you can specify the ATTRIBUTES phrase of the XML GENERATE statement. The ATTRIBUTES phrase causes every eligible data item included in the generated XML document to be expressed as an attribute of the containing XML element, rather than as a child element of the containing XML element. To be eligible, the data item must be elementary, must have a name other than FILLER, and must not have an OCCURS clause in its data description entry. The containing XML element corresponds to the group data item that is immediately superordinate to the elementary data item. Optionally, you can specify more precise control of which data items should be expressed as attributes or elements by using the TYPE OF phrase.

For example, suppose that the XML GENERATE statement in the program excerpt above had instead been coded as follows:

XML Generate Doc from G with attributes 

The code would then generate the following XML document, in which A and E are expressed as attributes of element G, and C and D become attributes of element B:

<G A="aaa" E="eee"><B C="ccc" D="ddd"></B></G>

Optionally, you can code the ENCODING phrase of the XML GENERATE statement to specify the CCSID of the generated XML document. If you do not use the ENCODING phrase, the document encoding is determined by the category of the receiving data item and by the CODEPAGE compiler option. For further details, see the related task below about controlling the encoding of generated XML output.

Optionally, you can code the XML-DECLARATION phrase to cause the generated XML document to have an XML declaration that includes version information and an encoding declaration. If the receiving data item is of category:

  • National: The encoding declaration has the value UTF-16 (encoding="UTF-16").
  • Alphanumeric: The encoding declaration is derived from the ENCODING phrase, if specified, or from the CODEPAGE compiler option in effect for the program if the ENCODING phrase is not specified.

For example, the program excerpt below specifies the XML-DECLARATION phrase of XML GENERATE, and specifies encoding with CCSID 1208 (UTF-8):

01  Greeting. 
    05 msg  pic x(80)  value 'Hello, world!'. 
    . . .
    XML Generate Doc from Greeting 
        with Encoding 1208 
        with XML-declaration 
    End-XML

The code above generates the following XML document:

<?xml version="1.0" encoding="UTF-8"?><Greeting><msg>Hello, world!</msg></Greeting> 

If you do not code the XML-DECLARATION phrase, an XML declaration is not generated.

Optionally, you can code the NAMESPACE phrase to specify a namespace for the generated XML document. The namespace value must be a valid Uniform Resource Identifier (URI), for example, a URL (Uniform Resource Locator); for further details, see the related concept about URI syntax below.

Specify the namespace in an identifier or literal of either category national or alphanumeric.

If you specify a namespace but do not specify a namespace prefix (described below), the namespace becomes the default namespace for the document. That is, the namespace declared on the root element applies by default to each element name in the document, including the root element.

For example, consider the following data definitions and XML GENERATE statement:

01  Greeting. 
    05  msg  pic x(80)  value 'Hello, world!'. 
01  NS  pic x(20)   value 'http://example'.
    . . .
    XML Generate Doc from Greeting
        namespace is NS

The resulting XML document has a default namespace (http://example), as follows:

<Greeting xmlns="http://example"><msg>Hello, world!</msg></Greeting> 

If you do not specify a namespace, the element names in the generated XML document are not in any namespace.

Optionally, you can code the NAMESPACE-PREFIX phrase to specify a prefix to be applied to the start and end tag of each element in the generated document. You can specify a prefix only if you have specified a namespace as described above.

When the XML GENERATE statement is executed, the prefix value must be a valid XML name, but without the colon (:); see the related reference below about namespaces for details. The value can have trailing spaces, which are removed before the prefix is used.

Specify the namespace prefix in an identifier or literal of either category national or alphanumeric.

Keep the prefix short, because it qualifies the start and end tag of each element.

For example, consider the following data definitions and XML GENERATE statement:

01  Greeting. 
    05  msg  pic x(80)  value 'Hello, world!'. 
01  NS  pic x(20)   value 'http://example'. 
01  NP  pic x(5)    value 'pre'. 
    . . .
    XML Generate Doc from Greeting
        namespace is NS
        namespace-prefix is NP

The resulting XML document has an explicit namespace (http://example), and the prefix pre is applied to the start and end tag of the elements Greeting and msg, as follows:

<pre:Greeting xmlns:pre="http://example"><pre:msg>Hello, world!</pre:msg></pre:Greeting> 

Optionally, you can code the NAME phrase to specify attribute and element names in the generated XML document. The attribute and element names must be alphanumeric or national literals and must be legal names according to the XML 1.0 standard.

For example, consider the following data structure and XML GENERATE statement:

01 Msg.
    02 Msg-Severity pic 9 value 1.
    02 Msg-Date pic 9999/99/99 value "2012/04/12".
    02 Msg-Text pic X(50) value "Sell everything!".
01 Doc pic X(500).
    . . .
    XML Generate Doc from Msg
        With attributes
        Name of Msg          is  "Message" 
                Msg-Severity is  "Severity"
                Msg-Date     is  "Date"
                Msg-Text     is  "Text"
    End-XML

The resulting XML document is as follows:

<Message Severity="1" Date="2012/04/12" Text="Sell everything!"></Message> 

Optionally, you can code the SUPPRESS phrase to omit individual data items from the generated XML document, either unconditionally or when they meet certain criteria.

For example, consider the following data structure and XML GENERATE statement to suppress spaces and zeros:

01 G.
    02 SensitiveInfo.
        03 SSN pic x(11) value '123-45-6789'.
        03 HomeAddress pic x(50) value '123 Main St, Anytown, USA'.   
    02 Aarray value spaces.
        03 A pic AAA occurs 5.
    02 Barray value spaces.
        03 B pic XXX occurs 5.
    02 Carray value zeros.
        03 C pic 999 occurs 5.
    . . .
    Move 'abc' to A(1)
    Move 123 to C(3)
    XML Generate Doc from G
        Suppress SensitiveInfo
                 every nonnumeric element when space
                 every numeric element when zero
    End-XML

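The resulting XML document is as follows (shown here with line breaks and indentation for readability; the generated document contains no such white space):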

<G>
   <Aarray><A>abc</A></Aarray>
   <Carray><C>123</C></Carray>
</G>

Optionally, you can use the TYPE OF phrase to specify whether individual data items are expressed as attributes, elements or content.

For example, consider the following data structure and XML GENERATE statement:

01 Msg.
   02 Msg-Severity pic 9 value 1.
   02 Msg-Date pic 9999/99/99 value "2012/04/12".
   02 Msg-Text pic X(50) value "Sell everything!".
01 Doc pic X(500).
    . . .
    XML Generate Doc from Msg
        With attributes
        Type of Msg-Severity is attribute
                Msg-Date     is attribute
                Msg-Text     is element
    End-XML

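The resulting XML document is as follows (shown here with a line break and indentation for readability; the generated document contains no such white space):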

<Msg Msg-Severity="1" Msg-Date="2012/04/12"> 
       <Msg-Text>Sell everything!</Msg-Text></Msg>

In addition, you can specify either or both of the following phrases to receive control after generation of the XML document:

  • ON EXCEPTION, to receive control if an error occurs during XML generation
  • NOT ON EXCEPTION, to receive control if no error occurs

You can end the XML GENERATE statement with the explicit scope terminator END-XML. Code END-XML to nest an XML GENERATE statement that has the ON EXCEPTION or NOT ON EXCEPTION phrase in a conditional statement.
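As a rough illustration of that nesting, the sketch below places an XML GENERATE statement with both exception phrases inside an IF statement. The program name, the WANT-XML switch, and the data item sizes are hypothetical and chosen only for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLNEST.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical switch that controls whether XML is generated.
       01  WANT-XML   PIC X VALUE 'Y'.
       01  G.
           05  A      PIC X(3) VALUE 'aaa'.
       01  DOC        PIC X(100).
       01  DOC-SIZE   PIC 9(9) BINARY.
       PROCEDURE DIVISION.
           IF WANT-XML = 'Y'
               XML GENERATE DOC FROM G COUNT IN DOC-SIZE
                 ON EXCEPTION
                   DISPLAY 'XML generation error ' XML-CODE
                 NOT ON EXCEPTION
                   DISPLAY DOC(1:DOC-SIZE)
               END-XML
      * END-XML closes the NOT ON EXCEPTION phrase, so the next
      * statement runs whether or not generation succeeded.
               DISPLAY 'Continuing inside the IF'
           END-IF
           GOBACK.
```

If END-XML were omitted here, the final DISPLAY statement would become part of the NOT ON EXCEPTION phrase and would run only when generation succeeded.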

XML generation continues until either the COBOL source record has been transformed to XML or an error occurs. If an error occurs, the results are as follows:

  • The XML-CODE special register contains a nonzero exception code.
  • Control is passed to the ON EXCEPTION phrase, if specified, otherwise to the end of the XML GENERATE statement.

If no error occurs during XML generation, the XML-CODE special register contains zero, and control is passed to the NOT ON EXCEPTION phrase if specified or to the end of the XML GENERATE statement otherwise.
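When neither exception phrase is coded, one possible approach is to test the XML-CODE special register after the statement ends. The sketch below assumes a hypothetical program name and illustrative data definitions; it is one way to apply the rules above, not the only one.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLCODE1.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Illustrative data items; names and sizes are assumptions.
       01  G.
           05  A      PIC X(3) VALUE 'aaa'.
           05  E      PIC X(3) VALUE 'eee'.
       01  DOC        PIC X(100).
       01  DOC-SIZE   PIC 9(9) BINARY.
       PROCEDURE DIVISION.
      * No ON EXCEPTION or NOT ON EXCEPTION phrase is coded, so
      * control always passes to the statement after END-XML.
           XML GENERATE DOC FROM G COUNT IN DOC-SIZE
           END-XML
           IF XML-CODE = 0
               DISPLAY DOC(1:DOC-SIZE)
           ELSE
               DISPLAY 'XML generation failed, XML-CODE = ' XML-CODE
           END-IF
           GOBACK.
```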

Example: generating XML

related references    
XML GENERATE statement (Enterprise COBOL for z/OS® Language Reference)  
Extensible Markup Language (XML)
Namespaces in XML 1.0