Build a pureXML application in DB2 for z/OS, Part 1: Use COBOL stored procedures to store and manipulate data

In this article, set up and populate a small XML database, and implement COBOL stored procedures for basic operations on the data. Use the included download bundle with ready-to-install samples to explore the concepts of this article, and apply the concepts to other industry or custom XML formats. The XML documents that conform to the UBL (Universal Business Language) format from OASIS are stored and manipulated in the samples.

Kirsten A. Larsen (kila@dk.ibm.com), Senior IT specialist, IBM

Kirsten A. LarsenKirsten Ann Larsen is a senior IT specialist and technical lead with IT Delivery in IBM Nordics. She has more than 10 years of experience with DB2 for z/OS and has co-authored a Redbook on DB2 security. She has worked with XML since pureXML support was included with the release of DB2 9 in 2007.



Jason Cu (jasoncu@us.ibm.com), Senior Software Engineer, IBM

Jason CuJason Cu is a senior software engineer in the DB2 for z/OS organization at the IBM Silicon Valley Laboratory. He has worked with DB2 since 1998 after joining IBM with a bachelor's degree in computer science from UC Berkeley. During that time, Jason has worked on SQL development projects until joining the XML team in 2004, and he is now the XPath development team leader.



Susan Malaika, Senior Technical Staff Member, IBM

Susan Malaika photoSusan Malaika works in IBM's Information Management Group. She specializes in XML and Web technologies, including Grid computing. She has published articles and co-edited a book on the Web. She is a member of the IBM Academy of Technology.



Michael Schenker (mschenk@us.ibm.com), Software Engineer, IBM

Michael SchenkerMichael Schenker is a software engineer at IBM's Silicon Valley Laboratory in San Jose, Calif. He joined IBM in 2002 and works in the IBM Data Server Tooling area. His subject of expertise is the Web service enablement of IBM's data servers. He holds a Master's degree in computer sciences from the University of Applied Sciences in Leipzig, Germany.



14 May 2009

Also available in Chinese Spanish

Introduction

This is the first installment of a series that discusses how to get started with XML in IBM® DB2® for z/OS®. The series provides different approaches for creating simple applications that manipulate XML data. The objective is to provide some familiarity with the technology and demonstrate the ease with which XML applications can be developed. The downloadable material may also serve as a starting point for prototyping.

With the popularity of SOA and Web-based technologies, there are increasing amounts of XML data coming in and out of computer systems. The DB2 9 for z/OS support for pureXML® makes it much simpler to store and manipulate XML on System z® through SQL/XML interfaces. This article describes how to create COBOL stored procedures in two categories:

  • Basic operations: insert, update, and delete XML with DB2 for z/OS
  • Complex operations: select XML specifying an XPath, shred XML into relational tables, and compose XML from relational data

Samples for all the stored procedures are provided in an accompanying bundle (see Download). The article also discusses some more advanced aspects, for example:

  • Data conversions when transmitting XML data to and from stored procedures in DB2 for z/OS.
  • The use of XPath to filter data in XMLEXISTS

System requirements

In order to install and run the COBOL samples, you need to have DB2 9 for z/OS and IBM Enterprise COBOL for z/OS 4.1.0.

Download bundle

The download bundle, included in the Download section, contains the following:

  • COBOL source code
  • A list of recommended compiler options
  • DDL for eight sample stored procedures
  • DDL for tables used by the stored procedures
  • XML schema documents
  • A script for registering the schema in DB2
  • Sample XML documents
  • A Java technology program that demonstrates how to call the stored procedures
  • A README.txt file with instructions on how to deploy the COBOL stored procedures

Figure 1 illustrates the download bundle content:

Figure 1. Download bundle content
Diagram of download bundle content as it would appear in Windows Explorer

Industry bundles

The stored procedures supplied with this article are based on the UBL (Universal Business Language) 2.0 format. (For more information about UBL, see the link to OASIS Universal Business Language Technical Committee Web page in the Resources section.) Industry bundles for other industry formats with DB2 pureXML containing table definitions, schemas, sample documents, and queries are available for download in the article "Build a DB2 pureXML application in a day" (developerWorks, December 2008). You can read about how the samples can be used as the basis for building prototype applications in the "XML Schema Tutorial" (see Resources).


A few basic XML concepts

This article assumes you have some familiarity with XML; but in order to better demonstrate the purpose of each stored procedure, let's establish a few concepts.

Recall that a well-formed XML document is the textual representation of a tree. The root of the tree is the document itself, and each node corresponds to an element or character data. An element can consist of a sequence of elements, in which case the corresponding tree node has child nodes, or character data, in which case the node is a leaf node. An element can have attributes that also get mapped to child nodes; namespaces are a special type of attribute used to qualify names to avoid name clashes between elements of different XML formats.

Listing 1 illustrates an XML document with (simplified) format UBL 2.0:

Listing 1. XML document as shown in a browser
<Invoice xmlns:cac=
  "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
  xmlns:cbc=
  "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
  xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2">
  <cac:AccountingCustomerParty>
    <cbc:CustomerAssignedAccountID>XFB01</cbc:CustomerAssignedAccountID> 
    <cac:Party>
      <cac:PartyName>
        <cbc:Name>IYT Corporation</cbc:Name> 
      </cac:PartyName>
      <cac:PostalAddress>
        <cbc:StreetName>Avon Way</cbc:StreetName> 
        <cbc:PostalZone>ZZ99 1ZZ</cbc:PostalZone> 
        <cac:AddressLine>
          <cbc:Line>3rd Floor, Room 5</cbc:Line> 
        </cac:AddressLine>
        <cac:Country>
          <cbc:IdentificationCode>GB</cbc:IdentificationCode> 
        </cac:Country>
      </cac:PostalAddress>
    <cac:Party>
  </cac:AccountingCustomerParty>
  </Invoice>

Figure 2 shows the same document presented as a tree:

Figure 2. XML document as a tree
Graphical depiction of XML document as a tree (Listing 1) using circles and lines

In Figure 2, the element nodes are depicted in blue and include:

  • Invoice
  • cac:AccountingCustomerParty
  • cbc:CustomerAssignedAccountID
  • cac:Party
  • cac:PartyName
  • cbc:Name
  • cac:PostalAddress
  • cbc:StreetName
  • cbc:PostalZone
  • cac:AddressLine
  • cbc:Line
  • cac:Country
  • cbc:IdentificationCode

The attribute nodes are depicted in red and include:

  • xmlns:cac
  • xmlns:cbc
  • xmlns:Invoice

The character data nodes are depicted in black and include:

  • XFB01
  • IYT Corporation
  • Avon Way
  • ZZ99 1ZZ
  • 3rd Floor, Room 5
  • GB

Note that all element names are prefixed with one of the namespaces declared as attributes.


Sample database

The article samples (see Download) use a subset of the XML format UBL 2.0 (Universal Business Language). UBL 2.0 is a standard for electronic commerce developed by OASIS and is quite comprehensive as real-life data usually is. This article uses the subset related to invoices, which requires 11 XML schema documents. For more information on OASIS and UBL 2.0, refer to the OASIS Universal Business Language (UBL) Technical Committee Web page (see Resources).

The database consists of two tables. The main table is UBLADMIN.UBL, which holds the XML data; the other is a small, specialized table used to hold data shredded from the documents. The second table is primarily for demonstration purposes, whereas the first table, simple as it is, might be sufficient for real-life applications.

Listing 2. Columns of table UBLADMIN.UBL
ID          INTEGER GENERATED ALWAYS AS IDENTITY
COMMENT     VARCHAR(1000)
DOCUMENT    XML

The DDL for the tables is supplied in Tables.txt (included in the available download).


Sample stored procedures

The stored procedures are written in COBOL. They are kept as simple as possible to better illustrate what is actually needed for working with XML in DB2 for z/OS.

The parameter type used for XML documents in all the stored procedures is CLOB. At present, stored procedures and user-defined functions do not have support for XML as a parameter, but they can easily be passed as CLOBs.

Basic operations

This article supplies three basic operations for XML documents: insert, update, and delete. These are general operations and can be used for most applications, even for XML-based applications on a schema format other than UBL 2.0, because there is no application- or format-specific code in the programs themselves. When an XML schema is used, it is passed as a parameter.

For the operations insert and update, this article provides two flavours of the stored procedures: with and without schema validation.

Schema validation is the process of checking that an XML document adheres to the rules and format defined by an XML schema. These rules describe, for example, which elements are valid, the order of elements within a document, the data types of attributes, whether an element is required or optional, and so on. For more information on XML schema, refer to the "XML Schema tutorial" (see Resources).

When inserting or updating without schema validation, the document must be well-formed.

Validation of the documents requires that the schema be registered to DB2. This article supplies the schema for UBL 2.0 Invoice, which consists of 11 individual schema documents. This article also supplies the script Invoice.clp, which registers the schema SYSXSR.UBL_INVOICE_2_0 as consisting of these documents through the DB2 command line processor.

The XML document samples provided in the Samples folder (see Download) are all valid according to this schema.

The stored procedures without schema validation are INSXML, UPDXML, and DELXML, for insert, update, and delete respectively. The stored procedures using schema validation are INSVLXML and UPDVLXML, for insert and update respectively.

Complex operations

Additionally, this article supplies three stored procedures for more complex operations on XML data. These operations are not generally applicable as they are application- and format-specific, and have to be tailored in each case. They could serve as examples that may be transferred to other areas and uses.

Each of the tailored stored procedures uses a different DB2 XML construct. This article briefly goes through these constructs, but refer to the "DB2 Version 9.1. for z/OS, XML Guide" (IBM Corporation, December 2008) for a detailed description (see Resources).

SELXML

The stored procedure SELXML queries data inside an XML document using XPath, which is a W3C-recommended standard for accessing data in XML documents and has been incorporated into DB2.

XPath provides primitives for moving around the XML tree, for selecting nodes according to name or by the use of a Boolean expression, and for applying functions to a sequence of nodes. It takes an XML document as input and returns a sequence of nodes or the result of applying a function to a sequence of nodes. For more details on XPath 2.0, see "XML Path Language (XPath) 2.0" (W3C, January 2007).

In DB2 an XPath expression is wrapped in a function (XMLQUERY or XMLEXISTS), which is applied to an XML column and returns something of type XML. The output from the function can then be used as input to another function taking XML arguments, or it can be cast to an equivalent SQL data type. For more details, see "DB2 Version 9.1. for z/OS, XML Guide" (see Resources).

The XPath expression used in SELXML is shown in Listing 3:

Listing 3. XPath expression used in SELXML
declare default element namespace 
  'urn:oasis:names:specification:ubl:schema:xsd:Invoice-2';
declare namespace cbc=
  'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2';
declare namespace cac=
  'urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2';
  /Invoice[cac:PaymentMeans/cbc:PaymentDueDate=$PAYMENTDATE]
      /cac:LegalMonetaryTotal
      /cbc:PayableAmount

The first part of the expression contains the namespace declarations (in this case, UBL 2.0 Invoice) and some auxiliary namespaces; these are used to ensure uniqueness of all element names used.

The second part of the XPath expression traverses the XML tree through the nodes Invoice and LegalMonetaryTotal to PayableAmount, which is returned if the Boolean expression in the square brackets is true. The Boolean expression says that the PaymentDueDate, which must be a child node of PaymentMeans, which must be child node of Invoice, must equal $PAYMENTDATE, which is a value supplied by a host variable.

Put simply, it returns the payable amount if the payment due date matches the one supplied in the host variable. This XPath expression is used with other SQL in the stored procedure to return the sum of the payable amount due on the date supplied as input.

SHREDXML

The stored procedure SHREDXML shreds an XML document to relational data using the DB2 function XMLTABLE, which specifies an XPath to navigate the XML document and returns the result in the form of a table. This could be used for integration with relational data.

XMLTABLE expressions can be used in many different ways. The SHREDXML is built from the following three components:

  • Namespace declarations for UBL 2.0 Invoice and auxiliary namespaces.
  • A context item, which is an XPath expression used to point to a node that will be the starting point for the column expressions, namely AccountingCustomerParty.
  • Column definitions for desired columns of the table. These include name, data type, and an XPath expression defining the column content using the context item as starting point. The definitions for columns include ACCOUNT_ID, NAME, CONT_PERSON, PHONE, and EMAIL.

Doing so creates a relational view on the UBL 2.0 Invoice documents that gives contact details for customers.

The table UBLADMIN.CUSTOMER in the supplied DDL matches the format of the output from this stored procedure.

TAGXML

The stored procedure TAGXML composes an XML document from relational data using XMLELEMENT and other similar DB2 functions known as publishing functions. The publishing functions construct XML values from SQL values or other XML values. Table 1 shows some of the most common publishing functions:

Table 1. Common publishing functions
NameDescription
XMLDOCUMENTReturns an XML value with a single document node with zero or more children nodes.
XMLNAMESPACESConstructs namespace declarations from the arguments.
XMLELEMENTReturns an XML value that is an XML element node.
XMLATTRIBUTESConstructs XML attributes from the arguments.
XMLCONCATReturns a sequence containing the concatenation of a variable number of XML input arguments.
XMLFORESTReturns an XML value that is a sequence of XML element nodes.

The publishing functions are dual to the XMLTABLE function.

To emphasize this, this articles uses TAGXML to perform the exact opposite function from SHREDXML, (in other words, it composes an XML element with customer contact details from a relational table). The table UBLADMIN.CUSTOMER is the starting point to compose an XML element with the same internal structure as the corresponding subset of an UBL 2.0 invoice.


Deploying the stored procedures

Complete the deployment of the stored procedures in five easy steps:

  1. Create tables in DB2 by running the DDL supplied in Tables.txt (see Download), using SPUFI, for example.
  2. Compile, link-edit, and deploy each of the COBOL load modules, and bind the DBRM using whatever procedure you have in place on your system. The COBOL source code for the eight stored procedures is available in the sub-folder COBOL source together with the file Compiler options.txt that lists recommended compiler options (see Download). Please note that options QUOTE and QUOTESQL are required.
  3. Create the stored procedures by running the DDL supplied in SP.txt (see Download), using SPUFI, for example. This file contains the DDL for all of the stored procedures; if you do not want to implement them all, you may want to remove them from the file before implementation. Before running, perform the following modifications:
    1. Change the WLM environment to the one that runs COBOL stored procedures at your site. (Current name is DSN9WL4K.)
    2. Change the collection id to the one you will be using for the stored procedures. (Current name is UBLADMIN.)
  4. Register the XML schema UBL 2.0 Invoice by running script Invoice.clp through the DB2 command line processor on your Windows machine:
    1. Open the DB2 command line processor from your laptop by running db2cmd from the start menu.
    2. Connect to the DB2 subsystem where you wish to register the XML schema.
    3. Navigate to the Schema folder.
    4. Run the commands in the file Invoice.clp by entering the following command: db2 -f Invoice.clp
  5. Test the stored procedures with the supplied Java program testStoredProcedures.java using the test data supplied in Samples folder (see Download).
    1. Modify the program with data needed to make the connection to DB2 for z/OS. This includes values for the host name (IP address or logical name of the z/OS system), port number, and location name of the DB2 subsystem, user id, and password to be used for the connection.
    2. Open a Windows command prompt and navigate to the COBOL stored procedure samples folder.
    3. Compile the program with the command javac testStoredProcedures.java
    4. Run the program with the command java testStoredProcedure <SPName>, where <SPName> is the name of the stored procedure you want to test. Note that values are hard coded in the test program, so only certain test sequences make sense. Refer to the README.txt file (see Download) or the program for details.

Advanced considerations

Data conversions

In the supplied samples, CLOB is used to pass the XML parameters to and from the stored procedures. Sometimes BLOB is preferred for transmitting XML to inhibit data conversions on entry and exit from stored procedures. Data conversions are costly in terms of CPU usage and can cause loss of data if an interim code page cannot represent some of the characters that appear in the XML. This articles uses CLOB in the supplied samples because it is easier to manipulate variables that are character-based in tools.

In order to prevent unnecessary interim data conversions when using CLOB you can specify CCSID UNICODE for MIXED DATA on the stored procedure parameter declaration. By inhibiting the conversion, the COBOL stored procedure programmer has to be aware that the XML variable contains UTF-8 data, whereas other variables in the COBOL stored procedure are likely to be in EBCDIC.

DB2 for z/OS disregards the internal encoding declaration in the XML document header when CCSID UNICODE for MIXED DATA is specified and treats the content as UTF-8. DB2 for z/OS always stores XML in UTF-8.

Figure 3 illustrates what happens when CCSID UNICODE for MIXED DATA is and is not specified:

Figure 3. Interim code page conversions
Diagram showing interim code page conversions using boxes and lines

Filtering data with XPath

It is possible to use XPath to filter data. The XMLEXISTS is a Boolean function that takes an XPath expression as input and returns FALSE if the XPath expression returns an empty sequence and TRUE otherwise. It can be used in the WHERE clause of an SQL statement to select only the rows that return non-trivial XML values for the XPath expression. The syntax of the XPath expression used with XMLEXISTS is identical to the syntax of the XPath expression used with XMLQUERY.

See the stored procedure DELXML for an example (see Download).

A tip: Beware of binary zeros

A binary zeros (null) character is not part of well-formed XML. Take care that you do not introduce binary zeros or any other illegal character in XML, for example, by using the COBOL INITIALIZE verb improperly, not initializing FILLER, or setting the parameter length or null indicator incorrectly for a DB2 stored procedure parameter containing XML.


Summary

This article is part of a series that illustrates how XML can be stored and manipulated with DB2 on z/OS. It has shown how to create and deploy COBOL stored procedures that store and manipulate XML in DB2 for z/OS. The UBL industry format was used as the basis for the sample data. Other industry formats and custom XML documents can be used.


Download

DescriptionNameSize
Sample Cobol stored proceduresCOBOLSPs.zip142KB

Resources

Learn

Get products and technologies

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Select information in your profile (name, country/region, and company) is displayed to the public and will accompany any content you post. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management, XML
ArticleID=389400
ArticleTitle=Build a pureXML application in DB2 for z/OS, Part 1: Use COBOL stored procedures to store and manipulate data
publish-date=05142009