This is the first installment of a series that discusses how to get started with XML in IBM® DB2® for z/OS®. The series provides different approaches for creating simple applications that manipulate XML data. The objective is to provide some familiarity with the technology and demonstrate the ease with which XML applications can be developed. The downloadable material may also serve as a starting point for prototyping.
With the popularity of SOA and Web-based technologies, there are increasing amounts of XML data coming in and out of computer systems. The DB2 9 for z/OS support for pureXML® makes it much simpler to store and manipulate XML on System z® through SQL/XML interfaces. This article describes how to create COBOL stored procedures in two categories:
- Basic operations: insert, update, and delete XML with DB2 for z/OS
- Complex operations: select XML specifying an XPath, shred XML into relational tables, and compose XML from relational data
Samples for all the stored procedures are provided in an accompanying bundle (see Download). The article also discusses some more advanced aspects, for example:
- Data conversions when transmitting XML data to and from stored procedures in DB2 for z/OS.
- The use of XPath to filter data in XMLEXISTS
In order to install and run the COBOL samples, you need to have DB2 9 for z/OS and IBM Enterprise COBOL for z/OS 4.1.0.
The download bundle, included in the Download section, contains the following:
- COBOL source code
- A list of recommended compiler options
- DDL for eight sample stored procedures
- DDL for tables used by the stored procedures
- XML schema documents
- A script for registering the schema in DB2
- Sample XML documents
- A Java technology program that demonstrates how to call the stored procedures
- A README.txt file with instructions on how to deploy the COBOL stored procedures
Figure 1 illustrates the download bundle content:
Figure 1. Download bundle content
The stored procedures supplied with this article are based on the UBL (Universal Business Language) 2.0 format. (For more information about UBL, see the link to OASIS Universal Business Language Technical Committee Web page in the Resources section.) Industry bundles for other industry formats with DB2 pureXML containing table definitions, schemas, sample documents, and queries are available for download in the article "Build a DB2 pureXML application in a day" (developerWorks, December 2008). You can read about how the samples can be used as the basis for building prototype applications in the "XML Schema Tutorial" (see Resources).
A few basic XML concepts
This article assumes you have some familiarity with XML; but in order to better demonstrate the purpose of each stored procedure, let's establish a few concepts.
Recall that a well-formed XML document is the textual representation of a tree. The root of the tree is the document itself, and each node corresponds to an element or character data. An element can consist of a sequence of elements, in which case the corresponding tree node has child nodes, or character data, in which case the node is a leaf node. An element can have attributes that also get mapped to child nodes; namespaces are a special type of attribute used to qualify names to avoid name clashes between elements of different XML formats.
Listing 1 illustrates an XML document with (simplified) format UBL 2.0:
Listing 1. XML document as shown in a browser
<Invoice xmlns:cac= "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc= "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"> <cac:AccountingCustomerParty> <cbc:CustomerAssignedAccountID>XFB01</cbc:CustomerAssignedAccountID> <cac:Party> <cac:PartyName> <cbc:Name>IYT Corporation</cbc:Name> </cac:PartyName> <cac:PostalAddress> <cbc:StreetName>Avon Way</cbc:StreetName> <cbc:PostalZone>ZZ99 1ZZ</cbc:PostalZone> <cac:AddressLine> <cbc:Line>3rd Floor, Room 5</cbc:Line> </cac:AddressLine> <cac:Country> <cbc:IdentificationCode>GB</cbc:IdentificationCode> </cac:Country> </cac:PostalAddress> <cac:Party> </cac:AccountingCustomerParty> </Invoice>
Figure 2 shows the same document presented as a tree:
Figure 2. XML document as a tree
In Figure 2, the element nodes are depicted in blue and include:
The attribute nodes are depicted in red and include:
The character data nodes are depicted in black and include:
- IYT Corporation
- Avon Way
- ZZ99 1ZZ
- 3rd Floor, Room 5
Note that all element names are prefixed with one of the namespaces declared as attributes.
The article samples (see Download) use a subset of the XML format UBL 2.0 (Universal Business Language). UBL 2.0 is a standard for electronic commerce developed by OASIS and is quite comprehensive as real-life data usually is. This article uses the subset related to invoices, which requires 11 XML schema documents. For more information on OASIS and UBL 2.0, refer to the OASIS Universal Business Language (UBL) Technical Committee Web page (see Resources).
The database consists of two tables. The main table is UBLADMIN.UBL, which holds the XML data; the other is a small, specialized table used to hold data shredded from the documents. The second table is primarily for demonstration purposes, whereas the first table, simple as it is, might be sufficient for real-life applications.
Listing 2. Columns of table UBLADMIN.UBL
ID INTEGER GENERATED ALWAYS AS IDENTITY COMMENT VARCHAR(1000) DOCUMENT XML
The DDL for the tables is supplied in Tables.txt (included in the available download).
Sample stored procedures
The stored procedures are written in COBOL. They are kept as simple as possible to better illustrate what is actually needed for working with XML in DB2 for z/OS.
The parameter type used for XML documents in all the stored procedures is CLOB. At present, stored procedures and user-defined functions do not have support for XML as a parameter, but they can easily be passed as CLOBs.
This article supplies three basic operations for XML documents: insert, update, and delete. These are general operations and can be used for most applications, even for XML-based applications on a schema format other than UBL 2.0, because there is no application- or format-specific code in the programs themselves. When an XML schema is used, it is passed as a parameter.
For the operations insert and update, this article provides two flavours of the stored procedures: with and without schema validation.
Schema validation is the process of checking that an XML document adheres to the rules and format defined by an XML schema. These rules describe, for example, which elements are valid, the order of elements within a document, the data types of attributes, whether an element is required or optional, and so on. For more information on XML schema, refer to the "XML Schema tutorial" (see Resources).
When inserting or updating without schema validation, the document must be well-formed.
Validation of the documents requires that the schema be registered to DB2. This article supplies the schema for UBL 2.0 Invoice, which consists of 11 individual schema documents. This article also supplies the script Invoice.clp, which registers the schema SYSXSR.UBL_INVOICE_2_0 as consisting of these documents through the DB2 command line processor.
The XML document samples provided in the Samples folder (see Download) are all valid according to this schema.
The stored procedures without schema validation are INSXML, UPDXML, and DELXML, for insert, update, and delete respectively. The stored procedures using schema validation are INSVLXML and UPDVLXML, for insert and update respectively.
Additionally, this article supplies three stored procedures for more complex operations on XML data. These operations are not generally applicable as they are application- and format-specific, and have to be tailored in each case. They could serve as examples that may be transferred to other areas and uses.
Each of the tailored stored procedures uses a different DB2 XML construct. This article briefly goes through these constructs, but refer to the "DB2 Version 9.1. for z/OS, XML Guide" (IBM Corporation, December 2008) for a detailed description (see Resources).
The stored procedure SELXML queries data inside an XML document using XPath, which is a W3C-recommended standard for accessing data in XML documents and has been incorporated into DB2.
XPath provides primitives for moving around the XML tree, for selecting nodes according to name or by the use of a Boolean expression, and for applying functions to a sequence of nodes. It takes an XML document as input and returns a sequence of nodes or the result of applying a function to a sequence of nodes. For more details on XPath 2.0, see "XML Path Language (XPath) 2.0" (W3C, January 2007).
In DB2 an XPath expression is wrapped in a function (XMLQUERY or XMLEXISTS), which is applied to an XML column and returns something of type XML. The output from the function can then be used as input to another function taking XML arguments, or it can be cast to an equivalent SQL data type. For more details, see "DB2 Version 9.1. for z/OS, XML Guide" (see Resources).
The XPath expression used in SELXML is shown in Listing 3:
Listing 3. XPath expression used in SELXML
declare default element namespace 'urn:oasis:names:specification:ubl:schema:xsd:Invoice-2'; declare namespace cbc= 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2'; declare namespace cac= 'urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2'; /Invoice[cac:PaymentMeans/cbc:PaymentDueDate=$PAYMENTDATE] /cac:LegalMonetaryTotal /cbc:PayableAmount
The first part of the expression contains the namespace declarations (in this case, UBL 2.0 Invoice) and some auxiliary namespaces; these are used to ensure uniqueness of all element names used.
The second part of the XPath expression traverses the XML tree through the nodes Invoice and LegalMonetaryTotal to PayableAmount, which is returned if the Boolean expression in the square brackets is true. The Boolean expression says that the PaymentDueDate, which must be a child node of PaymentMeans, which must be child node of Invoice, must equal $PAYMENTDATE, which is a value supplied by a host variable.
Put simply, it returns the payable amount if the payment due date matches the one supplied in the host variable. This XPath expression is used with other SQL in the stored procedure to return the sum of the payable amount due on the date supplied as input.
The stored procedure SHREDXML shreds an XML document to relational data using the DB2 function XMLTABLE, which specifies an XPath to navigate the XML document and returns the result in the form of a table. This could be used for integration with relational data.
XMLTABLE expressions can be used in many different ways. The SHREDXML is built from the following three components:
- Namespace declarations for UBL 2.0 Invoice and auxiliary namespaces.
- A context item, which is an XPath expression used to point to a node that will be the starting point for the column expressions, namely AccountingCustomerParty.
- Column definitions for desired columns of the table. These include name, data type, and an XPath expression defining the column content using the context item as starting point. The definitions for columns include ACCOUNT_ID, NAME, CONT_PERSON, PHONE, and EMAIL.
Doing so creates a relational view on the UBL 2.0 Invoice documents that gives contact details for customers.
The table UBLADMIN.CUSTOMER in the supplied DDL matches the format of the output from this stored procedure.
The stored procedure TAGXML composes an XML document from relational data using XMLELEMENT and other similar DB2 functions known as publishing functions. The publishing functions construct XML values from SQL values or other XML values. Table 1 shows some of the most common publishing functions:
Table 1. Common publishing functions
|XMLDOCUMENT||Returns an XML value with a single document node with zero or more children nodes.|
|XMLNAMESPACES||Constructs namespace declarations from the arguments.|
|XMLELEMENT||Returns an XML value that is an XML element node.|
|XMLATTRIBUTES||Constructs XML attributes from the arguments.|
|XMLCONCAT||Returns a sequence containing the concatenation of a variable number of XML input arguments.|
|XMLFOREST||Returns an XML value that is a sequence of XML element nodes.|
The publishing functions are dual to the XMLTABLE function.
To emphasize this, this articles uses TAGXML to perform the exact opposite function from SHREDXML, (in other words, it composes an XML element with customer contact details from a relational table). The table UBLADMIN.CUSTOMER is the starting point to compose an XML element with the same internal structure as the corresponding subset of an UBL 2.0 invoice.
Deploying the stored procedures
Complete the deployment of the stored procedures in five easy steps:
- Create tables in DB2 by running the DDL supplied in Tables.txt (see Download), using SPUFI, for example.
- Compile, link-edit, and deploy each of the COBOL load modules, and bind the DBRM using whatever procedure you have in place on your system. The COBOL source code for the eight stored procedures is available in the sub-folder COBOL source together with the file Compiler options.txt that lists recommended compiler options (see Download). Please note that options QUOTE and QUOTESQL are required.
- Create the stored procedures by running the DDL supplied in SP.txt
(see Download), using SPUFI, for example. This
file contains the DDL for all of the stored procedures; if you do not
want to implement them all, you may want to remove them from the file
before implementation. Before running, perform the following
- Change the WLM environment to the one that runs COBOL stored procedures at your site. (Current name is DSN9WL4K.)
- Change the collection id to the one you will be using for the stored procedures. (Current name is UBLADMIN.)
- Register the XML schema UBL 2.0 Invoice by running script Invoice.clp
through the DB2 command line processor on your Windows machine:
- Open the DB2 command line processor from your laptop by running db2cmd from the start menu.
- Connect to the DB2 subsystem where you wish to register the XML schema.
- Navigate to the Schema folder.
- Run the commands in the file Invoice.clp by entering the
db2 -f Invoice.clp
- Test the stored procedures with the supplied Java program
testStoredProcedures.java using the test data supplied in Samples
folder (see Download).
- Modify the program with data needed to make the connection to DB2 for z/OS. This includes values for the host name (IP address or logical name of the z/OS system), port number, and location name of the DB2 subsystem, user id, and password to be used for the connection.
- Open a Windows command prompt and navigate to the COBOL stored procedure samples folder.
- Compile the program with the command
- Run the program with the command
java testStoredProcedure <SPName>, where
<SPName>is the name of the stored procedure you want to test. Note that values are hard coded in the test program, so only certain test sequences make sense. Refer to the README.txt file (see Download) or the program for details.
In the supplied samples, CLOB is used to pass the XML parameters to and from the stored procedures. Sometimes BLOB is preferred for transmitting XML to inhibit data conversions on entry and exit from stored procedures. Data conversions are costly in terms of CPU usage and can cause loss of data if an interim code page cannot represent some of the characters that appear in the XML. This articles uses CLOB in the supplied samples because it is easier to manipulate variables that are character-based in tools.
In order to prevent unnecessary interim data conversions when using CLOB you can specify CCSID UNICODE for MIXED DATA on the stored procedure parameter declaration. By inhibiting the conversion, the COBOL stored procedure programmer has to be aware that the XML variable contains UTF-8 data, whereas other variables in the COBOL stored procedure are likely to be in EBCDIC.
DB2 for z/OS disregards the internal encoding declaration in the XML document header when CCSID UNICODE for MIXED DATA is specified and treats the content as UTF-8. DB2 for z/OS always stores XML in UTF-8.
Figure 3 illustrates what happens when CCSID UNICODE for MIXED DATA is and is not specified:
Figure 3. Interim code page conversions
Filtering data with XPath
It is possible to use XPath to filter data. The XMLEXISTS is a Boolean
function that takes an XPath expression as input and returns FALSE if the
XPath expression returns an empty sequence and TRUE otherwise. It can be
used in the
WHERE clause of an SQL statement to
select only the rows that return non-trivial XML values for the XPath
expression. The syntax of the XPath expression used with XMLEXISTS is
identical to the syntax of the XPath expression used with XMLQUERY.
See the stored procedure DELXML for an example (see Download).
A tip: Beware of binary zeros
A binary zeros (null) character is not part of well-formed XML. Take care that you do not introduce binary zeros or any other illegal character in XML, for example, by using the COBOL INITIALIZE verb improperly, not initializing FILLER, or setting the parameter length or null indicator incorrectly for a DB2 stored procedure parameter containing XML.
This article is part of a series that illustrates how XML can be stored and manipulated with DB2 on z/OS. It has shown how to create and deploy COBOL stored procedures that store and manipulate XML in DB2 for z/OS. The UBL industry format was used as the basis for the sample data. Other industry formats and custom XML documents can be used.
|Sample Cobol stored procedures||COBOLSPs.zip||142KB|
- OASIS Universal Business Language (UBL) Technical Committee Web page: Learn more about the Universal Business Language (UBL).
- "XML Path Language (XPath) 2.0" (W3C, January 2007): Learn more about the XPath language.
- "DB2 Version 9.1. for z/OS, XML Guide" (IBM Corporation, December 2008): Find more information about pureXML (DB2 for z/OS support for XML). Learn how to store XML data in DB2 databases and retrieve XML data from DB2 databases.
- Industry Formats and Services with pureXML: In this demonstration, see how XML in particular industry formats can be stored and queried in pureXML databases in a straightforward way without mapping.
- "Build a DB2 pureXML application in a day" (developerWorks, December 2008): Build a complete DB2 application in a day using an end-to-end XML architecture.
- XML Schema tutorial (w3schools.com): Learn how to create XML Schemas, why XML Schemas are more powerful than DTDs, and how to use XML Schema in your application.
- DB2 for z/OS page on developerWorks: Get the resources you need to advance your skills for DB2 for z/OS.
- Information Management and XML page on developerWorks: Get the resources you need to advance your pureXML skills.
- developerWorks Information Management zone: Learn more about Information Management. Find technical documentation, how-to articles, education, downloads, product information, and more.
- Stay current with developerWorks technical events and webcasts.
Get products and technologies
- Download IBM product evaluation versions, or explore the online trials in the IBM SOA Sandbox, and get your hands on application development tools and middleware products from DB2, Lotus®, Rational®, Tivoli®, and WebSphere®.
- Participate in the discussion forum.
- Participate in developerWorks blogs and get involved in the developerWorks community.