XML application migration from DB2 for z/OS V8 to DB2 9, Part 2 : Comparing XML functionality in DB2 9 to DB2 V8

Understand the new XML features introduced in DB2 9 and their impact on migration of existing XML applications

Learn about the XML functionality in DB2 9 for z/OS (DB2 9) and compare it with that of DB2 for z/OS V8. This article discusses the new XML features introduced in DB2 9, and goes into the details regarding the impact the new XML support has on migrating existing XML-based applications. This is the second of a three-article series on migrating your XML applications from DB2 for z/OS V8 to DB2 9 for z/OS.

Note: These three articles were originally written for workstation DB2 by Hardeep Singh. They were then modified to be applicable to DB2 for z/OS by John Shelton.

Share:

John Shelton (johnshel@us.ibm.com), Software engineer, IBM Japan

John Shelton is a member of the DB2 XML Extender team.



Hardeep Singh (hardeep@us.ibm.com), Architect DB2 XML tooling, DB2 XML application migration, IBM

Hardeep Singh photoHardeep Singh is a member of the DB2 Native XML team. He is the architect for DB2 XML tooling and is responsible for XML Extender application migration to Viper. He has more than 21 years of industry experience.



10 April 2008

Also available in Chinese

Introduction

XML support in DB2 for z/OS V8 was based on DB2's relational infrastructure. Prior to the DB2 9 release, XML data was either stored intact as a character large object (CLOB) or shredded to relational tables. In contrast, DB2 9 has true native support for XML data. XML is now handled as a new data type and XML data is stored in a parsed tree that is separate from the relational data storage. XML indexing based on an XML pattern has been introduced, along with support for SQL/XML as the language to query and publish XML data. In order to understand the impact of these new XML features on migration, you need to compare the different techniques for storing and querying XML data in DB2 for z/OS V8, to similar or extended XML functionality available in DB2 9. Based on this comparison, this article identifies potential areas of the application code that may need to be changed.

XML support in DB2 for z/OS V8

DB2 V8 provides XML functionality to the underlying relational data using SQL/XML and the DB2 XML Extender. The DB2 XML Extender provides a number of user-defined types (UDTs), user-defined functions (UDFs), and stored procedures to enable XML support. XML documents are stored in a relational form and the XML Extender infrastructure is used to provide an XML feel to the relational data. A subset of SQL/XML functions are also supported for publishing relational data as XML. DB2 also allows the creation of user-defined Java™ functions and stored procedures that can incorporate XML and XSL parsers (installed as apart of DB2 V8) into the code.

XML support in DB2 9 for z/OS

DB2 9 support for native XML storage removes the need to shred XML into relational tables for querying and publishing purposes. The new storage retains XML in a parsed, annotated tree form that's similar to XML document object model (DOM). For applications that still need to shred XML data into relational tables, enhanced shredding capability using annotated XML schema mapping has been added to this release.

Table 1 compares XML features available in DB2 V8 with those available in DB2 9.

Table 1. XML feature comparison - DB2 V8 and DB2 9
FeatureDB2 for z/OS V8DB2 9 for z/OS

Storage and indexing

XML data is stored in two forms:

Intact document:

  • XML Extender column of type XMLCLOB, XMLVARCHAR, XMLFILE.
  • Column of type BLOB, CLOB or VARCHAR.
  • Indirect indexes on the XML document are created in relational side tables.
  • The document is parsed to update the side tables created.

Shred to relational XML Extenders (XML Collection):

  • Decomposed to existing relational tables using document access definition (DAD) mapping.
  • Only leaf nodes in the incoming XML can be shredded and mapped to SQL column types.
  • Namespaces are not supported.
  • Limited capabilities prevent it from handling complex XML documents.

XML data is stored in two forms:

Intact document-Native XML store.

  • A totally new hierarchical (native) storage built from the ground up. The storage retains the arbitrary tree form of the document.
  • Column of type XML.
  • XML indexes using path expressions on leaf nodes.
  • The document is parsed at insert.

Shred to relational SQL/XML functions and annotated schema mapping.

  • Decomposed to existing relational tables using annotated XML schema mapping.
  • XML fragments can also be shred allowing mapping to XML column type.
  • Handles namespaces.
  • Extended capabilities, like DB2 expression and conditions, can be used to filter the XML data.

Validation

  • Implicit validation defined in the DAD.
  • Explicit validation using XML Extender UDF svalidate.
  • Document type definitions (DTDs) registered and stored in internal tables.
  • XML schemas picked from the file system.
  • No implicit validation is associated with a column.
  • Explicit validation using the SQL/XML function DSN_XMLVALIDATE.
  • No DTD validation supported.
  • XML schemas are registered in the XML schema repository (XSR), and stored in the database.

Query and publish

Querying and publishing is done using SQL/XML and XML Extender functions.

For intact XML documents.

  • Indexed side tables are used to query the document that is retrieved as a CLOB.
  • Subdocument queries can be made using XML Extender functions.

XSLT transformations can be made using Extender functions.

For XML data shredded to relational.

  • SQL/XML publishing functions.
  • XML Extender DAD mapping and functions.

Querying and publishing is done using SQL/XML.

For intact XML documents.

  • SQL/XML can be used to retrieve and publish relational and XML data.
  • Joins across multiple XML columns are possible.

XSLT support using XML Extender XSLT functions.

For XML data shredded to relational.

  • SQL/XML publishing functions.

Updates

For intact XML documents.

  • Full document update using the SQL update statement.
  • Updated using XML Extender UDF update.

For intact XML documents.

  • Full document update using the SQL update statement.
  • No support for subdocument updates.

Enable database for XML

Need to enable XML Extender functionality for the database.

No need to enable the database, since XML support is now an intrinsic part of the DB2 engine.

Tools

No integrated support for XML in the DB2 Development Center or Control Center.

WebSphere Studio Application Developer

  • XML Schema Editor
  • DTD Editor
  • DAD Mapping tool
  • No support for SQL XML in the query builder.
  • Xerces and Xalan parsers
  • No support in the Java Database Connectivity (JDBC) driver.

XML support integrated into all aspects of the tools.

DB2 Control Center and DB2 Developers Workbench (Eclipse based).

  • XML Editor
  • XML Schema Editor
  • Mapping tool for annotated XML schema mapping
  • Support for SQL/XML functions in query builder
  • XQuery builder - builds full XQuery expressions which my be incompatible with z/OS DB2
  • XML Index builder
  • Xerces and Xalan parsers
  • Support in JDBC driver
  • Visual Studio .NET- added support for DB2 XML

DB2 version

DB2 V7 and DB2 V8

DB2 z/OS V7 and later (V8)

iSeries V5R1 and later (V5R3)

DB2 9

AIX, NT, LINUX, SOLARIS, DB2 9 for z/OS

XML features introduced in DB2 9

Before you dive into a step-by-step migration of a sample application, take a brief look at DB2 9 for z/OS XML support.

Storage of XML data

While interacting with DB2's native XML support, the abstraction shown in Figure 1 is a column of type XML in a relational table. Any well-formed XML statement can be inserted into that column. In the physical storage layer though, the primary storage unit is a record of nodes.

For tables that have a mix of relational and XML columns, the relational columns are stored in relational format and XML columns are stored as a parsed tree.

Figure 1. Interacting with DB2's native XML support
Interacting with DB2's native XML support

XML index

The XML index is a new index that can provide efficient evaluation of XML path expressions to improve performance during queries on XML documents.

The XML index uses an XMLPATTERN (an XPath without predicates, only a child axis (/) and a descendent-or-self axis (//) ) to index paths and values in XML documents stored in a single XML column. When creating an index, its possible to specify what paths to index and what type. Any nodes that match the path expression, or the set of path expressions, in XML stored in that column are indexed. The index points directly to the node in storage that is linked to its parent and children for fast navigation.

create index idx3 on dept(deptdoc) generate key using
xmlpattern '/dept/employee/name' as sql varchar(35);

XML schema repository

The XSR is a repository for all XML artifacts used to validate and process XML instance documents stored in XML columns. The XSR removes the additional overhead required to locate external documents, along with the possible performance impact. Each DB2 database system contains an XML schema repository that resides in tables and some system defined stored procedures (for registering) to enter data into these tables. XML schemas, DTDs and external entities must be explicitly registered in the XSR before use.

Annotated schema mapping

DB2 9 provides a new shred mechanism using annotated XML schemas. Instead of creating a separate mapping file (DAD) the XML schemas are annotated with the relational mapping information. These annotated schemas are then registered to the XSR. They are used at runtime to decompose XML documents being inserted to the column. Data types for decomposed XML values are standard relational types. The shredded relational columns can be indexed using relational indexing. For more information, refer to "From DAD to annotated XML schema decomposition" (developerWorks, April 2006).

Figure 2 illustrates an example of an annotated schema mapping. It shows how an entire XML document can be stored in the same record along with data that has been extracted from the XML document.

Figure 2. Annotated schema mapping
Annotated schema mapping

The intact XML document is stored in an XMLPO column (type XML) along with the extracted orderID, extracted to the column ORDERID (VARCHAR), and orderDate, extracted to the column ORDERDATE (DATE).

SQL/XML functions

XML publishing functions

Document construction is the process of composing an XML value out of its small parts like elements and attributes. SQL/XML provides several functions for this process that enable construction of XML documents from relational data.

The following SQL/XML functions for publishing existed in DB2 V8

  • XMLAGG: Aggregates XML values provided as input and returns the concatenation of all non-null input XML values.
  • XMLATTRIBUTES: Generates attributes for the XML element being constructed.
  • XMLCONCAT: Concatenates non-null input expressions of type XML to return a sequence of XML values
  • XMLELEMENT: Constructs an XML element based on input arguments.
  • XMLFOREST: Constructs a forest of XML elements from input arguments.
  • XMLNAMESPACES: Constructs namespace declarations for elements generated by XMLELEMENT or XMLFOREST.
  • XML2CLOB and XMLSERIALIZE: XML serialization functions convert the XML value into a string value.
Figure 3. SQL/XML publishing functions in DB2 V8
SQL/XML publishing functions in DB2 V8

DB2 9 supports all the SQL/XML function in DB2 V8 (plus accepting or returning a new XML type).

The following new SQL/XML publishing functions are also supported:

  • XMLCOMMENT: Generates an XML comment node.
  • XMLPI: Generates an XML processing instruction node.
  • XMLTEXT: Generates an XML text node.
  • XMLDOCUMENT: Generates an XML document node.

The following are new SQL/XML functions for IN/OUT and validation:

  • XMLPARSE: Parses character/binary data, and produces XML value.
  • DSN_XMLVALIDATE: Validates the XML value against an XML schema and type, and annotates the XML value.
  • IS VALIDATED: Checks if the XML value has been validated.

XMLPARSE

A string value that represents an XML document can be transformed to a DB2 XML type using the SQL/XML function XMLPARSE. The function takes a serialized XML document and converts it into an instance of the XML type. XMLPARSE is used when the default parsing options need to be changed.

insert into customer (cid, info) values (?,XMLPARSE(
DOCUMENT cast(? as BLOB(2M)) preserve whitespace ))

DSN_XMLValidate

The DSN_XMLValidate function takes an XML value and validates it against an XML schema, returning an XML value augmented with the default values and type annotations that are defined in the XML schema. There is no column level XML schema validation for a native XML column. Validation has to be explicitly added to each insert or update statement. This concept allows you to save XML documents from different XML schemas in the same column.

Note: Validation against DTDs is not supported.

insert into purchaseOrders (pid,porders)
values (2, XMLPARSE(DOCUMENT DSN_XMLVALIDATE(?, 'SYSXSR.PURORD')))

XMLSERIALIZE

XML values are processed in an internal representation that is not a string and not directly comparable to string values. An XML value can be transformed into a serialized string value representing the XML document using the XMLSERIALIZE function. This function is quite useful in environments where the DB2 driver does not support XML type.

The XMLSERIALIZE function returns a serialized XML value of the specified data type generated from the XML expression argument. The result has the data type specified by the user. If the result of the XML expression is null, the result is the null value. The serialized XML value that is returned must be smaller than 2GB. The serialized result is encoded with UTF-8:

Listing 1. Serialized result
SELECT XMLSERIALIZE ( CONTENT
XMLELEMENT ( NAME "customer",
XMLATTRIBUTES (e.cid AS "id"),
XMLELEMENT (NAME "name", e.name),
XMLELEMENT (NAME "email", e.email),
XMLELEMENT (NAME "zip", e.zip))
AS CLOB(7200)) AS CINFO FROM customer e where cid=1000

The following are new SQL/XML functions and predicates for querying XML data:

  • XMLEXISTS: Determines if an XPath returns a sequence of one or more items.
  • XMLQUERY: Executes an XPath, and returns the result sequence.

XMLQUERY

XMLQUERY is a SQL/XML publishing (pseudo) function whose purpose is to evaluate an XPath expression and return an XML value, that might be an XPath sequence of zero or more items, to the application.

Figure 4. XMLQUERY query
XMLQUERY query

The query shown in Figure 4 returns only some elements of the XML data stored in the DESCRIPTION column for every row in the XMLPRODUCT table.

XMLEXISTS

This XMLEXISTS predicate queries a SQL/XML value using an XPath expression. If the query returns a sequence that is not empty, the function returns the true condition, otherwise it returns false.

Figure 5. XMLEXISTS query
XMLQUERY query

Impact of DB2 9 XML support on XML based applications

An application’s design is influenced by the storage model for the XML data (intact or shredded). The storage model is influenced by functionality limitations on storing, querying, and modifying the XML data and any performance requirements. Therefore, with the introduction of new functionality, the XML storage model can change since many of the limitations of the DB2 XML Extenders have been overcome.

The rest of the article describes, in detail, the difference in DB2s capabilities of handling XML data. How to migrate existing applications to DB2 9 in order to use the new feature set is also reviewed.

Set up the database

Right up front you may notice that some of the steps needed for setting up a database for XML support have changed.

Table 3. Database set up differences between DB2 V8 and DB2 9 using new XML features
FeatureDB2 V8DB2 9

Enable server for XML

XML Extender support needs to be enabled per server by giving the dxxadm enable_server command.

XML type support is intrinsic to DB2

Install any DTD or XML schema needed for validation.

All XML schemas used for validation are located on the file system and not registered.

Any DTDs used for validation need to be registered and stored in a special DTD_REF table in the database.

Any XML schemas used for validation needs to be registered in the XSR.

DTDs cannot be used for validation so in most cases the DTD has to be changed to an XML schema.

Create XML to relational mapping

XML to relational mapping is required for both intact storage (XML stored in a single column as an XML CLOB) and decomposed storage [data inside the XML shredded to relational tables]).

A proprietary form of mapping called DAD is used.

XML to relational mapping is only required for decomposed storage. A more standard and efficient technique of mapping called Annotated XML Schema Mapping is used.

Register and enable the mapping

XML columns that store intact XML need to be enabled using the DAD mapping in order to create the triggers necessary to update the side tables associated with the column. The command for enabling the XML column is: dxxadm enable_column.

DAD mapping, which defines the decomposition of an XML document to a collection of relational tables (XML collection), can be registered and enabled for use in an XML Extender shredding function. The command to enable the XML Collection is: dxxadm enable_collection.

Columns of type XML, that store native XML data, do not have any mappings associated with them and do not need to be enabled for storage.

Annotated XML schemas that define the decomposition of an XML document to relational tables need to be registered in the XSR and enabled for decomposition. This can be done either at registration time:

COMPLETE XML SCHEMA PRODSCHEMA ENABLE DECOMPSITION

Or enabled after the XML schema has been registered:

ALTER XSROBJECT PRODSCHEMA ENABLE DECOMPOSITION

Create indexes on the XML documents

For intact documents in XML columns, indirect indexing can be done by shredding parts of the data from the XML document into secondary (side) relational tables, and then indexing these tables to do fast searches on the shredded data.

Note: These mappings, in effect, define sort of a schema on the XML column, and if the shape of the incoming document changes, the mapping information and the side tables need to change.

XML documents stored natively in XML columns can be indexed using XML indexes. These indexes directly locate the stored node in the parsed tree instead of the document. This XML index creation is a new concept that has been introduced in DB2 9

Note: Since indexes are not tied to any schemas, or mapping if the schema changes new indexes, it can be defined without having to drop existing indexes.

In order to understand the migration process, first look at the some basic steps for creating a database for storing XML data in DB2 V8 and DB2 9.

  1. Register DTD/XML schemas

    DB2 V8. Only DTDs can be registered.

    insert into db2xml.dtd_ref (dtdid, content, usage_count, author,
    creator, updator) values ( 'customerdtd', db2xml.XMLCLOBFromFile(
    '/home2/hardeep/migrate/customer.dtd'), 0, 'xml', 'xml', 'xml');

    DB2 9. Only XML schemas are used for validation:

    Listing 2. Registration of XML schemas for validation
    CALL SYSPROC.XSR_REGISTER(
      ‘SYSXSR’,
      ‘POschema’,
      ‘http://myPOschema/PO.xsd’,
      :content_host_var,
      :docproperty_hos_var)
    
    CALL SYSPROC.XSR_COMPLETE(
      ‘SYSXSR’,
      ‘POschema’,
      :schemaproperty_host_var,
      0)
  2. Create a table with a column for storing XML data.

    DB2 V8. Using XML Extender data type XMLCLOB.

    CREATE TABLE PurchaseOrder ( POid INTEGER NOT NULL PRIMARY KEY,
    Info DB2XML.XMLCLOB)

    Note: Column can be of type XMLCLOB, XMLVARCHAR, and XML FILE.

    DB2 9. Column of type XML.

    CREATE TABLE PurchaseOrder ( POid INTEGER NOT NULL PRIMARY KEY,
    Info XML)

    Note: The XML type is a standard SQL type unlike XMLCLOB, XMLVARCHAR, and XML FILE.

  3. Create indexes on the XML data.

    DB2 Version 8. Create DAD mapping and enables the XML column to create indirect indexes on the side tables.

    dxxadm enable_column –a V81A purchaseorder info
    "/home2/hardeep/migrate/po.dad" -r poid

    DB2 9. Create XML indexes using an xmlpattern.

    create unique index order_key on purchaseorder (info) generate key
    using xmlpattern '/purchaseOrder/@poid' as sql double;
  4. Create Mappings for shredding incoming XML to relational

    DB2 V8. Create DAD mapping and enable the XML collection.

    dxxadm enable_collection –a V81A customerinfo
    /home2/hardeep/migrate/customer.dad

    DB2 9. Register the annotated XML schema and enable it for decomposition:

    Listing 3. Register the annotated XML schema and enable it for decomposition
    CALL SYSPROC.XSR_REGISTER(
      ‘SYSXSR’,
      ‘POschema’,
      ‘http://myPOschema/PO.xsd’,
      :content_host_var,
      :docproperty_hos_var)
    
    CALL SYSPROC.XSR_COMPLETE(
      ‘SYSXSR’,
      ‘POschema’,
      :schemaproperty_host_var,
      1)

Migrating an existing XML database from DB2 V8 to DB2 9

Now that you have some idea about the difference in the way the database is setup for XML support you can start looking at the basic steps needed to migrate an existing XML database from DB2 V8 to DB2 9.

First, you need to unload all of your tables containing XML data and load the data into new tables with XML data type columns. This is a manual operation. This article only discusses the migration process of DB2 objects that relate to XML data and leave the rest for you to look up in the DB2 migration documentation.

Migrating DTDs and XML schemas

Since DB2 9 does not use DTDs for XML validation it is necessary to convert any DTDs used by the current XML application to XML schemas. In order to do this you first need to load the DTDs from your existing DB2 V8 database (DTD_REF table) to the local file system.

select db2xml.XMLFileFromCLOB(db2xml.clob(content),
'/home2/hardeep/migrate/customer.dtd') from db2xml.dtd_ref where DTDID='customerdtd';

Note: Since XML schemas are not saved in the database, you can get them from the file system.

Once you have copied the DTD from the DTD_REF table to your file system, you need to convert it to an XML schema. There are a number of free Web-based utilities for XML that can help you with the conversions between DTDS and XML schemas (search using the keywords: DTD Schema convert).

Register the XML schemas

In order to use the XML schemas to validate XML documents, you first need to register them in the XML schema repository:

Listing 4. Register XML schemas in the XML schema repository
CALL SYSPROC.XSR_REGISTER(
  ‘SYSXSR’,
  ‘POschema’,
  ‘http://myPOschema/PO.xsd’,
  :content_host_var,
  :docproperty_hos_var)

CALL SYSPROC.XSR_COMPLETE(
  ‘SYSXSR’,
  ‘POschema’,
  :schemaproperty_host_var,
  1)

Migrate tables with XML data to the new database

In DB2 V8, for storing intact XML documents, the column could have been of type CLOB, BLOB, VARCHAR or XML Extender type DB2XML.XMLCLOB, DB2XML.XMLVARCHAR, or DB2XML.XMLFILE. For DB2 9 the column type is XML.

Therefore, a table that was created (in DB2 V8) with a CLOB column for storing XML data:

CREATE TABLE Product(Pid VARCHAR(10) NOT NULL PRIMARY KEY,
Name VARCHAR(128), Category VARCHAR(32),Price DECIMAL(30,2), Info CLOB,
row_id rowid not null generated always);

Is now created with an XML column for storing XML data:

CREATE TABLE Product(Pid VARCHAR(10) NOT NULL PRIMARY KEY,
Name VARCHAR(128), Category VARCHAR(32),
Price DECIMAL(30,2), Info XML);

Note: In the above table, if the data in the NAME, CATEGORY, and PRICE columns was extracted from the XML document, for creating relational indexes, then the new table does not need these columns at all since now you can index into the stored XML document itself. The schema for the table could be simplified to only two columns.

CREATE TABLE Product(Pid VARCHAR(10) NOT NULL PRIMARY KEY, Info XML);

A table that was created (in DB2 V8) with a DB2XML.XMLCLOB column for storing XML data:

CREATE TABLE PurchaseOrder(Poid BIGINT NOT NULL PRIMARY KEY,
Status VARCHAR(10) NOT NULL WITH DEFAULT 'New',
Info DB2XML.XMLCLOB NOT NULL, row_id rowid not null generated always);

Is now created with an XML column for storing XML data:

CREATE TABLE PurchaseOrder(Poid BIGINT NOT NULL PRIMARY KEY,
Status VARCHAR(10) NOT NULL WITH DEFAULT 'New', Info XML NOT NULL);

Import XML data from DB2 V8 to DB2 9

Once you have created all your tables in the new database, the next step is to export data out of the old database (DB2 V8) tables and import it into the new database (DB2 9) tables. In order to better understand this process, look at how the LOAD utility for DB2 9 handles XML data.

The DB2 LOAD utility provides a simple way for you to populate your XML columns with data from XML files. You can use the LOAD utility to load XML data. The steps for loading XML data are similar to the steps for loading other types of data, except that you need to also perform the following actions:

In the input data set:

- If the data set is in delimited format, ensure that the XML input fields follow the standard LOAD utility delimited format.

- If the data set is not in delimited format, specify the XML input fields similar to the way that you specify VARCHAR input. Specify the length of the field in a 2-byte binary field that precedes the data.

In the LOAD statement:

- Specify the keyword XML for all input fields of type XML.

- If you want the whitespace to be preserved in the XML data, also specify the keywords PRESERVE WHITESPACE. By default, LOAD strips the whitespace.

UNLOAD XML data from DB2 V8

So for XML data stored in columns of type CLOB, BLOB, VARCHAR, DB2XML.XMLCLOB, DB2XML.XMLVARCHAR, or DB2XML.XMLFILE, the first step is to UNLOAD the tablespace:

Listing 5. UNLOAD the tablespace
//UNLOAD    JOB ......
//*
//JOBLIB   DD  DISP=SHR,DSN=DB2A.SDSNLOAD
//*
//PRODUNLD EXEC DSNUPROC,UID='PRODUNLD',UTPROC='',SYSTEM='V81A'
//SYSREC   DD DSN=JRS.PRODUNLD.SYSREC,DISP=OLD
//SYSPUNCH DD DSN=JRS.PRODUNLD.SYSPUNCH,DISP=OLD,
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  TEMPLATE LOBFRV DSN 'JRS.PRODUNLD.RESUME'
                  DSNTYPE(PDS) UNIT(SYSDA)
  UNLOAD TABLESPACE JRS.PRODTS NOPAD
    FROM TABLE JRS.PRODUCT
      (PID      VARCHAR(10),
       NAME     VARCHAR(128),
       CATEGORY VARCHAR(32),
       PRICE    DECIMAL(30,2,),
       INFO     CLOB TRUNCATE)
//*

Note: If you were going to drop the name, category, and price columns from your new table, you would modify the UNLOAD TABLESPACE to only select the pid and info columns.

LOAD XML data into DB2 9

The unloaded data can now be loaded to the table created in the new database using the LOAD utility:

Listing 6. LOAD unloaded data to the created table
//LOADPROD EXEC DSNUPROC,UID='PRODLOAD',UTPROC='',SYSTEM='V91A'
//SYSREC   DD DSN=JRS.PRODUNLD.SYSREC,DISP=OLD
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
LOAD DATA INDDN SYSREC   LOG NO  RESUME YES
 EBCDIC  CCSID(00037,00000,00000)
 INTO TABLE "JRS"."PRODUCT"
 WHEN(00001:00002) = X'0009'
 IGNOREFIELDS YES
 ( "PID"
        POSITION(  00003:0014) VARCHAR
 , "NAME"
        POSITION(  00003:0135) VARCHAR
 , "CATEGORY"
        POSITION(  00136:0172) VARCHAR
 , "PRICE"
        POSITION(  00173:0188) DECIMAL(30,2)
 , "INFO"
        POSITION(  00189:00445) XML PRESERVE WHITESPACE
 )

Note: For XML data shredded to relational columns there is no need to migrate the data.

Migrate indexes on the XML data

In DB2 V8 XML, data stored in XML columns can have indirect indexes using relational side tables. The mapping information that associated values in the side with documents in the XML column is defined in the DAD files.

With DB2 9's ability to create XML indexes over paths inside XML documents, the need for creating indirect relational indexes using side tables is over. Some of the key parameters that are used to create the XML index are similar to the parameters in the DAD mapping. The values from these parameters (in DAD mapping) can be used in creating the XML indexes. Analyze an example of a DAD mapping used to create side table for a purchase order column of type DB2XML.XMLCLOB:

Listing 7. Example of DAD mapping
<DAD>
 <Xcolumn >
<table name="Shipto">
<column name="name" type="varchar(128)"
path="/purchaseOrder/shipTo/name" multi_occurrence="NO"/>
</table>

The parameters of interest in the mapping are: //column/@path and //column/@type

The comparative XML indexes in DB2 9 are:

create index order_name on purchaseorder (info) generate key using
xmlpattern '/purchaseOrder/ shipTo/name' as SQL VARCHAR(128);

The //column/@path value in the DAD is the same as the xmlpattern in the XML index and the //column/@type in the DAD is the same as the SQL type.

Also of interest is the root ID that is passed to the enable_column call in DB2 V8 when an XML column is enabled with the DAD file.

dxxadm enable_column -a V81A migrate purchaseorder info
"/home2/hardeep/migrate/po.dad" -r poid

This root ID identifies the name of the single primary key in the table that is added to the side tables and is used to tie the side tables to the table with the XML column. This value of the root ID is unique across the XML column. This root ID corresponds to the path value in the XML document that uniquely defines that document in the XML column.

So when you create the XML indexes for DB2 9, you can create a unique index on the path that defined the root ID (primary key), in DB2 V8, if it exists in the XML document.

create unique index order_key on purchaseorder (info) generate key using
xmlpattern '/purchaseOrder/@poid' as sql double;

Migrate mapping files for shredding XML to relational

If your application needs to shred incoming XML data to relational tables, and you feel that this need still exists, then you need to replace your existing DAD based mapping with annotated XML schema based mappings.

Table 4. Difference between XML collection and annotated XML schema
AreaXML collectionNews shred

Mapping files

DAD

Annotated schema

Functionality and performance

Limited

Extended functionality and better performance

DTD/schema

Not required. It May or may not be registered.

Required. It has to be registered.

Note: In both type of mappings the relational tables must pre-exist.

Since mapping information about relationships between the XML document and the relational schema is stored in the DAD files, these DAD files can be used to understand the annotations required for the XML schemas. For a tool that can automate the process of converting your DAD files to an annotated XML schema, refer to "From DAD to annotated XML schema decomposition."

Once the XML schemas have been annotated, they can be registered to the XSR and enabled for decomposition:

Listing 8. Register XML schemas to XSR
CALL SYSPROC.XSR_REGISTER(
  ‘SYSXSR’,
  ‘POschema’,
  ‘http://myPOschema/PO.xsd’,
  :content_host_var,
  :docproperty_hos_var)

CALL SYSPROC.XSR_COMPLETE(
  ‘SYSXSR’,
  ‘POschema’,
  :schemaproperty_host_var,
  1)

Migrate DB2 queries

DB2 9 can query both relational and XML data simultaneously using SQL/XML.

As some of the original reasons for shredding the XML documents (fast search using relational columns) are no longer valid, the relational schema needed to store the XML data can become much simpler. As a consequence, a part of the application code that dealt with manipulating the XML data can now be replaced by a DB2 query that could provide the same functionality inside the database server using SQL/XML. The impact of this on applications built on top of XML data is phenomenal:

  • Business logic embedded in SQL queries becomes independent of the application environment. Also, business logic written using SQL/XML can be saved in views or stored procedures for reuse.
  • Middle tiers become leaner and easier to manage.
  • Simplifies service oriented architecture (SOA) with the database becoming a data and service provider.
  • Data is moved out of the database only after it has been fully massaged reducing traffic between database and application.

Some of the application queries that are affected are shown in Table 5:

Table 5. Affected application queries
Application queryDB2 Version 8DB2 9

Insert

Insert calls to store XML documents in DB2XML columns were similar to insert calls for storing CLOBs:

insert into customer (info) values (?)

XML documents that needed to be shredded to relational tables used XML Extender shred functions:

dxxInsertXML,dxxShredXML

The insert calls remain the same unless the documents need to be validated. XML schema validation call needs to be explicitly added to the insert statement.

insert into customer (info) values (XMLPARSE(DOCUMENT DSN_XMLValidate(?,?)))

XML documents that need to be shredded to relational tables use either the annotated XML schema decompose function.

Update

For XML documents that were stored intact (such as a column of type XMLClob), the subdocument update could be done using the XML Extender UDF db2xml.update.

The UDF also synced up the update with the corresponding side tables.

For XML documents that are stored intact (such as a column of type XML) no subdocument update function is available out of the box.

You can find a Java stored procedure for doing subdocument updates in the first article in this migration series, "Partial updates to XML documents in DB2 9 for z/OS."

DB2XMLFUNCTIONS.XMLUPDATE

Query

For an XML document stored in XML Extender columns with side tables, there are a number of XML Extender extraction functions can be used to get subdocument data.

For an XML document stored in XML columns, SQL/XML functions can be used to query into the XML documents. There are a number of SQL/XML functions that support XPath:

  • XMLQUERY
  • XMLEXISTS

Publish

An XML document stored intact could be transformed using XSLT UDF or serialized using XMLSERALIZE, XML2CLOB.

Shredded XML data could be published as XML using either DAD mapping in conjunction with XML Extender publish UDFs, or SQL/XML publishing functions.

An XML document stored intact in XML columns can be published and transformed using SQL/XML or serialized to a SQL string using XMLSERIALIZE

For shredded XML data you need to use SQL/XML publishing functions in the query.

Inserting XML data to DB2

Intact XML stored in a single columns

If XML data was stored intact in the existing application, then there should not be a major impact on the application code and the DB2 queries dealing with the XML data insertion.

DB2 V8 Storing data into a CLOB, VARCHAR, XMLCLOB, or XMLVARCHAR:

insert into customer (info) values (?)

DB2 9. Column of type XML:

The code would remain the same unless you needed to parse or validate the data.

insert into customer (info) values (?)

In case you need to validate the data against a registered XML schema, you would add the DSN_XMLVALIDATE to the insert statement.

insert into customer (info) values (DSN_XMLValidate(?,?))

Note: In XML Extender, validation can be stated implicitly in the DAD mapping for the side tables.

If you need to override the default parsing of the XML document, you can specify the parsing options using the XMLPARSE in the insert statement.

insert into customer (info) values (XMLPARSE(DOCUMENT
cast(? as BLOB(2M)) preserve whitespace ))

Note: The XML data that was stored intact could also have been shredded and mapped to relational tables for querying purposes. This could have been accomplished using DAD mapping for XML column side tables, or it could have been done in the application code. Since, in DB2 9, you can directly index and search into nodes inside the XML document, there is no need to shred the XML to relational for the sole purpose of querying. Application code dealing with the shredding could be redundant.

XML shredded to relational tables

For XML documents where the need to shred exists, the queries and application code dealing with the data insertion and shredding changes slightly.

DB2 V8. XML documents shred using XML Extender shred functions dxxInsertXML and dxxShredXML.

Call dxxShredXML (?, ?,?,? );
Call dxxInsertXML(customerdad, ?,?,? );

DB2 9. XML documents shred using annotated XML schema mapping.

Call xdbDecompXML (NULL,migrate.customer,?,?,1,NULL,NULL,NULL)

If the shredding is being done inside the application using a DOM, you don't need to change anything unless you want to migrate the shredding to annotated XML schemas.

Update XML documents stored in DB2

DB2 V8 XML Extender provides a UDF to allow partial updates to XML documents in DB2. Update takes a location path to specify the element or attribute that is to be updated. It returns a data type similar to the column being updated. The content of every element or attribute in the XML document that matches the location path is updated with the supplied value.

Update purchaseorder set status='shipped', info=db2xml.update(
info,'/purchaseOrder/@status','shipped') where poid=2001;

Since the Update UDF handles one location path at a time, in order to do more extensive changes to the document, it needs to be retrieved to the client application, modified, and then inserted back to the database using a SQL update statement.

Note: The update trigger on XML Extender columns also updates the corresponding side tables, in order to keep the data in the XML column in sync with the data in the side tables.

DB2 9 does not have any functions to do subelement updates to an XML document inside the database. DB2 did not want to introduce short term proprietary functions into the language to support this.

A workaround to this problem is a Java stored procedure that can be found in the article "Partial updates to XML documents in DB2 9 for z/OS." This stored procedure can do partial updates to XML documents inside DB2. The update stored procedure is also capable of deleting, replacing, and appending nodes in the XML document.

Note: Under the covers, the stored procedure does a full document update.

Listing 9. Document update
Call DB2XMLFUNCTIONS.XMLUPDATE (
'<updates><update action="replace" col="1"
path="/purchaseOrder/@status">shipped </update></updates>',
'Select info from purchaseorder where poid=2001',
'update purchaseorder set info=? where poid=2001',?,?);

Note: Since the source code is also published, the stored procedure can be used as is or as a model for writing your own Java-based stored procedures to manipulate XML data inside DB2.

Query and publish XML documents stored in DB2

Application code dealing with querying and publishing is the one that is affected the most. For XML data stored in XML Extender columns, the SQL queries change completely, since they involve the side tables, which are no longer maintained.

The following SQL queries two tables to return the details for products that have thumbnail images:

 Select SQLPRODUCT.DETAILS, from SQLIMAGES,SQLPRODUCT
 where SQLPRODUCT.PID=SQLIMAGES.PID and SQLIMAGES.TYPE='thumbnail'

This could be replaced with a single query over a single column:

 SELECT XMLQUERY('/product/description/images/image/@type = "thumbnail"'
PASSING DESCRIPTION)
FROM XMLPRODUCT

Extract and query XML data inside the XML document

DB2 XML Extender extract UDFs provide a number of scalar and table function to extract data from inside an XML document. Any SQL statement involving extraction of data from an XML document using the DB2 extract UDFs also have to be replaced with equivalent calls to XMLEXIST or XMLQUERY.

select db2xml.extractVarchar(info,'/customerinfo/name') from xmlcustomer

Can be replace with:

Select XMLQuery('/customerinfo/name'
passing info) from xmlcustomer xmlcustomer

Although, this is not exactly the same since XMLQuery will return an XML sequence.

select info from customers where db2xml.extractvarchar(info,'/customerinfo/name')=?

Can be replaced with:

select info from xmlcustomer where XMLExists('
$cinfo/customerinfo[name=$name]' passing info as "cinfo",
cast(? as varchar(32)) as "name")

DB2 XML Extenders provide ExtractCLOB () and ExtractCLOBs () UDFs to extract fragments of an XML document.

select db2xml.extractclob(info,'/customerinfo/addr') as RESULT from
customer where cid = ?

Can be replaced with the XML decomposition stored procedure (XDBDECOMPXML).

Transform and publish XML data

Besides the extraction UDF that can be used to get to subdocument information, DB2 XML extenders also provide an XSLT UDF to transform XML documents inside the database.

The two XSL UDFs defined in DB2 XML Extenders are XSLTransformToClob() and XSLTransformToFile().

SELECT XSLTransformToClob( info, /temp/xslfilename.xsl) FROM xslcustomers
SELECT XSLTransformToClob( info, ?) FROM xslcustomers
Pass the XSL as a CLOB at runtime

Although DB2 9 has the XSL UDF as a part of the XML extender support, there is not any built-in function for XSL transformation (This means that you need to enable the database for XML extenders in order to use the XSLT UDFs.).

Transform and publish relational data as XML

DB2 XML Extender DAD files and stored procedures for collections provide bidirectional mapping. The same mapping can be used for shredding and publishing. Unfortunately, the annotated XML schema functionality does not provide any publishing capabilities. So the only way to publish is to use SQL/XML function.


Considerations

This section contains some of the things you need to consider while migrating. These could be limitations or best practices that could help you in making your decisions when you migrate your XML application. For a comprehensive list, refer to the documentation for DB2 9.

Restriction on XML columns

The following are restrictions on XML columns:

  • An XML column cannot be included as columns of keys, including primary, foreign, and unique keys, sequence keys of clustered index, and distribution or partitioning keys.
  • An XML column cannot be part of any index except an index over XML data.
  • An XML column cannot have a default value specified by the WITH DEFAULT clause; if the column is nullable, the default for the column is NULL.
  • An XML column cannot be used in a materialized query table.
  • An XML column cannot be referenced in CHECK constraints.
  • An XML column cannot be referenced in generated columns.
  • An XML column cannot be referenced in triggers.

XML schemas

The following are considerations regarding XML schemas:

  • Since no XML schema is associated with an XML column, it is technically possible to insert XML documents from diverse schemas in a single column (schema chaos).

Indexes

An XML index can be used for a query predicate only if:

  • The index contains the query predicate. For example, if the index is equally or less restrictive than the predicate. This includes namespaces.
  • The query predicate matches the index data type.
  • /text() is used consistently in predicates and index definitions.

Wherever possible, use fully specified XPaths. Avoid wildcards (*, //), avoid /text().

  • /customerinfo/phone instead of //phone
  • /customerinfo/addr/state instead of /customerinfo/*/state
  • /customerinfo/name instead of /customerinfo/name/text()

Annotated XML schemas

  • Annotations do not contribute to the validation of the instance document.
  • Most of the XML schema components can have annotations.
  • Decomposition only looks at annotations on elements declaration, attribute declaration, and global annotations that belong to a predefined namespace.
  • XML schemas with recursion cannot be used for decomposition. It fails during registration.
  • Referential integrity in the database is not supported.
  • Validation is optional and it might add defaults; changing the document, possibly changing the result of the shredding.
  • Annotated schemas cannot be used for publishing purposes.

XMLPARSE

The biggest issue to avoid is code page conversions that could make the input document encoding different from its encoding declaration, or that could introduce substitution characters (if the target code page cannot represent all input characters).

  • The use of implicit XMLPARSE is highly recommended. The server expects an XML type, and therefore does not do any code page conversions.
    insert into customer (cid, info) values (?,?)
  • The use of explicit XMLPARSE causes the server to expect a character or binary type (based on the cast clause). The use of XMLPARSE, and the cast of a CLOB, cause the document to be codepage converted to the database codepage on the server. This is typically not needed or desirable.
    insert into customer (cid, info) values (?,XMLPARSE(DOCUMENT
    cast(? as CLOB(2M)) preserve whitespace ))
  • If XMLPARSE needs to be used to change the default parsing options, it is recommended to cast the data to a binary type instead of a character type. This avoids codepage conversion issues. Since you are reading the input XML file in streams of raw bytes, there are no codepage conversions on the server or on the Java side. The cast to a BLOB prevents the document to be codepage converted on the server.
    insert into customer (cid, info) values (?,XMLPARSE(DOCUMENT
    cast(? as BLOB(2M)) preserve whitespace ))

DSN_XMLVALIDATE

The following are considerations regarding DSN_XMLVALIDATE:

  • Validation is expensive.
  • Validation against DTDs is not supported.

XMLQUERY

The following are considerations regarding XMLQUERY:

  • While it is not possible to bind runtime predicates in a pure XQuery, it is indirectly possible to apply runtime predicates through the SQL/XML functions or predicates such as XMLQUERY and XMLEXISTS.

XMLEXISTS

The following are considerations regarding XMLEXISTS:

  • Use XMLEXISTS in the where clause rather than XMLQUERY.

  • XMLEXISTS only returns false if the result of the query is an empty sequence (This means that XMLEXISTS(false) returns a true since the result of the query is not an empty sequence.). Therefore, the following query returns all the rows in the customer table because when the customerinfo/name is not Hardeep the query returns a single sequence with a value false, not an empty sequence.
    select c.info from xmlcustomer c where xmlexists('
    $i/customerinfo/name = "Hardeep" ' passing c.info as "i")
  • Use brackets [] in the XMLEXISTS predicates. The previous query would have worked if you had put the name check inside the square brackets, since now for all customerinfo, where name is not Hardeep, an empty sequence is returned from the query and XMLEXISTS returns false if the result is an empty sequence.
    select c.info from xmlcustomer c where xmlexists('
    $i/customerinfo[name = "Hardeep"] ' passing c.info as "i")
  • Combine multiple XML predicates in a single XMLEXISTS. The following two queries are not the same:
    select * from customer where xmlexists('
    $c/customerinfo[name = Hardeep] ‘) and
    xmlexists('$c/customerinfo[phone = 555-555-555] ' passing c.info as “i”)
    
    select * from customer where xmlexists('
    $c/customerinfo[name=Hardeep and phone=555-555-555] ‘ passing c.info as “i”)

    If some document contains two customerinfo elements and one of them has a name=hardeep and phone=111-111-111, while the other has a name=Marja and a phone=555-555-555, then the first query returns this document while the second query does not.


Acknowledgement

Thanks to Matthias Nicola, Henrik Loeser, Ying Chen, and Amyris Rada for their help with this article.

Thanks to Jason Cu and Manogari Simanjuntak for their help with converting the original article to z/OS.

Resources

Learn

Get products and technologies

  • Build your next development project with IBM trial software, available for download directly from developerWorks.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management, XML
ArticleID=300446
ArticleTitle=XML application migration from DB2 for z/OS V8 to DB2 9, Part 2 : Comparing XML functionality in DB2 9 to DB2 V8
publish-date=04102008