 | Level: Intermediate John Shelton (johnshel@us.ibm.com), Software engineer,
IBM
Hardeep Singh (hardeep@us.ibm.com), Architect DB2 XML tooling, DB2 XML application migration, IBM
10 Apr 2008 Learn about the XML functionality in DB2® 9 for
z/OS® (DB2 9) and compare it with that of DB2 for z/OS V8. This article
discusses the new XML features introduced in DB2 9, and goes into the details
regarding the impact the new XML support has on migrating existing XML-based
applications. This is the second of a three-article series on migrating your XML
applications from DB2 for z/OS V8 to DB2 9 for z/OS.
Note: These three articles were originally written for workstation DB2 by
Hardeep Singh. They were then modified to be applicable to DB2 for z/OS by John
Shelton.
Introduction
XML support in DB2 for z/OS V8 was based on DB2's relational infrastructure.
Prior to the DB2 9 release, XML data was either stored intact as a character large
object (CLOB) or shredded to relational tables. In contrast, DB2 9 has true native
support for XML data. XML is now handled as a new data type and XML data is stored
in a parsed tree that is separate from the relational data storage. XML indexing
based on an XML pattern has been introduced, along with support for SQL/XML as the
language to query and publish XML data. In order to understand the impact of these
new XML features on migration, you need to compare the different techniques for
storing and querying XML data in DB2 for z/OS V8, to similar or extended XML
functionality available in DB2 9. Based on this comparison, this article
identifies potential areas of the application code that may need to be changed.
XML support in DB2 for
z/OS V8
DB2 V8 provides XML functionality to the underlying relational data using SQL/XML
and the DB2 XML Extender. The DB2 XML Extender provides a number of user-defined
types (UDTs), user-defined functions (UDFs), and stored procedures to enable XML
support. XML documents are stored in a relational form and the XML Extender
infrastructure is used to provide an XML feel to the relational data. A subset of
SQL/XML functions are also supported for publishing relational data as XML. DB2
also allows the creation of user-defined Java™ functions and stored
procedures that can incorporate XML and XSL parsers (installed as apart of DB2 V8)
into the code.
XML support in DB2 9 for
z/OS
DB2 9 support for native XML storage removes the need to shred XML into
relational tables for querying and publishing purposes. The new storage retains
XML in a parsed, annotated tree form that's similar to XML document object model
(DOM). For applications that still need to shred XML data into relational tables,
enhanced shredding capability using annotated XML schema mapping has been added to
this release.
Table 1 compares XML features available in DB2 V8 with those available in DB2 9.
Table 1. XML feature comparison - DB2 V8 and
DB2 9
|
DB2 for z/OS V8
|
DB2 9 for z/OS
| |
Storage and indexing
|
XML data is stored in two forms:
Intact document:
- XML Extender column of type XMLCLOB, XMLVARCHAR, XMLFILE.
- Column of type BLOB, CLOB or VARCHAR.
- Indirect indexes on the XML document are created in relational side
tables.
- The document is parsed to update the side tables created.
Shred to relational XML Extenders (XML Collection):
- Decomposed to existing relational tables using document access
definition (DAD) mapping.
- Only leaf nodes in the incoming XML can be shredded and mapped to SQL
column types.
- Namespaces are not supported.
- Limited capabilities prevent it from handling complex XML documents.
|
XML data is stored in two forms:
Intact document-Native XML store.
- A totally new hierarchical (native) storage built from the ground up.
The storage retains the arbitrary tree form of the document.
- Column of type XML.
- XML indexes using path expressions on leaf nodes.
- The document is parsed at insert.
Shred to relational SQL/XML functions and annotated schema mapping.
- Decomposed to existing relational tables using annotated XML schema
mapping.
- XML fragments can also be shred allowing mapping to XML column type.
- Handles namespaces.
- Extended capabilities, like DB2 expression and conditions, can be used
to filter the XML data.
| |
Validation
|
- Implicit validation defined in the DAD.
- Explicit validation using XML Extender UDF
svalidate.
- Document type definitions (DTDs) registered and stored in internal
tables.
- XML schemas picked from the file system.
|
- No implicit validation is associated with a column.
- Explicit validation using the SQL/XML function
DSN_XMLVALIDATE.
- No DTD validation supported.
- XML schemas are registered in the XML schema repository (XSR), and
stored in the database.
| |
Query and publish
|
Querying and publishing is done using SQL/XML and XML Extender functions.
For intact XML documents.
- Indexed side tables are used to query the document that is retrieved as
a CLOB.
- Subdocument queries can be made using XML Extender functions.
XSLT transformations can be made using Extender functions.
For XML data shredded to relational.
- SQL/XML publishing functions.
- XML Extender DAD mapping and functions.
|
Querying and publishing is done using SQL/XML.
For intact XML documents.
- SQL/XML can be used to retrieve and publish relational and XML data.
- Joins across multiple XML columns are possible.
XSLT support using XML Extender XSLT functions.
For XML data shredded to relational.
- SQL/XML publishing functions.
| |
Updates
|
For intact XML documents.
- Full document update using the SQL
update
statement.
- Updated using XML Extender UDF
update.
|
For intact XML documents.
- Full document update using the SQL
update
statement.
- No support for subdocument updates.
| |
Enable database for XML
|
Need to enable XML Extender functionality for the database.
|
No need to enable the database, since XML support is now an intrinsic part
of the DB2 engine.
| |
Tools
|
No integrated support for XML in the DB2 Development Center or Control
Center.
WebSphere Studio Application Developer
- No support for SQL XML in the query builder.
- No support in the Java Database Connectivity (JDBC) driver.
|
XML support integrated into all aspects of the tools.
DB2 Control Center and DB2 Developers Workbench (Eclipse based).
- Mapping tool for annotated XML schema mapping
- Support for SQL/XML functions in query builder
- XQuery builder - builds full XQuery expressions which my be
incompatible with z/OS DB2
- Visual Studio .NET- added support for DB2 XML
| |
DB2 version
|
DB2 V7 and DB2 V8
DB2 z/OS V7 and later (V8)
iSeries V5R1 and later (V5R3)
|
DB2 9
AIX, NT, LINUX, SOLARIS, DB2 9 for z/OS
|
XML features introduced in DB2 9
Before you dive into a step-by-step migration of a sample application, take a
brief look at DB2 9 for z/OS XML support.
Storage of XML data
While interacting with DB2's native XML support, the abstraction shown in
Figure 1 is a column of type XML in a relational table.
Any well-formed XML statement can be inserted into that column. In the physical
storage layer though, the primary storage unit is a record of nodes.
For tables that have a mix of relational and XML columns, the relational columns
are stored in relational format and XML columns are stored as a parsed tree.
Figure 1. Interacting with
DB2's native XML support
XML index
The XML index is a new index that can provide efficient evaluation of XML path
expressions to improve performance during queries on XML documents.
The XML index uses an XMLPATTERN (an XPath without
predicates, only a child axis (/) and a descendent-or-self axis (//) ) to index
paths and values in XML documents stored in a single XML column. When creating an
index, its possible to specify what paths to index and what type. Any nodes that
match the path expression, or the set of path expressions, in XML stored in that
column are indexed. The index points directly to the node in storage that is
linked to its parent and children for fast navigation.
create index idx3 on dept(deptdoc) generate key using
xmlpattern '/dept/employee/name' as sql varchar(35);
|
XML schema repository
The XSR is a repository for all XML artifacts used to validate and process XML
instance documents stored in XML columns. The XSR removes the additional overhead
required to locate external documents, along with the possible performance impact.
Each DB2 database system contains an XML schema repository that resides in tables
and some system defined stored procedures (for registering) to enter data into
these tables. XML schemas, DTDs and external entities must be explicitly
registered in the XSR before use.
Annotated schema mapping
DB2 9 provides a new shred mechanism using annotated XML schemas. Instead of
creating a separate mapping file (DAD) the XML schemas are annotated with the
relational mapping information. These annotated schemas are then registered to the
XSR. They are used at runtime to decompose XML documents being inserted to the
column. Data types for decomposed XML values are standard relational types. The
shredded relational columns can be indexed using relational indexing. For more
information, refer to
"From DAD to
annotated XML schema decomposition"
(developerWorks, April 2006).
Figure 2 illustrates an example of an annotated schema
mapping. It shows how an entire XML document can be stored in the same record
along with data that has been extracted from the XML document.
Figure 2. Annotated schema
mapping
The intact XML document is stored in an XMLPO column (type XML) along with the
extracted orderID, extracted to the column ORDERID (VARCHAR), and orderDate,
extracted to the column ORDERDATE (DATE).
SQL/XML functions
XML publishing functions
Document construction is the process of composing an XML value out of its small
parts like elements and attributes. SQL/XML provides several functions for this
process that enable construction of XML documents from relational data.
The following SQL/XML functions for publishing existed in DB2 V8
-
XMLAGG
: Aggregates XML values provided as input and returns the
concatenation of all non-null input XML values.
-
XMLATTRIBUTES
: Generates attributes for the XML element being
constructed.
-
XMLCONCAT
: Concatenates non-null input expressions of type XML to return
a sequence of XML values
-
XMLELEMENT
: Constructs an XML element based on input arguments.
-
XMLFOREST
: Constructs a forest of XML elements from input arguments.
-
XMLNAMESPACES
: Constructs namespace declarations for elements generated
by XMLELEMENT or
XMLFOREST.
-
XML2CLOB
and
XMLSERIALIZE
: XML serialization functions convert the
XML value into a string value.
Figure 3. SQL/XML publishing
functions in DB2 V8
DB2 9 supports all the SQL/XML function in DB2 V8 (plus accepting or returning a
new XML type).
The following new SQL/XML publishing functions are also supported:
-
XMLCOMMENT
: Generates an XML comment node.
-
XMLPI
: Generates an XML processing instruction node.
-
XMLTEXT
: Generates an XML text node.
-
XMLDOCUMENT
: Generates an XML document node.
The following are new SQL/XML functions for IN/OUT and validation:
-
XMLPARSE
: Parses character/binary data, and produces XML value.
-
DSN_XMLVALIDATE
: Validates the XML value against an XML schema and type,
and annotates the XML value.
-
IS VALIDATED
: Checks if the XML value has been validated.
XMLPARSE
A string value that represents an XML document can be transformed to a DB2 XML
type using the SQL/XML function XMLPARSE. The function
takes a serialized XML document and converts it into an instance of the XML type.
XMLPARSE is used when the default parsing options need
to be changed.
insert into customer (cid, info) values (?,XMLPARSE(
DOCUMENT cast(? as BLOB(2M)) preserve whitespace ))
|
DSN_XMLValidate
The DSN_XMLValidate function takes an XML value and
validates it against an XML schema, returning an XML value augmented with the
default values and type annotations that are defined in the XML schema. There is
no column level XML schema validation for a native XML column. Validation has to
be explicitly added to each insert or
update statement. This concept allows you to save XML
documents from different XML schemas in the same column.
Note: Validation against DTDs is not supported.
insert into purchaseOrders (pid,porders)
values (2, XMLPARSE(DOCUMENT DSN_XMLVALIDATE(?, 'SYSXSR.PURORD')))
|
XMLSERIALIZE
XML values are processed in an internal representation that is not a string and
not directly comparable to string values. An XML value can be transformed into a
serialized string value representing the XML document using the
XMLSERIALIZE function. This function is quite useful in
environments where the DB2 driver does not support XML type.
The XMLSERIALIZE function returns a serialized XML
value of the specified data type generated from the XML expression argument. The
result has the data type specified by the user. If the result of the XML
expression is null, the result is the null value. The serialized XML value that is
returned must be smaller than 2GB. The serialized result is encoded with UTF-8:
Listing 1. Serialized result
SELECT XMLSERIALIZE ( CONTENT
XMLELEMENT ( NAME "customer",
XMLATTRIBUTES (e.cid AS "id"),
XMLELEMENT (NAME "name", e.name),
XMLELEMENT (NAME "email", e.email),
XMLELEMENT (NAME "zip", e.zip))
AS CLOB(7200)) AS CINFO FROM customer e where cid=1000
|
The following are new SQL/XML functions and predicates for querying XML data:
-
XMLEXISTS
: Determines if an XPath returns a sequence of one or more
items.
-
XMLQUERY
: Executes an XPath, and returns the result sequence.
XMLQUERY
XMLQUERY is a SQL/XML publishing (pseudo) function
whose purpose is to evaluate an XPath expression and return an XML value, that
might be an XPath sequence of zero or more items, to the application.
Figure 4. XMLQUERY query
The query shown in Figure 4 returns only some elements
of the XML data stored in the DESCRIPTION column for every row in the XMLPRODUCT
table.
XMLEXISTS
This XMLEXISTS predicate queries a SQL/XML value
using an XPath expression. If the query returns a sequence that is not empty, the
function returns the true condition, otherwise it returns false.
Figure 5. XMLEXISTS query
Impact of DB2 9 XML support on XML based
applications
An application’s design is influenced by the storage model for the XML data
(intact or shredded). The storage model is influenced by functionality limitations
on storing, querying, and modifying the XML data and any performance requirements.
Therefore, with the introduction of new functionality, the XML storage model can
change since many of the limitations of the DB2 XML Extenders have been overcome.
The rest of the article describes, in detail, the difference in DB2s
capabilities of handling XML data. How to migrate existing applications to DB2 9
in order to use the new feature set is also reviewed.
Set up the database
Right up front you may notice that some of the steps needed for setting up a
database for XML support have changed.
Table 3. Database set up differences between
DB2 V8 and DB2 9 using new XML features
|
|
DB2 V8
|
DB2 9
| |
Enable server for XML
|
XML Extender support needs to be enabled per server by giving the
dxxadm enable_server command.
|
XML type support is intrinsic to DB2
| |
Install any DTD or XML schema needed for validation.
|
All XML schemas used for validation are located on the file system and not
registered.
Any DTDs used for validation need to be registered and stored in a special
DTD_REF table in the database.
|
Any XML schemas used for validation needs to be registered in the XSR.
DTDs cannot be used for validation so in most cases the DTD has to be
changed to an XML schema.
| |
Create XML to relational mapping
|
XML to relational mapping is required for both intact storage (XML stored
in a single column as an XML CLOB) and decomposed storage [data inside the
XML shredded to relational tables]).
A proprietary form of mapping called DAD is used.
|
XML to relational mapping is only required for decomposed storage. A more
standard and efficient technique of mapping called Annotated XML Schema
Mapping is used.
| |
Register and enable the mapping
|
XML columns that store intact XML need to be enabled using the DAD mapping
in order to create the triggers necessary to update the side tables
associated with the column. The command for enabling the XML column is:
dxxadm enable_column.
DAD mapping, which defines the decomposition of an XML document to a
collection of relational tables (XML collection), can be registered and
enabled for use in an XML Extender shredding function. The command to enable
the XML Collection is: dxxadm enable_collection.
|
Columns of type XML, that store native XML data, do not have any mappings
associated with them and do not need to be enabled for storage.
Annotated XML schemas that define the decomposition of an XML document to
relational tables need to be registered in the XSR and enabled for
decomposition. This can be done either at registration time:
COMPLETE XML SCHEMA PRODSCHEMA ENABLE DECOMPSITION
Or enabled after the XML schema has been registered:
ALTER XSROBJECT PRODSCHEMA ENABLE DECOMPOSITION
| |
Create indexes on the XML documents
|
For intact documents in XML columns, indirect indexing can be done by
shredding parts of the data from the XML document into secondary (side)
relational tables, and then indexing these tables to do fast searches on the
shredded data.
Note: These mappings, in effect, define sort of a schema on the XML
column, and if the shape of the incoming document changes, the mapping
information and the side tables need to change.
|
XML documents stored natively in XML columns can be indexed using XML
indexes. These indexes directly locate the stored node in the parsed tree
instead of the document. This XML index creation is a new concept that has
been introduced in DB2 9
Note: Since indexes are not tied to any schemas, or mapping if the
schema changes new indexes, it can be defined without having to drop
existing indexes.
|
In order to understand the migration process, first look at the some basic steps
for creating a database for storing XML data in DB2 V8 and DB2 9.
- Register DTD/XML schemas
DB2 V8. Only DTDs can be registered.
insert into db2xml.dtd_ref (dtdid, content, usage_count, author,
creator, updator) values ( 'customerdtd', db2xml.XMLCLOBFromFile(
'/home2/hardeep/migrate/customer.dtd'), 0, 'xml', 'xml', 'xml');
|
DB2 9. Only XML schemas are used for validation:
Listing 2. Registration of XML schemas for validation
CALL SYSPROC.XSR_REGISTER(
‘SYSXSR’,
‘POschema’,
‘http://myPOschema/PO.xsd’,
:content_host_var,
:docproperty_hos_var)
CALL SYSPROC.XSR_COMPLETE(
‘SYSXSR’,
‘POschema’,
:schemaproperty_host_var,
0)
|
- Create a table with a column for storing XML data.
DB2 V8. Using XML Extender data type XMLCLOB.
CREATE TABLE PurchaseOrder ( POid INTEGER NOT NULL PRIMARY KEY,
Info DB2XML.XMLCLOB)
|
Note: Column can be of type XMLCLOB, XMLVARCHAR, and XML FILE.
DB2 9. Column of type XML.
CREATE TABLE PurchaseOrder ( POid INTEGER NOT NULL PRIMARY KEY,
Info XML)
|
Note: The XML type is a standard SQL type unlike XMLCLOB, XMLVARCHAR,
and XML FILE.
- Create indexes on the XML data.
DB2 Version 8. Create DAD mapping and enables the XML column to create
indirect indexes on the side tables.
dxxadm enable_column –a V81A purchaseorder info
"/home2/hardeep/migrate/po.dad" -r poid
|
DB2 9. Create XML indexes using an xmlpattern.
create unique index order_key on purchaseorder (info) generate key
using xmlpattern '/purchaseOrder/@poid' as sql double;
|
- Create Mappings for shredding incoming XML to relational
DB2 V8. Create DAD mapping and enable the XML collection.
dxxadm enable_collection –a V81A customerinfo
/home2/hardeep/migrate/customer.dad
|
DB2 9. Register the annotated XML schema and enable it for decomposition:
Listing 3. Register the annotated XML schema and enable it for decomposition
CALL SYSPROC.XSR_REGISTER(
‘SYSXSR’,
‘POschema’,
‘http://myPOschema/PO.xsd’,
:content_host_var,
:docproperty_hos_var)
CALL SYSPROC.XSR_COMPLETE(
‘SYSXSR’,
‘POschema’,
:schemaproperty_host_var,
1)
|
 |
Migrating an existing XML database from DB2 V8 to DB2 9
Now that you have some idea about the difference in the way the database is
setup for XML support you can start looking at the basic steps needed to migrate
an existing XML database from DB2 V8 to DB2 9.
First, you need to unload all of your tables containing XML data and load the
data into new tables with XML data type columns. This is a manual operation. This
article only discusses the migration process of DB2 objects that relate to XML
data and leave the rest for you to look up in the DB2 migration documentation.
Migrating DTDs and XML schemas
Since DB2 9 does not use DTDs for XML validation it is necessary to convert any
DTDs used by the current XML application to XML schemas. In order to do this you
first need to load the DTDs from your existing DB2 V8 database (DTD_REF table) to
the local file system.
select db2xml.XMLFileFromCLOB(db2xml.clob(content),
'/home2/hardeep/migrate/customer.dtd') from db2xml.dtd_ref where DTDID='customerdtd';
|
Note: Since XML schemas are not saved in the database, you can get them
from the file system.
Once you have copied the DTD from the DTD_REF table to your file system, you
need to convert it to an XML schema. There are a number of free Web-based
utilities for XML that can help you with the conversions between DTDS and XML
schemas (search using the keywords: DTD Schema convert).
Register the XML schemas
In order to use the XML schemas to validate XML documents, you first need to
register them in the XML schema repository:
Listing 4. Register XML schemas in the XML schema repository
CALL SYSPROC.XSR_REGISTER(
‘SYSXSR’,
‘POschema’,
‘http://myPOschema/PO.xsd’,
:content_host_var,
:docproperty_hos_var)
CALL SYSPROC.XSR_COMPLETE(
‘SYSXSR’,
‘POschema’,
:schemaproperty_host_var,
1)
|
Migrate tables with XML data to the new
database
In DB2 V8, for storing intact XML documents, the column could have been of type
CLOB, BLOB, VARCHAR or XML Extender type DB2XML.XMLCLOB, DB2XML.XMLVARCHAR, or
DB2XML.XMLFILE. For DB2 9 the column type is XML.
Therefore, a table that was created (in DB2 V8) with a CLOB column for storing
XML data:
CREATE TABLE Product(Pid VARCHAR(10) NOT NULL PRIMARY KEY,
Name VARCHAR(128), Category VARCHAR(32),Price DECIMAL(30,2), Info CLOB,
row_id rowid not null generated always);
|
Is now created with an XML column for storing XML data:
CREATE TABLE Product(Pid VARCHAR(10) NOT NULL PRIMARY KEY,
Name VARCHAR(128), Category VARCHAR(32),
Price DECIMAL(30,2), Info XML);
|
Note: In the above table, if the data in the NAME, CATEGORY, and PRICE
columns was extracted from the XML document, for creating relational indexes, then
the new table does not need these columns at all since now you can index into the
stored XML document itself. The schema for the table could be simplified to only
two columns.
CREATE TABLE Product(Pid VARCHAR(10) NOT NULL PRIMARY KEY, Info XML);
|
A table that was created (in DB2 V8) with a DB2XML.XMLCLOB column for storing
XML data:
CREATE TABLE PurchaseOrder(Poid BIGINT NOT NULL PRIMARY KEY,
Status VARCHAR(10) NOT NULL WITH DEFAULT 'New',
Info DB2XML.XMLCLOB NOT NULL, row_id rowid not null generated always);
|
Is now created with an XML column for storing XML data:
CREATE TABLE PurchaseOrder(Poid BIGINT NOT NULL PRIMARY KEY,
Status VARCHAR(10) NOT NULL WITH DEFAULT 'New', Info XML NOT NULL);
|
Import XML data from DB2 V8 to DB2 9
Once you have created all your tables in the new database, the next step is to
export data out of the old database (DB2 V8) tables and import it into the new
database (DB2 9) tables. In order to better understand this process, look at how
the LOAD utility for DB2 9 handles XML data.
The DB2 LOAD utility provides a simple way for you to
populate your XML columns with data from XML files. You can use the LOAD utility
to load XML data. The steps for loading XML data are similar to the steps for
loading other types of data, except that you need to also perform the following
actions:
In the input data set:
- If the data set is in delimited format, ensure that the XML input fields follow
the standard LOAD utility delimited format.
- If the data set is not in delimited format, specify the XML input fields
similar to the way that you specify VARCHAR input. Specify the length of the field
in a 2-byte binary field that precedes the data.
In the LOAD statement:
- Specify the keyword XML for all input fields of type XML.
- If you want the whitespace to be preserved in the XML data, also specify the
keywords PRESERVE WHITESPACE. By default, LOAD strips the whitespace.
UNLOAD XML data from DB2 V8
So for XML data stored in columns of type CLOB, BLOB, VARCHAR, DB2XML.XMLCLOB,
DB2XML.XMLVARCHAR, or DB2XML.XMLFILE, the first step is to
UNLOAD the tablespace:
Listing 5. UNLOAD the tablespace
//UNLOAD JOB ......
//*
//JOBLIB DD DISP=SHR,DSN=DB2A.SDSNLOAD
//*
//PRODUNLD EXEC DSNUPROC,UID='PRODUNLD',UTPROC='',SYSTEM='V81A'
//SYSREC DD DSN=JRS.PRODUNLD.SYSREC,DISP=OLD
//SYSPUNCH DD DSN=JRS.PRODUNLD.SYSPUNCH,DISP=OLD,
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
TEMPLATE LOBFRV DSN 'JRS.PRODUNLD.RESUME'
DSNTYPE(PDS) UNIT(SYSDA)
UNLOAD TABLESPACE JRS.PRODTS NOPAD
FROM TABLE JRS.PRODUCT
(PID VARCHAR(10),
NAME VARCHAR(128),
CATEGORY VARCHAR(32),
PRICE DECIMAL(30,2,),
INFO CLOB TRUNCATE)
//*
|
Note: If you were going to drop the name, category, and price columns from
your new table, you would modify the UNLOAD TABLESPACE to only select the pid and
info columns.
LOAD XML data into DB2 9
The unloaded data can now be loaded to the table created in the new database
using the LOAD utility:
Listing 6. LOAD unloaded data to the created table
//LOADPROD EXEC DSNUPROC,UID='PRODLOAD',UTPROC='',SYSTEM='V91A'
//SYSREC DD DSN=JRS.PRODUNLD.SYSREC,DISP=OLD
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
LOAD DATA INDDN SYSREC LOG NO RESUME YES
EBCDIC CCSID(00037,00000,00000)
INTO TABLE "JRS"."PRODUCT"
WHEN(00001:00002) = X'0009'
IGNOREFIELDS YES
( "PID"
POSITION( 00003:0014) VARCHAR
, "NAME"
POSITION( 00003:0135) VARCHAR
, "CATEGORY"
POSITION( 00136:0172) VARCHAR
, "PRICE"
POSITION( 00173:0188) DECIMAL(30,2)
, "INFO"
POSITION( 00189:00445) XML PRESERVE WHITESPACE
)
|
Note: For XML data shredded to relational columns there is no need to
migrate the data.
Migrate indexes on the XML data
In DB2 V8 XML, data stored in XML columns can have indirect indexes using
relational side tables. The mapping information that associated values in the side
with documents in the XML column is defined in the DAD files.
With DB2 9's ability to create XML indexes over paths inside XML documents, the
need for creating indirect relational indexes using side tables is over. Some of
the key parameters that are used to create the XML index are similar to the
parameters in the DAD mapping. The values from these parameters (in DAD mapping)
can be used in creating the XML indexes. Analyze an example of a DAD mapping used
to create side table for a purchase order column of type DB2XML.XMLCLOB:
Listing 7. Example of DAD mapping
<DAD>
<Xcolumn >
<table name="Shipto">
<column name="name" type="varchar(128)"
path="/purchaseOrder/shipTo/name" multi_occurrence="NO"/>
</table>
|
The parameters of interest in the mapping are: //column/@path and
//column/@type
The comparative XML indexes in DB2 9 are:
create index order_name on purchaseorder (info) generate key using
xmlpattern '/purchaseOrder/ shipTo/name' as SQL VARCHAR(128);
|
The //column/@path value in the DAD is the same as the xmlpattern in the XML
index and the //column/@type in the DAD is the same as the SQL type.
Also of interest is the root ID that is passed to the
enable_column call in DB2 V8 when an XML column is
enabled with the DAD file.
dxxadm enable_column -a V81A migrate purchaseorder info
"/home2/hardeep/migrate/po.dad" -r poid
|
This root ID identifies the name of the single primary key in the table that is
added to the side tables and is used to tie the side tables to the table with the
XML column. This value of the root ID is unique across the XML column. This root
ID corresponds to the path value in the XML document that uniquely defines that
document in the XML column.
So when you create the XML indexes for DB2 9, you can create a unique index on
the path that defined the root ID (primary key), in DB2 V8, if it exists in the
XML document.
create unique index order_key on purchaseorder (info) generate key using
xmlpattern '/purchaseOrder/@poid' as sql double;
|
Migrate mapping files for shredding XML
to relational
If your application needs to shred incoming XML data to relational tables, and
you feel that this need still exists, then you need to replace your existing DAD
based mapping with annotated XML schema based mappings.
Table 4. Difference between XML collection
and annotated XML schema
|
|
XML collection
|
New shred
| |
Mapping files
|
DAD
|
Annotated schema
| |
Functionality and performance
|
Limited
|
Extended functionality and better performance
| |
DTD/schema
|
Not required. It May or may not be registered.
|
Required. It has to be registered.
|
Note: In both type of mappings the relational tables must pre-exist.
Since mapping information about relationships between the XML document and the
relational schema is stored in the DAD files, these DAD files can be used to
understand the annotations required for the XML schemas. For a tool that can
automate the process of converting your DAD files to an annotated XML schema,
refer to
"From DAD to annotated XML schema decomposition."
Once the XML schemas have been annotated, they can be registered to the XSR and
enabled for decomposition:
Listing 8. Register XML schemas to XSR
CALL SYSPROC.XSR_REGISTER(
‘SYSXSR’,
‘POschema’,
‘http://myPOschema/PO.xsd’,
:content_host_var,
:docproperty_hos_var)
CALL SYSPROC.XSR_COMPLETE(
‘SYSXSR’,
‘POschema’,
:schemaproperty_host_var,
1)
|
Migrate DB2 queries
DB2 9 can query both relational and XML data simultaneously using SQL/XML.
As some of the original reasons for shredding the XML documents (fast search
using relational columns) are no longer valid, the relational schema needed to
store the XML data can become much simpler. As a consequence, a part of the
application code that dealt with manipulating the XML data can now be replaced by
a DB2 query that could provide the same functionality inside the database server
using SQL/XML. The impact of this on applications built on top of XML data is
phenomenal:
- Business logic embedded in SQL queries becomes independent of the application
environment. Also, business logic written using SQL/XML can be saved in views or
stored procedures for reuse.
- Middle tiers become leaner and easier to manage.
- Simplifies service oriented architecture (SOA) with the database becoming a
data and service provider.
- Data is moved out of the database only after it has been fully massaged
reducing traffic between database and application.
Some of the application queries that are affected are shown in
Table 5:
Table 5. Affected application queries
|
|
DB2 Version 8
|
DB2 9
| |
Insert
|
Insert calls to store XML documents in DB2XML columns were similar to
insert calls for storing CLOBs:
insert into customer (info) values (?)
XML documents that needed to be shredded to relational tables used XML
Extender shred functions:
dxxInsertXML,dxxShredXML
|
The insert calls remain the same unless the documents need to be
validated. XML schema validation call needs to be explicitly added to the
insert statement.
insert into customer (info) values (XMLPARSE(DOCUMENT DSN_XMLValidate(?,?)))
XML documents that need to be shredded to relational tables use either the
annotated XML schema decompose function.
| |
Update
|
For XML documents that were stored intact (such as a column of type
XMLClob), the subdocument update could be done using the XML Extender UDF
db2xml.update.
The UDF also synced up the update with the corresponding side tables.
|
For XML documents that are stored intact (such as a column of type XML) no
subdocument update function is available out of the box.
You can find a Java stored procedure for doing subdocument updates in the
first article in this migration series,
"Partial updates to XML documents in DB2 9 for z/OS."
DB2XMLFUNCTIONS.XMLUPDATE
| |
Query
|
For an XML document stored in XML Extender columns with side tables, there
are a number of XML Extender extraction functions can be used to get
subdocument data.
|
For an XML document stored in XML columns, SQL/XML functions can be used
to query into the XML documents. There are a number of SQL/XML functions
that support XPath:
| |
Publish
|
An XML document stored intact could be transformed using XSLT UDF or
serialized using XMLSERALIZE, XML2CLOB.
Shredded XML data could be published as XML using either DAD mapping in
conjunction with XML Extender publish UDFs, or SQL/XML publishing functions.
|
An XML document stored intact in XML columns can be published and
transformed using SQL/XML or serialized to a SQL string using
XMLSERIALIZE
For shredded XML data you need to use SQL/XML publishing functions in the
query.
|
Inserting XML data to DB2
Intact XML stored in a single columns
If XML data was stored intact in the existing application, then there should not
be a major impact on the application code and the DB2 queries dealing with the XML
data insertion.
DB2 V8 Storing data into a CLOB, VARCHAR, XMLCLOB, or XMLVARCHAR:
insert into customer (info) values (?)
|
DB2 9. Column of type XML:
The code would remain the same unless you needed to parse or validate the data.
insert into customer (info) values (?)
|
In case you need to validate the data against a registered XML schema, you would
add the DSN_XMLVALIDATE to the
insert statement.
insert into customer (info) values (DSN_XMLValidate(?,?))
|
Note: In XML Extender, validation can be stated implicitly in the DAD
mapping for the side tables.
If you need to override the default parsing of the XML document, you can specify
the parsing options using the XMLPARSE in the
insert statement.
insert into customer (info) values (XMLPARSE(DOCUMENT
cast(? as BLOB(2M)) preserve whitespace ))
|
Note: The XML data that was stored intact could also have been shredded and
mapped to relational tables for querying purposes. This could have been
accomplished using DAD mapping for XML column side tables, or it could have been
done in the application code. Since, in DB2 9, you can directly index and search
into nodes inside the XML document, there is no need to shred the XML to
relational for the sole purpose of querying. Application code dealing with the
shredding could be redundant.
XML shredded to relational tables
For XML documents where the need to shred exists, the queries and application
code dealing with the data insertion and shredding changes slightly.
DB2 V8. XML documents shred using XML Extender shred functions
dxxInsertXML and
dxxShredXML.
Call dxxShredXML (?, ?,?,? );
Call dxxInsertXML(customerdad, ?,?,? );
|
DB2 9. XML documents shred using annotated XML schema mapping.
Call xdbDecompXML (NULL,migrate.customer,?,?,1,NULL,NULL,NULL)
|
If the shredding is being done inside the application using a DOM, you don't
need to change anything unless you want to migrate the shredding to annotated XML
schemas.
Update XML documents stored in DB2
DB2 V8 XML Extender provides a UDF to allow partial updates to XML documents in
DB2. Update takes a location path to specify the
element or attribute that is to be updated. It returns a data type similar to the
column being updated. The content of every element or attribute in the XML
document that matches the location path is updated with the supplied value.
Update purchaseorder set status='shipped', info=db2xml.update(
info,'/purchaseOrder/@status','shipped') where poid=2001;
|
Since the Update UDF handles one location path at a
time, in order to do more extensive changes to the document, it needs to be
retrieved to the client application, modified, and then inserted back to the
database using a SQL update statement.
Note: The update trigger on XML Extender columns
also updates the corresponding side tables, in order to keep the data in the XML
column in sync with the data in the side tables.
DB2 9 does not have any functions to do subelement updates to an XML document
inside the database. DB2 did not want to introduce short term proprietary
functions into the language to support this.
A workaround to this problem is a Java stored procedure that can be found in the
article
"Partial updates to XML documents in DB2 9 for z/OS." This stored procedure can do partial updates to XML
documents inside DB2. The update stored procedure is also capable of deleting,
replacing, and appending nodes in the XML document.
Note: Under the covers, the stored procedure does a full document update.
Listing 9. Document update
Call DB2XMLFUNCTIONS.XMLUPDATE (
'<updates><update action="replace" col="1"
path="/purchaseOrder/@status">shipped </update></updates>',
'Select info from purchaseorder where poid=2001',
'update purchaseorder set info=? where poid=2001',?,?);
|
Note: Since the source code is also published, the stored procedure can be
used as is or as a model for writing your own Java-based stored procedures to
manipulate XML data inside DB2.
Query and publish XML documents stored
in DB2
Application code dealing with querying and publishing is the one that is
affected the most. For XML data stored in XML Extender columns, the SQL queries
change completely, since they involve the side tables, which are no longer
maintained.
The following SQL queries two tables to return the details for products that
have thumbnail images:
Select SQLPRODUCT.DETAILS, from SQLIMAGES,SQLPRODUCT
where SQLPRODUCT.PID=SQLIMAGES.PID and SQLIMAGES.TYPE='thumbnail'
|
This could be replaced with a single query over a single column:
SELECT XMLQUERY('/product/description/images/image/@type = "thumbnail"'
PASSING DESCRIPTION)
FROM XMLPRODUCT
|
Extract and query XML data inside the XML document
DB2 XML Extender extract UDFs provide a number of
scalar and table function to extract data from inside an XML document. Any SQL
statement involving extraction of data from an XML document using the DB2
extract UDFs also have to be replaced with equivalent
calls to XMLEXIST or
XMLQUERY.
select db2xml.extractVarchar(info,'/customerinfo/name') from xmlcustomer
|
Can be replace with:
Select XMLQuery('/customerinfo/name'
passing info) from xmlcustomer xmlcustomer
|
Although, this is not exactly the same since XMLQuery will return an XML
sequence.
select info from customers where db2xml.extractvarchar(info,'/customerinfo/name')=?
|
Can be replaced with:
select info from xmlcustomer where XMLExists('
$cinfo/customerinfo[name=$name]' passing info as "cinfo",
cast(? as varchar(32)) as "name")
|
DB2 XML Extenders provide ExtractCLOB () and
ExtractCLOBs () UDFs to extract fragments of an XML
document.
select db2xml.extractclob(info,'/customerinfo/addr') as RESULT from
customer where cid = ?
|
Can be replaced with the XML decomposition stored procedure (XDBDECOMPXML).
Transform and publish XML data
Besides the extraction UDF that can be used to get to
subdocument information, DB2 XML extenders also provide an XSLT UDF to transform
XML documents inside the database.
The two XSL UDFs defined in DB2 XML Extenders are
XSLTransformToClob() and
XSLTransformToFile().
SELECT XSLTransformToClob( info, /temp/xslfilename.xsl) FROM xslcustomers
SELECT XSLTransformToClob( info, ?) FROM xslcustomers
Pass the XSL as a CLOB at runtime
|
Although DB2 9 has the XSL UDF as a part of the XML extender support, there is
not any built-in function for XSL transformation (This means that you need to
enable the database for XML extenders in order to use the XSLT UDFs.).
Transform and publish relational data as XML
DB2 XML Extender DAD files and stored procedures for collections provide
bidirectional mapping. The same mapping can be used for shredding and publishing.
Unfortunately, the annotated XML schema functionality does not provide any
publishing capabilities. So the only way to publish is to use SQL/XML function.
Considerations
This section contains some of the things you need to consider while migrating.
These could be limitations or best practices that could help you in making your
decisions when you migrate your XML application. For a comprehensive list, refer
to the documentation for DB2 9.
Restriction on XML columns
The following are restrictions on XML columns:
- An XML column cannot be included as columns of keys, including primary,
foreign, and unique keys, sequence keys of clustered index, and distribution or
partitioning keys.
- An XML column cannot be part of any index except an index over XML data.
- An XML column cannot have a default value specified by the WITH DEFAULT
clause; if the column is nullable, the default for the column is NULL.
- An XML column cannot be used in a materialized query table.
- An XML column cannot be referenced in CHECK constraints.
- An XML column cannot be referenced in generated columns.
- An XML column cannot be referenced in triggers.
XML schemas
The following are considerations regarding XML schemas:
- Since no XML schema is associated with an XML column, it is technically
possible to insert XML documents from diverse schemas in a single column (schema
chaos).
Indexes
An XML index can be used for a query predicate only if:
- The index contains the query predicate. For example, if the index is equally
or less restrictive than the predicate. This includes namespaces.
- The query predicate matches the index data type.
-
/text() is used consistently in predicates and index
definitions.
Wherever possible, use fully specified XPaths. Avoid wildcards
(*, //), avoid
/text().
-
/customerinfo/phone instead of
//phone
-
/customerinfo/addr/state instead of
/customerinfo/*/state
-
/customerinfo/name instead of
/customerinfo/name/text()
Annotated XML schemas
- Annotations do not contribute to the validation of the instance document.
- Most of the XML schema components can have annotations.
- Decomposition only looks at annotations on elements declaration, attribute
declaration, and global annotations that belong to a predefined namespace.
- XML schemas with recursion cannot be used for decomposition. It fails during
registration.
- Referential integrity in the database is not supported.
- Validation is optional and it might add defaults; changing the document,
possibly changing the result of the shredding.
- Annotated schemas cannot be used for publishing purposes.
XMLPARSE
The biggest issue to avoid is code page conversions that could make the input
document encoding different from its encoding declaration, or that could introduce
substitution characters (if the target code page cannot represent all input
characters).
- The use of implicit
XMLPARSE is highly
recommended. The server expects an XML type, and therefore does not do any code
page conversions.
insert into customer (cid, info) values (?,?)
|
- The use of explicit
XMLPARSE causes the server to
expect a character or binary type (based on the cast clause). The use of
XMLPARSE, and the cast of a CLOB, cause the document to be codepage converted to
the database codepage on the server. This is typically not needed or desirable.
insert into customer (cid, info) values (?,XMLPARSE(DOCUMENT
cast(? as CLOB(2M)) preserve whitespace ))
|
- If XMLPARSE needs to be used to change the default parsing options, it is
recommended to cast the data to a binary type instead of a character type. This
avoids codepage conversion issues. Since you are reading the input XML file in
streams of raw bytes, there are no codepage conversions on the server or on the
Java side. The cast to a BLOB prevents the document to be codepage converted on
the server.
insert into customer (cid, info) values (?,XMLPARSE(DOCUMENT
cast(? as BLOB(2M)) preserve whitespace ))
|
DSN_XMLVALIDATE
The following are considerations regarding
DSN_XMLVALIDATE:
- Validation against DTDs is not supported.
XMLQUERY
The following are considerations regarding XMLQUERY:
- While it is not possible to bind runtime predicates in a pure XQuery, it is
indirectly possible to apply runtime predicates through the SQL/XML functions or
predicates such as XMLQUERY and XMLEXISTS.
XMLEXISTS
The following are considerations regarding XMLEXISTS:
-
Use XMLEXISTS in the where clause rather than
XMLQUERY.
-
XMLEXISTS only returns false if the result of the
query is an empty sequence (This means that XMLEXISTS(false) returns a true
since the result of the query is not an empty sequence.). Therefore, the
following query returns all the rows in the customer table because when the
customerinfo/name is not Hardeep the query returns a single sequence with a
value false, not an empty sequence.
select c.info from xmlcustomer c where xmlexists('
$i/customerinfo/name = "Hardeep" ' passing c.info as "i")
|
- Use brackets [] in the
XMLEXISTS predicates. The
previous query would have worked if you had put the name check inside the square
brackets, since now for all customerinfo, where name is not Hardeep, an empty
sequence is returned from the query and XMLEXISTS returns false if the result is
an empty sequence.
select c.info from xmlcustomer c where xmlexists('
$i/customerinfo[name = "Hardeep"] ' passing c.info as "i")
|
- Combine multiple XML predicates in a single
XMLEXISTS. The following two queries are not the same:
select * from customer where xmlexists('
$c/customerinfo[name = Hardeep] ‘) and
xmlexists('$c/customerinfo[phone = 555-555-555] ' passing c.info as “i”)
select * from customer where xmlexists('
$c/customerinfo[name=Hardeep and phone=555-555-555] ‘ passing c.info as “i”)
|
If some document contains two customerinfo elements and one of them has a
name=hardeep and phone=111-111-111, while the other has a name=Marja and a
phone=555-555-555, then the first query returns this document while the second
query does not.
 |
Acknowledgement
Thanks to Matthias Nicola, Henrik Loeser, Ying Chen, and Amyris Rada for their
help with this article.
Thanks to Jason Cu and Manogari Simanjuntak for their help with converting the
original article to z/OS.
Resources Learn
Get products and technologies
- Build your next development project with IBM trial software, available for download directly from developerWorks.
Discuss
About the authors  | |  | John Shelton is a member of the DB2 XML Extender team. |
 | 
|  | Hardeep Singh is a member of the DB2 Native XML team. He is the architect for DB2 XML tooling and is responsible for XML Extender application migration to Viper. He has more than 21 years of industry experience. |
Rate this page
|  |