(Updated on April 22, 2009 PST)
I'd like to share with you the following noteworthy courses, articles, and news related to DB2 pureXML:
You may have heard and seen quite some value claims about pureXML in DB2 9, and a lot of details about the pureXML features. You may wonder how they connect in the real world. I will try to present the scenarios and how pureXML features can play an important role, abstracted from real-world experiences, at the teleconference to be held on April 21, 2009. Here is the invitation in PDF. I will be speaking and answering your questions live. The presentation focuses on the scenarios that pureXML can deliver the values and how you can use pureXML to:
- Accelerate application development,
- Achieve greater business agility, and
- Improve business insights.
In addition, System z has its well-known advantages of reliability, availability, scalability, security, and savings in space, power consumption, and people cost. DB2 9 for z/OS pureXML and other XML capabilities on System z make your applications less reliant on additional systems to handle XML, reducing system complexity and improving performance.
In today's extremely competitive environment, time-to-market may determine life or death of a business, or to a less extent, profit or loss. In that kind of situation, using feasible technologies to deliver a service quickly becomes critical. And pureXML is one element of this type of technologies.
Please invite your clients, colleagues, and friends to register for this teleconference: ibm.com/software/systemz/telecon/apr21. See you then.
XML is all about its flexibility, and performance, of course, in dealing with the flexibility. XML documents that are stored in a DB2 XML column are untyped. You can specify the types in XPath expressions explicitly (using cast, such as xs:double() ) or implicitly (through operators). In many cases, explicit type casting can be avoided in XPath because of implicit casting. However, although general comparisons take sequences as operands, arithmetic operators and cast do not support sequences as operands, which could lead to cardinality errors that should be taken care of. Here I will illustrate when type casting is needed, and how to avoid cardinality errors using some examples. You may have seen the highlights of the query examples in my presentations before.
Let's assume we have created a table and have inserted one document as follows:
CREATE TABLE CATALOG(XCAT XML);
INSERT INTO CATALOG VALUES('<?xml version="1.0"?><catalog> <category cid = "001" name = "Electronics"> <product pid="101"> <name>Personal computer</name> <model>ThinkCenter 100</model> <RegPrice>1295</RegPrice> <SalePrice>998.95</SalePrice> </product> <product pid="102"> <name>IPOD</name> <model>30G</model> <RegPrice>298</RegPrice> <SalePrice>268.95</SalePrice> </product> </category> <category cid = "002" name="Apparel"> <product pid="201"> <name>Shirt</name> <description>Man''s Demin</description> <size>S</size> <size>M</size> <size>L</size> <size>XL</size> <RegPrice>60.00</RegPrice> <SalePrice>39.99</SalePrice> </product> <product pid="202"> <name>Shoe</name> <description>Man''s Leather</description> <size>8</size> <size>9</size> <size>10</size> <RegPrice>83.00</RegPrice> <SalePrice>49.99</SalePrice> </product> </category></catalog>');
Taking care of types
Example 1. "Find all the products in the Catalog with RegPrice > 100"
-- No explicit cast is needed because of the numeric comparisonSELECT XMLQUERY('/catalog/category/product[RegPrice > 100]/name' PASSING XCAT)FROM CATALOG;
-- result:<name>Personal computer</name><name>IPOD</name>
Example 2. "Find all the products on sale in the Catalog"
-- Explicit cast xs:double() is needed. Only one at either side of the comparison is enough.SELECT XMLQUERY('/catalog/category/product[RegPrice > xs:double(SalePrice) ]/name' PASSING XCAT)FROM CATALOG;
-- result:<name>Personal computer</name><name>IPOD</name><name>Shirt</name><name>Shoe</name>
-- If no cast, the comparison will be string comparison:SELECT XMLQUERY('/catalog/category/product[RegPrice > SalePrice]/name' PASSING XCAT)FROM CATALOG;
-- result: not intended - missing Personal computer because "1295" < "998.95"<name>IPOD</name><name>Shirt</name><name>Shoe</name>
Example 3. "Find all the products with more than 10% discount in the Catalog"
--No explicit cast is needed because of arithmetic multiplicationSELECT XMLQUERY('/catalog/category/product[RegPrice * 0.9 > SalePrice ]/name' PASSING XCAT)FROM CATALOG;
-- result:<name>Personal computer</name><name>Shirt</name><name>Shoe</name>
Example 4. "Comparisons tolerate type errors"
The following examples show comparison operators tolerate type errors (treat as comparison fails). Note that the "size" elements contain numeric values as well as letters.
-- Find product with size greater than or equal to 9SELECT XMLQUERY('/catalog/category/product[size >= 9]/name' PASSING XCAT)FROM CATALOG;
-- Find product with size equal to "L" or "XL"SELECT XMLQUERY('/catalog/category/product[size = "L" or size = "XL"]/name' PASSING XCAT)FROM CATALOG;
You could use numeric comparison and string comparison in the same query:
/catalog/category/product[size >= 9 or size = "L" or size = "XL"]/name
which results in what you would expect.
Avoiding cardinality errors
Let's assume we keep some regular price and sales price history for the products, and make the document look like the following:
INSERT INTO CATALOG VALUES('<?xml version="1.0"?><catalog> <category cid = "001" name = "Electronics"> <product pid="101"> <name>Personal computer</name> <model>ThinkCenter 100</model> <RegPrice from="2005-06-01">1295</RegPrice> <SalePrice from="2006-01-10" to="2006-01-31">998.95</SalePrice> <SalePrice from="2006-02-01" to="2006-02-15">1059.98</SalePrice> </product> <product pid="102"> <name>IPOD</name> <model>30G</model> <RegPrice from="2006-01-01" to="2006-09-30">298</RegPrice> <RegPrice from="2006-10-01">269</RegPrice> <SalePrice from="2006-06-08" to="2006-06-30">268.95</SalePrice> </product> </category>...<catalog>');
Let's revisit the examples above. Example 1 does not have a problem as the general comparison deals with sequences. However, Examples 2 and 3 encounter cardinality errors.
Example 5. Example 2 causes an error on the new document: xs:double() does not take a sequence of more than one item (two SalePrice nodes).
SELECT XMLQUERY('/catalog/category/product[RegPrice > xs:double(SalePrice) ]/name' PASSING XCAT)FROM CATALOG;
sqlcode : -16003 sqlstate: 10507sqlerr Message: AN EXPRESSION OF DATA TYPE ( item(), item()+ ) CANNOT BE USED WHEN THE DATA TYPE xs:double IS EXPECTED IN THE CONTEXT. ERROR QNAME=err:XPTY0004
Note that ( item(), item()+ ) means a sequence of more than one items. To avoid the cardinality problem, we can use "self" in the following XPath expression, where "SalePrice" could result in multiple items, but "xs:double(.)" takes one item at a time:
SELECT XMLQUERY('/catalog/category/product[RegPrice > SalePrice/xs:double(.) ]/name' PASSING XCAT)FROM CATALOG;
-- result:<name>Personal computer</name><name>IPOD</name><name>Shirt</name><name>Shoe</name>
Example 6. Example 3 causes an error on the new document because arithmetic multiplication does not take a sequence of more than one item (two RegPrice nodes).
SELECT XMLQUERY('/catalog/category/product[RegPrice * 0.9 > SalePrice ]/name' PASSING XCAT)FROM CATALOG;
sqlcode : -16003 sqlstate: 10507sqlerr Message: AN EXPRESSION OF DATA TYPE ( item(), item()+ ) CANNOT BE USED WHEN THE DATA TYPE xdt:numeric IS EXPECTED IN THE CONTEXT. ERROR QNAME=err:XPTY0004
Similarly, we can use "self" in the following XPath expression to avoid the cardinality error:
SELECT XMLQUERY('/catalog/category/product[RegPrice/(. * 0.9) > SalePrice ]/name' PASSING XCAT)FROM CATALOG;
-- Result:<name>Personal computer</name><name>Shirt</name><name>Shoe</name>
Since an attribute value is treated as xs:untypedAtomic, there is no cardinality error possible on attributes.
I hope the above examples help you understand the flexibility of XPath in terms of types. And I hope you've learned the "self trick," and can take measure now to avoid possible cardinality errors in SQL/XML queries when you make the documents more flexible in the future.
Migration of DB2 for z/OS systems to a new release is a very important task that requires careful planning and systematic execution, which takes time. Yet pureXML in DB2 9 for z/OS is very attractive to application developers, but it requires DB2 9 NFM. If you have a need to start an application development using pureXML and cannot seem to wait for the DB2 systems to get to DB2 9 NFM, you may want to consider the following option (which to many is not news): developing the new application using DB2 Express-C using the pureXML features and IBM Data Studio on a Windows or Linux platform (both are no charge). You may get in sync with the migration to DB2 9 NFM for application system testing, performance testing, and deployment. Here is some more info.
DB2 Express-C is a no-charge community edition of the DB2 for LUW data server. It can be downloaded from DB2 Express-C download. Before you start, you may want to look at the Getting Started book.
You also need to download the right version of DB2 Express-C for your platform. If you have a Windows platform, look at this page to determine if your Windows is 32-bit or 64-bit. If the properties of your computer does not say 64-bit, then it's likely to be 32-bit.
You also need an IBM ID to download for license agreement. Getting an IBM ID is technically very easy. Use IBM Installation Manager to download and install, which is faster. After download, you need to unzip it and run the setup.exe or something like that. It should be fine to use the typical installation type. Extremely easy :-).
IBM Data Studio Developer
To develop using SQL/XML, Java, stored procedures, and pureQuery, you can start with the Data Studio Developer no-charge features. After you install DB2 Express-C and launch it, you can download Data Studio Developer and install it. The link is listed on the DB2 Express-C download page also. Again, it's straight-forward.
Using Data Studio Developer to explore SQL/XML queries
This is my Windows XP experience. Launch the Data Studio Developer, you will see the SAMPLE database already listed in the Data Source Explorer, if you launched DB2. Right click on the SAMPLE DB, and choose "connect". Use your Windows username and password to connect. Once it's connected, you can look into "Schemas", and under your username, find "Tables". There are a few tables in the SAMPLE DB that contains an XML column: CUSTOMER, PRODUCT, and PURCHASEORDER that contains some sample data. You can list the table content by right click on the tablename -> "Data" -> "Returns all rows" or "Sample content", to display table content.
From SAMPLE DB name, right click on it -> "New SQL Script", you can copy and paste an SQL/XML query into the window. Then Choose main menu "Run" -> "Run SQL". The status and result will show. Try this query:
SELECT *FROM PURCHASEORDERWHERE XMLEXISTS('declare default element namespace "http://posample.org"; $po/PurchaseOrder[@Status="Unshipped"]' PASSING PORDER AS "po")
Remember, in order to move the queries and applications to z/OS easily, you need to stick to the syntax and scope that is supported by z/OS pureXML (and also by LUW). One of the key detailed differences is that in LUW, you must use a variable, such as $po shown above, as the starting context item for an XPath path expression, which is optional for z/OS. This is consistent for XMLQUERY, XMLEXISTS, and XMLTABLE functions.
Once you see the XML column data, you can click on the icon, and an XML editor window will open.
You could open a new Data Project, from main menu "File" -> "New" -> "Data development Project". Say you created Project1. Within Data Project Explorer, under Project1, right clock on "SQL Scripts" -> "New" -> "SQL or XQuery Script", you can get a SQL Script window also.
If you need to have a Java project, you need to create a project with the corresponding type: from main menu "File" -> "New" -> "Project", a dialogue window will open...
I hope you will like the combination of Data Studio with pureXML, if you haven't tried yet. There are much more features to explore in IBM Data Studio.
After careful review of the XML features in DB2 9 for z/OS and their user experiences and feedback, the XML development team decided that the XDBDECOMPXML stored procedure for annotated schema-based decomposition be deprecated in DB2 9 for z/OS and may be removed in a future release. Users are advised to use the more versatile and powerful standard SQL built-in XMLTABLE function instead for decomposition and other XML processing needs in DB2 9. If you have any concerns, please let us know. You can reach me by emails through db2zxml at(@) us.ibm.com or gzhang at(@) us.ibm.com.
For the alternative, the XMLTABLE function can take as input an XML hostvar, such as :xmlhv, a parameter marker, such as CAST(? AS XML), XMLPARSE(DOCUMENT non_xml_expr), or an XML column, and produce a logical table, which can contain both non-XML and XML results. See SQL Reference XMLTABLE description and the XMLTABLE Category in this blog for other postings related to XMLTABLE usage and pointers.
There are some shortcomings of a XDBDECOMPXML based decomposition solution:1. It requires a schema to be annotated. Schema annotation fixes on a given set of table names - no change possible at runtime.2. Intrinsic annotation requires tedious work when a schema evolves. Such a solution cannot take the advantage of XML schema flexibility provided by pureXML.3. It requires schema validation during decomposition, with slow performance, high CPU cost and no zIIP/zAAP possible. In contrast, the XMLTABLE function does not require a schema, and is eligible for zIIP re-direction, and schema validation, if needed, can be performed in a separate DSN_XMLVALIDATE function that will be eligible for zIIP/zAAP redirection.
Deprecation and future removal of XDBDECOMPXML and XDBDECOMPXML100M will allow us to focus on enhancing the more valuable pureXML features and deliver more value to the customers with better solutions.
In the XMLQuickDemo, I illustrated some simple DDL statements and SQL/XML queries. In the second CLP file (2ListObjects.CLP), I tried to list some database objects related to an XML column. I was a bit lazy for the first query used to find the tables created, as I knew the implicitly created XML table name is "X"+<base table name> for the first XML column (this is only true for the first XML column of the table and the table name is shorter than 18 bytes). So I used this:
WHERE NAME LIKE '%PUR%' AND CREATOR = CURRENT SQLID;
as the WHERE predicate to find the tables just created, including the base table PURCHASEORDERS and internal XML table XPURCHASEORDERS. However, the right query should use the catalog table SYSIBM.SYSXMLRELS to find the XML table for an XML column. Let's assume I have the following two DDL statements to create a table and an XML index, and my SQL ID is DDS0075:
CREATE TABLE MYTABLE(XDOC XML);CREATE INDEX MYINDEX ON MYTABLE(XDOC) GENERATE KEYS USING XMLPATTERN '/form/formdata/data1/*' AS SQL VARCHAR(20);
I should use the following query to find the XML table corresponding to the XML column(s):
SELECT SUBSTR(XMLTBOWNER, 1, 8) as CREATOR , SUBSTR(XMLTBNAME,1,20) as NAME , T.TYPE , T.DBNAME , T.TSNAME , T.COLCOUNTFROM SYSIBM.SYSXMLRELS XT , SYSIBM.SYSTABLES TWHERE XT.TBNAME = 'MYTABLE' -- table name AND XT.TBOWNER = CURRENT SQLID -- creator name AND T.NAME = XT.XMLTBNAME AND T.CREATOR = XT.XMLTBOWNER;
The SUBSTR function is used to shorten the result length for display. The result looks like the following:
CREATOR NAME TYPE DBNAME TSNAME COLCOUNTDDS0075 XMYTABLE P DSN01018 XMYT0000 3 1 record(s) selected
In DB2 9, by default, each table will be created in its own database (when the tablespace is not specified also, see the table below). The XML tables related to the base table will be always in the same database as the base table. It may not be the best to use this default feature, but for simplicity, I use this for the example. Here are more query examples on the catalog tables to find information about other XML objects and their sample results. They are self-explanatory, except for the characters for the types:
- 'P' for both XML table and XML table space;
- 'N' for NodeID index, and
- 'V' for XML Value index.
- (DOCID index is a regular index on the base table.)
ECHO *****************************************************************;ECHO * GET INFO ABOUT XML TABLE SPACES AND STOGROUPS *;ECHO *****************************************************************;SELECT TP.DBNAME , TP.TSNAME , TP.STORNAME , TP.PQTY , TP.SQTYFROM SYSIBM.SYSXMLRELS XT , SYSIBM.SYSTABLES TB , SYSIBM.SYSTABLEPART TPWHERE XT.TBNAME = 'MYTABLE' -- table name AND XT.TBOWNER = CURRENT SQLID -- table creator AND TB.NAME = XT.XMLTBNAME AND TB.CREATOR = XT.XMLTBOWNER AND TP.TSNAME = TB.TSNAME AND TP.DBNAME = TB.DBNAME;DBNAME TSNAME STORNAME PQTY SQTYDSN01018 XMYT0000 SYSDEFLT -1 -1 1 record(s) selected
ECHO ********************************************************************;ECHO * GET INFO ABOUT INDEXES, INDEX SPACES AND STOGROUPS ON BASE TABLE *;ECHO ********************************************************************;SELECT SUBSTR(IX.TBNAME,1,20) as TBNAME , SUBSTR(IX.NAME,1,26) as IXNAME , IX.IX_EXTENSION_TYPE as IXTYPE , IX.DBNAME , IX.INDEXSPACE , IP.STORNAMEFROM SYSIBM.SYSINDEXES IX , SYSIBM.SYSINDEXPART IPWHERE IX.TBNAME = 'MYTABLE' -- base table name AND IX.TBCREATOR = CURRENT SQLID -- table creator AND IX.NAME = IP.IXNAME AND IX.TBCREATOR = IP.IXCREATOR;-- INCLUDES DOCID INDEX, BUT NOT THE NODEID INDEX:TBNAME IXNAME IXTYPE DBNAME INDEXSPACE STORNAMEMYTABLE I_DOCIDMYTABLE DSN01018 IRDOCIDM SYSDEFLTMYTABLE MYINDEX V DSN01018 MYINDEX SYSDEFLT 2 record(s) selected
ECHO ********************************************************************;ECHO * GET INFO ABOUT INDEXES, INDEX SPACES AND STOGROUPS ON XML TABLE *;ECHO ********************************************************************;SELECT SUBSTR(IX.TBNAME,1,20) as TBNAME , SUBSTR(IX.NAME,1,26) as IXNAME , IX.IX_EXTENSION_TYPE as IXTYPE , IX.DBNAME , IX.INDEXSPACE , IP.STORNAMEFROM SYSIBM.SYSXMLRELS XT , SYSIBM.SYSINDEXES IX , SYSIBM.SYSINDEXPART IPWHERE XT.TBNAME = 'MYTABLE' -- base table name AND XT.TBOWNER = CURRENT SQLID -- table creator AND IX.TBNAME = XT.XMLTBNAME AND IX.TBCREATOR = XT.XMLTBOWNER AND IX.NAME = IP.IXNAME AND IX.TBCREATOR = IP.IXCREATOR;-- NODEID INDEX ONLYTBNAME IXNAME IXTYPE DBNAME INDEXSPACE STORNAMEXMYTABLE I_NODEIDXMYTABLE N DSN01018 IRNODEID SYSDEFLT 1 record(s) selected
-- ALL THE TABLES IN A DATABASE:SELECT SUBSTR(CREATOR, 1, 8) as CREATOR , SUBSTR(NAME,1,20) as NAME , T.TYPE , T.DBNAME , T.TSNAME , T.COLCOUNTFROM SYSIBM.SYSTABLES TWHERE DBNAME = 'DSN01018';
CREATOR NAME TYPE DBNAME TSNAME COLCOUNTDDS0075 MYTABLE T DSN01018 MYTABLE 2DDS0075 XMYTABLE P DSN01018 XMYT0000 3 2 record(s) selected
-- ALL THE INDEXES IN A DATABASE:SELECT SUBSTR(IX.TBNAME,1,20) as TBNAME , SUBSTR(IX.NAME,1,26) as IXNAME , IX.IX_EXTENSION_TYPE as IXTYPE , IX.DBNAME , IX.INDEXSPACE , IP.STORNAMEFROM SYSIBM.SYSINDEXES IX , SYSIBM.SYSINDEXPART IPWHERE IX.DBNAME = 'DSN01018' AND IX.NAME = IP.IXNAME AND IX.TBCREATOR = IP.IXCREATOR;
TBNAME IXNAME IXTYPE DBNAME INDEXSPACE STORNAMEMYTABLE I_DOCIDMYTABLE DSN01018 IRDOCIDM SYSDEFLTXMYTABLE I_NODEIDXMYTABLE N DSN01018 IRNODEID SYSDEFLTMYTABLE MYINDEX V DSN01018 MYINDEX SYSDEFLT 3 record(s) selected
Please note that, as you may well know, for utilities, there is a better way to define the relevant objects using LISTDEF. You can use REPORT TABLESPACESET to list relevant objects too. The above queries can help you to find specific information. If you need to find the storage space usage, you can run the STOSPACE utility on STOGROUPs, and then add the space used by XML objects. Or better yet, if you group tables with database as units, you can calculate the space based on DBNAME to include XML object space automatically. For example, the following query displays the number of table spaces and also the corresponding total storage space, including XML table spaces, for each of the databases:
SELECT DBNAME, COUNT(*), SUM(SPACE) -- use SPACEF if necessaryFROM SYSIBM.SYSTABLESPACEGROUP BY DBNAME;
If you just want to find XML table space information, use this:
SELECT DBNAME, COUNT(*), SUM(SPACE)FROM SYSIBM.SYSTABLESPACEWHERE TYPE = 'P'GROUP BY DBNAME;
To help you understand how implicitly created XML objects impact your naming convention for physical database design and administration, the following table lists some of the attributes related to XML objects under different CREATE TABLE options (probably not exhaustive). The short notations used in the table are the following:
NM: Name, TS: Table Space, ST: Stogroup, BP: Buffer pool, TB: Table, IS: Index Space, G: Generated. Deflt: Default from subsystem parameters.
From this table, we can see that to control the storage allocation with the benefit of database grouping, better use the first two forms of the CREATE TABLE statements, which is probably your normal cases. To avoid concurrency issues related to DBD (Database Descriptor), a database should not include too many objects too. Also note that the XML indexes are associated with the base table in the catalog, even though they are associated with the XML table internally (reflected in OBD - Object Descriptor).
DB2-created XML objects is somewhat a departure from the previous practices. The main reason is for convenience of ISV packaged applications, and a direction for DB2 DDL to focus more on tables, less on table spaces and stogroups. I hope that this new convention does not introduce too much inconvenience for you, once you understand the relationships. Share your experiences with us.
In one of my previous postings, I mentioned that there is no special setup needed to get started with XML in DB2 9 for z/OS. However, if you or your folks are getting more serious, some installation and serviceability parameters (zparms) related to XML may need to be adjusted. Here I will talk about four zparms related to the virtual storage limit: XMLVALA, XMLVALS, LOBVALA, and LOBVALS. I will also explain why you could insert and select XML data larger than the virtual storage limit.
The XMLVALA and XMLVALS are the virtual storage limit for XML processing for each thread and entire sub-system, respectively. The main goal is to prevent run-away processes from bringing down your DB2 system. The default for XMLVALA and XMLVALS is 200MB and 10GB, respectively. Here is the initial guideline for setting to different values.
For XMLVALA, use at least 2x the maximum XML document size if there is no XPath (such as in V8), and use at least 4x the maximum XML document size (either generated or stored) if there are any XPath queries, up to 2GB. The maximum limit is 2GB.
For XMLVALS, use the maximum possible that will not cause the DB2 system down when this amount is really used up by XML processing, or the maximum number of threads x 2GB (or XMLVALA), whichever is smaller, with the maximum limit 50GB.
LOBVALA and LOBVALS are relevant for bind-in and bind-out of XML data when XML data needs to be materialized as a Large Object (LOB) in memory. In general, set LOBVALA to the maximum XML document size inserted or selected, and set LOBVALS to the maximum number of threads x LOBVALA.
For optimization, DB2 uses streaming techniques whenever possible to reduce the storage consumption so that only a small amount of storage is used to handle very large documents. For background, the following table lists all the cases for materialization of XML data as in-memory LOBs for INSERT, UPDATE, and SELECT. The LOAD utility is the same as local INSERT, while UNLOAD is the same as local SELECT without FETCH CONTINUE, i.e. no materialization. Thank Li-mey Lee of our team for providing the details.
XML Data Materialization During Bind-in and Bind-out in DB2 9
|Local, CLI |
and JDBC T2
- FRV (File References): YES
- Others: NO
- FETCH CONTINUE: YES
- Others: NO
- XML only, length known: NO
- XML only, at least one length=-1: all YES
- Mix of XML & LOB: always YES
|Always YES||Always YES|
From the above table, you can see that DB2 does not materialize XML data for a local application during simple INSERT or SELECT without using FETCH CONTINUE, so you can handle XML documents exceeding the zparm (XMLVALA or LOBVALA or both) limit. For a JDBC T4 application, you may be able to insert a large document exceeding the zparm limit, but you won't be able to select it back without increasing the limit.
Share with me your experience on setting the "correct" values for these zparms.
The XMLTABLE() function can be said to be the most powerful and versatile function in SQL/XML. In one of the previous postings, I listed some highlights on using the XMLTABLE function. Here I give some simple examples, especially cascading XMLTABLE functions in an SQL query.
For illustration, assume that we have the following table and have inserted one document below, which is very typical in the real world with header and body in a form.
CREATE TABLE MYTABLE (XDOC XML);
INSERT INTO MYTABLE VALUES('<form> <formheader id="1234"> <head1>head1</head1> <head2>head2</head2> </formheader> <formdata> <data1 did="d1234"> <data11>data11</data11> <data12>data12</data12> </data1> <data2 did="d2345"> <data21>data21</data21> <data22>data22</data22> </data2> <data3 did="d3456"> <data31>data31</data31> <data32>data32</data32> <data33>data33</data33> </data3> </formdata></form>');
Assume that we want to present the data in this document as a table, with data of level 1 under "formdata" as the rows, combined with some data from the header. The key to the XMLTABLE function is that the row-expression identifies nodes for the rows. The data from the header will be repeated for each row. The first XMLTABLE query using direct navigational paths looks like the following:
SELECT TX.*FROM MYTABLE, XMLTABLE('/form/formdata/*' PASSING XDOC COLUMNS ID VARCHAR(5) PATH '../../formheader/@id', HEAD1 VARCHAR(5) PATH '../../formheader/head1', NAME VARCHAR(8) PATH 'fn:local-name()', DATA XML PATH '.') TX;
Here is the result:
ID HEAD1 NAME DATA1234 head1 data1 <data1 did="d1234"><data11>data11</data11><data12>data12</data12></data1>1234 head1 data2 <data2 did="d2345"><data21>data21</data21><data22>data22</data22></data2>1234 head1 data3 <data3 did="d3456"><data31>data31</data31><data32>data32</data32><data33>data33</data33></data3>
To avoid repeated evaluations of column expressions for the header data, the following XMLTABLE query avoids the parent axis using two cascading XMLTABLE functions:
SELECT TX1.ID, TX1.HEAD1, TX.*FROM MYTABLE, XMLTABLE('/form' PASSING XDOC COLUMNS ID VARCHAR(5) PATH 'formheader/@id', HEAD1 VARCHAR(5) PATH 'formheader/head1', FORMDATA XML PATH 'formdata') TX1, XMLTABLE('$fd/*' PASSING TX1.FORMDATA as "fd" COLUMNS NAME VARCHAR(8) PATH 'fn:local-name()', Data XML PATH '.') TX;
It results in the same result as above.
Now let's assume we want to lower the level for the rows to the level 2 data under "formdata". We only need to get to the next level for the row expression by appending "/*" to the row path expression. Note that since each element at the data level has a unique name, the queries here illustrate the legitimate use of "*" - the wild card name test.
SELECT TX1.ID, TX1.HEAD1, TX.*FROM MYTABLE, XMLTABLE('/form' PASSING XDOC COLUMNS ID VARCHAR(5) PATH 'formheader/@id', HEAD1 VARCHAR(5) PATH 'formheader/head1', FORMDATA XML PATH 'formdata') TX1, XMLTABLE('$fd/*/*' PASSING TX1.FORMDATA as "fd" COLUMNS NAME VARCHAR(8) PATH 'fn:local-name()', Data XML PATH '.') TX;
The result looks like the following:
ID HEAD1 NAME DATA1234 head1 data11 <data11>data11</data11>1234 head1 data12 <data12>data12</data12>1234 head1 data21 <data21>data21</data21>1234 head1 data22 <data22>data22</data22>1234 head1 data31 <data31>data31</data31>1234 head1 data32 <data32>data32</data32>1234 head1 data33 <data33>data33</data33>
What if we want to get "@did" from the level 1 data as a column of the rows? We can certainly add "../@did" as a column expression. But we can avoid the parent axis by introducing another XMLTABLE function. Can you try to write the query yourself?
-Guogen (Gene) Zhang (GGZ)[Read More
XML is a unique data type in SQL. Understanding its rules is important in getting it right in applications. Luckily, many of the rules are straight-forward. I will cover the following three aspects here:
- XML comparison rules;
- XML assignment rules;
- XML data encoding.
At the end, you will see the application interface for XML is really simple, except for truncation errors. You can use existing string host variable types for XML data or use the new XML host variables. The best practice is to use binary host variable types (such as VARBINARY, BLOB, BLOB_FILE, XML AS BLOB, or XML AS BLOB_FILE) and UTF-8 encoding for XML data in your embedded applications. Similar for Java interface
XML comparison rules
The SQL XML type cannot be compared with any other type, including the XML type itself, using SQL comparison operators. XML can only be tested using one of the following two predicates:
- xml-expr IS NULL, or xml-expr IS NOT NULL
- theXMLEXISTS predicate, you can use XPath operators within XMLEXISTS.
If you need to compare an XML atomic value extracted from an XML document using XPath with an SQL value, you need to use XMLCAST to cast to an SQL type. XMLCAST can be used to cast a value from XML to SQL, or from SQL to XML. Casting from XML to SQL, such as XMLCAST(XMLQUERY('/purchaseOrder/@PONum' PASSING X) as INTEGER), is its main usage, as casting from SQL to XML is done by DB2 implicitly in most cases, such as in parameter passing of XMLQUERY, XMLEXISTS, and XMLTABLE.
XML assignment rules
In DB2 9, XML is not compatible with any other type for assignment. For convenience, an XML column can be assigned a string by implicit XMLPARSE. In addition, the assignment rules for bind-in and bind-out of XML data are quite flexible. XML assignment rules should be considered with two directions: storage assignment and retrieval assignment, similar to string assignment. Here is a summary of the XML assignment rules:
- XML host variables are supported in many host languages (seeapplication programming language support for XML for details). The XML host variables can only be assigned to XML columns and only accept XML values from SQL. That is, XML host variable usage is very restrictive.
- Any host variable of string types (character or binary: CHAR, VARCHAR, GRAPHIC, VARBINARY, BLOB, etc.) can be assigned to XML columns, by implicit XMLPARSE. For example, INSERT INTO MYTABLE VALUES (1, :VARCHARhv), where the second column is of XML type.
- Any host variable of string types (character or binary, except LOB locators) can accept XML values, by implicit XMLSERIALIZE. For example, SELECT XMLELEMENT(NAME "test", 'a') INTO :VARBINhv FROM SYSIBM.SYSDUMMYU;
In other words, you can hold XML values using non-XML string host variables, including file reference variables. Therefore, you can get XML data in and out without using XML host variables in languages like COBOL, PL/I, C/C++, etc. For example, you can simply fetch a SELECT result containing an XML value into a VARCHAR host variable. You can use regular Java interface for XML data also.
One thing that XML is different from other string types is that truncation is serious so it will result in an error (SQLCODE -433), instead of a warning. To avoid truncation, use a large enough buffer, or use FETCH WITH CONTINUE.
You would need the explicit conversion, i.e. XMLPARSE(), from string to XML in cases where the XML data held in a non-XML host var from an application is not directly into an XML column, or the XML data are in a non-XML column. Here are some simple examples for illustration:
INSERT INTO MYTABLE (C1, C2)SELECT XT.*FROM XMLTABLE('row-expr' PASSING XMLPARSE(DOCUMENT CAST(? AS BLOB) ) COLUMNS C1 VARCHAR(10) PATH 'column-expr1', C2 XML PATH 'column-expr2') XT;
UPDATE MYTABLE SET XMLC = XMLPARSE(DOCUMENT VC1);
-- this one shows the value of XML host varINSERT INTO MYTABLE (C1, C2)SELECT XT.*FROM XMLTABLE('row-expr' PASSING :xmlhv COLUMNS C1 VARCHAR(10) PATH 'column-expr1', C2 XML PATH 'column-expr2') XT;
On the other hand, you would need conversion, explicit XMLSERIALIZE(), from XML to string (a LOB) in cases where an XML value is not to bind out to a host var. For example, to insert XML data into a BLOB column:
INSERT INTO MYTABLE2SELECT C1, XMLSERIALIZE(XMLC as BLOB(512K))FROM MYTABLE;
XML data encoding
There are two ways for DB2 to recognize XML data encoding:
- External encoding: if you hold XML data in a character string host var or column, the CCSID associated with the host var or column determines encoding of XML data. DB2 ignores encoding information contained in the XML data if it exists.
- Internal encoding: if you hold XML data in a binary string host var or column, there is no CCSID, and DB2 determines encoding based on the information (BOM - Byte Order Marker, or encoding declaration) contained within the XML data. By default, the encoding is UTF-8.
DB2 uses UTF-8 encoding to process XML data internally, and stores character data in UTF-8. It will convert other encodings into UTF-8 for processing. To avoid any encoding conversion overhead, try to use UTF-8 for XML data whenever possible. For more XML encoding details, see XML data encoding.
Again, if you have any questions, please post to the DB2 pureXML Forum, or contact us at db2zxml at us.ibm.com.
-Guogen (Gene) Zhang (GGZ)[Read More
XML Schema is a W3C recommendation to specify "schema"s for XML data, used to put constraints on otherwise extremely flexible XML data. The constraints include basic data types, structures, occurrences, uniqueness, and referential integrity, etc. DB2 9 for z/OS pureXML provides XML schema validation for XML data through a user-defined function (UDF) called DSN_XMLVALIDATE. The following provides some guidelines in using XML schema validation.
- An XML schema is not required for an XML column. An XML column can store any well-formed XML documents, and it contains untyped XML data, even if a document is validated using the DSN_XMLVALIDATE function. In other words, type annotations are not kept after schema validation.
- XML Schema validation can be performed in an application using any schema validation tools, or in DB2 by invoking DSN_XMLVALIDATE. To invoke schema validation in DB2, an XML schema has to be registered in DB2 XML Schema Repository (XSR). XSR consists of a set of DB2 tables storing XML schema info, and a set of stored procedures to manage the schemas. A special JDBC interface is also provided for schema registration.
- Setting up XSR requires Java V5, JDBC and Java stored procedure setup, which requires WLM setup. See the "XSR setup and troubleshooting" article in the DB2 for z/OS pureXML wiki page for references to the setup information and troubleshooting details.
- XSR tables and stored procedures are created during the installation or migration. Please customize XSR related settings by changing the installation and migration jobs DSNTIJSG and DSNTIJNX.
- XML Schema documents (XSDs) have to be provided physically to DB2 for schema registration. During registration, an SQL ID in the form of "SYSXSR.<name>" can be provided to reference the schema. The schema can also be referenced by the target namespace with optional schema location. Schema documents are compiled into a binary format for runtime use. The original schema documents and the binary representations generated at XSR_COMPLETE are stored in XSR table SYSIBM.XSROBJECTS and other XSR tables. Note that string IDs are used for XML target namespaces and schema locations in the tables.
- To validate a document against a schema, invoke DSN_XMLVALIDATE in one of the two forms: two-argument form and three-argument form. The two-argument form takes an instance document and the schema name (SQL ID), while the three-argument form takes an instance document and schema target namespace, with optional schema location. As of this writing, DSN_XMLVALIDATE function has to be wrapped in the XMLPARSE function, and has a 50MB size limit on a document. For example, the following INSERT statement inserts an XML document after validation against a schema named SYSXSR.PURCHASEORDER:
INSERT INTO MYTABLE VALUES(10, XMLPARSE(DOCUMENT DSN_XMLVALIDATE(xmllob, 'SYSXSR.PURCHASEORDER') ) );Validation may cause XML data to change if there are default values and normalization specified in the schema that are applicable to the instance document.
- DB2 invokes XLXP-C high-performance validating parser. However, validation is still 2 to 3 times as expensive as simple parsing in terms of CPU time. Avoid validation in DB2 to save CPU and improve performance if performance is a concern.
- To support XML schema versioning, either use a totally different target namespace for a new schema version (usually for major changes), or use the same target namespace with a different schema location (usually for minor changes). You have to register schema versions with different SQL names in the XSR. Usually target namespace and schema location are also used in instance documents to identify the schema version. DB2 does not have the capability to use the target namespace and schema location to identify a registered schema for validation yet.
- Use Command Line Processor (CLP) or IBM Data Studio to register and manage XML schemas. Either way invokes the underlying XSR stored procedures, and is more convenient than using the native XSR stored procedures. See XML QuickDemo for a simple example of using CLP to register a schema.
Contact us if you have any questions related to XML schema validation.
-Guogen (Gene) Zhang (GGZ)[Read More]
One of the key differences of XML from the Large Object (LOB) types in DB2 is that XML data can be indexed. The XML indexes supported in DB2 9 are also called XML value indexes, and queries on XML can use these indexes for performance.
An XML index example
An XML index is used to provide a mapping from a node value to its location. An XPath expression is required to specify the nodes within a document to be indexed. Two data types are supported for XML indexes in DB2 9 for z/OS: DECFLOAT for numeric values and VARCHAR(n) for string values. For example, the following CREATE INDEX DDL creates an XML index on TransRefGUID of the REQUESTXML column of ACORD data, as VARCHAR(24):
CREATE INDEX ACORD.ACORDINDEX1 ON ACORD.REQUEST(REQUESTXML)GENERATE KEYS USING XMLPATTERN'declare default element namespace "http://ACORD.org/Standards/Life/2"; /TXLife/TXLifeRequest/TransRefGUID' as SQL VARCHAR(24)
DB2 will take the value of /TXLife/TXLifeRequest/TransRefGUID as the key, and map to its logical location (DOCID, NODEID), and physical location (RID), using the existing B+-tree index infrastructure.This index can be used for queries that search on TransRefGUID. For example, the following XMLEXISTS predicate can potentially use this index:
XMLExists('declare default element namespace "http://ACORD.org/Standards/Life/2"; /TXLife/TXLifeRequest[TransRefGUID="2004-1217-141016-000012"]'PASSING REQUEST.REQUESTXML)
From queries, it is easy to figure out what kind of XML indexes can speed up the queries - by concatenating path steps from within a predicate to the steps outside the predicate, and using data types consistent between the queries and indexes.
Before explaining the index access procedure, I'd like to review the basic XML storage scheme so you can understand it better.
Basic XML storage scheme in DB2 9 for z/OS
The following picture depicts the high-level storage scheme for XML data.
At high-level, the XML data is stored in a separate table space, just like LOB data. The real XML data is stored in the XMLDATA column of the internal XML table. It contains the hierarchical data in records that can fit in 16KB pages. In order to support free movement of data records, logical links using NODEIDs are used. That's why we need a NODEID index to link records for a document. Similarly, since utilities, such as REORG, can be applied to the base table space and the XML table space independently, XML indexes do not contain base table RIDs, but XML table RIDs. In order to get to base table rows from XML indexes, we need the DOCID index on the base table (see below). That's why DB2 always creates a DOCID index on the base table and a NODEID index on the XML table as part of the storage scheme, although they are for totally different purposes. By XML indexes, we refer to XML value indexes created by users.
Basic XML index access plans
If you use EXPLAIN for a query and select some key columns from the PLAN_TABLE, you will see some new access type in the ACCESSTYPE column for SQL/XML queries. They are the following:
- DX: DOCID list access. DB2 searches an XML index and retrieves all the qualified DOCIDs. Then the DOCIDs are sorted and duplicates are removed. The DOCID list is further converted to a RID list of the base table through the DOCID index. Then the base table rows are fetched.
- DI: DOCID ANDing. DB2 intersects two DOCID lists from two XML indexes for conjunction of two XPath predicates.
- DU: DOCID ORing. DB2 unions two DOCID lists from two XML indexes for disjunction of two XPath predicates.
- M: Multi-index access.
If you see "R" (R-Scan) in the ACCESSTYPE for a table with an XMLEXISTS predicate, then DOCSCAN is applied for the XML column. No new type was introduced for the scan.
Here is an example for an index ANDing plan:
+---------------------------------------------------------------------+ | PLANNO | ACCESSTYPE | MATCHCOLS | ACCESSNAME | MIXOPSEQ | +---------------------------------------------------------------------+1_| 1 | M | 0 | | 0 |2_| 1 | DX | 1 | ACORDINDEX2 | 1 |3_| 1 | DX | 1 | ACORDINDEX1 | 2 |4_| 1 | DI | 0 | | 3 | +---------------------------------------------------------------------+
The following diagram illustrate the process of using the index ANDing plan. Step 4 is where a DOCID index is always used.
These are the basic XML index access plans. For example, NODEID and RID from XML indexes are not used for queries today. We are enhancing plans for better query performance, so expect more methods in the future.
Somethings specific to XML indexes
The same principles for relational indexes apply to XML indexes, such as create indexes only needed by queries and use REBUILD INDEX. The following are some unique features of XML indexes:
- Depending on the XPath in the XMLPATTERN of the CREATE INDEX statement and XML documents, DB2 may generate zero, one or more key entries for a document in an XML index.
- For a numeric index (DECFLOAT type), if a node identified by the XMLPATTERN contains a value that cannot be cast to a number, it is ignored. But insert will be successful. For example, no index entry is generated for XL if the index is DECFLOAT type.
- However, for a string index (VARCHAR(n) type), if a node identified by the XMLPATTERN contains a value that is longer than n, then the document will be rejected, or CREATE INDEX will fail. The maximum n is 1000.
- You can create as many indexes as you like on a single XML column. You could even create multiple indexes with different types on the same XMLPATTERN. This provides much flexibility for XML.
- Use fully specified XPath whenever possible to reduce the cost of maintaining the indexes.
- Index leaf nodes or nodes close to leaves. For non-leaf nodes, the key values are concatenated values from descendants, and their usage is a bit weird. For example, index on name for <name><firstname>John<firstname><lastname>Doe<lastname><name> will result in "JohnDoe" as the key, and you have to use [name="JohnDoe"] in queries to qualify for the index-based search. In addition, DB2 9 does not support a node across record boundaries.
To produce consistent query results with or without XML indexes, DB2 tries to tolerate cast errors during XMLEXISTS predicate numeric comparison. For example, if a node size contains "XL", comparison [size > 10] will tolerate the "XL" value, which is equivalent to evaluate to false.
DECFLOAT is used instead of DOUBLE for the numeric index type due to its precision. Date and time are not yet supported, but you can use string indexes if you use ISO format (which is required in XML) without timezone or always use the same timezone for the data, and use string comparison in the queries to search the documents.
If you never search inside XML documents, but get XML data in and out as a whole, you probably don't need to use the XML type, since VARCHAR or VARBINARY or LOB types can serve the purpose.
To summarize, XML indexes use XPath to identify nodes to be indexed, and can be used for queries with XMLEXISTS and XMLTABLE predicates.
-Guogen (Gene) Zhang (GGZ)[Read More
We all know the benefits of Web services and Service-Oriented Architecture (SOA). One of the benefits is to revitalize and enable existing applications or functionality as components to participate in new applications, extending the life and value of "legacy" applications. To make DB2 applications a natural part of this architecture, DB2 provides two SOAPHTTP functions as UDFs - SOAPHTTPNV and SOAPHTTPNC (replacing SOAPHTTPV and SOAPHTTPC) to invoke web services. With SQL XML functions that handle XML data, an SQL/XML statement can consume XML responses from SOAP Web services directly.
An SQL/XML Example
Here is an example to get stock quote from a free public web service and to return a row:
SELECT *FROM XMLTABLE('/StockQuotes/Stock' PASSING XMLPARSE(DOCUMENT XMLCAST( XMLQUERY('declare namespace soap="http://schemas.xmlsoap.org/soap/envelope/"; declare default element namespace "http://www.webserviceX.NET/";/soap:Envelope/soap:Body/GetQuoteResponse/GetQuoteResult'PASSING XMLPARSE(DOCUMENTDB2XML.SOAPHTTPNC( 'http://www.webservicex.net/stockquote.asmx', 'http://www.webserviceX.NET/GetQuote', '<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"> <soap:Body> <GetQuote xmlns="http://www.webserviceX.NET/"> <symbol>IBM</symbol> </GetQuote> </soap:Body></soap:Envelope>') ) ) as VARCHAR(2000)) ) COLUMNS "Symbol" VARCHAR(4), "Last" DECIMAL(6,2), "Date" VARCHAR(10), "Time" VARCHAR(8), "Change" VARCHAR(8), "Open" VARCHAR(8), "High" VARCHAR(8), "Low" VARCHAR(8), "Volume" VARCHAR(12) ) XT#
It returns a row like the following:
Symbol Last Date Time Change Open High Low VolumeIBM 84.70 1/9/2009 4:00pm -2.48 87.05 87.50 84.25 7440555 1 record(s) selected
You can combine the result with any complex joins to other data you can imagine without application programming in lower-level languages to consume the web services. It not only provides convenience with improved productivity, but also better performance in consuming web services, especially when you need to combine web service result with other DB2 data.
Setup of the SOAPHTTP functions
The two SOAP web service consumer UDFs SOAPHTTPNV and SOAPHTTPNC were provided through APAR PK48773, which had an APAR PK73324 PTF UK42369 for DB2 9. Also there is an APAR PK70932 PTF UK40600 for SQL XMLTABLE function. Apply the PTFs and then use the SQL DDLs like the following to create the UDFs with your WLM environment (WLMENV8 used below) and timeout limit (10 seconds below). For the complete four CREATE statements, see this file (CreateAlterSOAPUDFs.txt).
CREATE FUNCTION DB2XML.SOAPHTTPNV ( ENDPOINT_URL VARCHAR(256), SOAP_ACTION VARCHAR(256), SOAP_INPUT VARCHAR(32672)) RETURNS VARCHAR(32672)LANGUAGE CSPECIFIC SOAPHTTPNVIVOEXTERNAL NAME DSNWSCVVPARAMETER STYLE DB2SQLPARAMETER CCSID UNICODEPARAMETER VARCHAR STRUCTURESCRATCHPADFINAL CALLFENCEDNOT DETERMINISTICCALLED ON NULL INPUTNO SQLEXTERNAL ACTION RUN OPTIONS 'POSIX(ON),XPLINK(ON),ENVAR("DB2SOAP_TIMEOUT=10")'WLM ENVIRONMENT WLMENV8DBINFO;
After creation, if you need to modify the timeout limit, you can execute an ALTER statement like the following to increase the timeout to 20 seconds, for example:
ALTER SPECIFIC FUNCTION DB2XML.SOAPHTTPNVIVO RUN OPTIONS 'POSIX(ON),XPLINK(ON),ENVAR("DB2SOAP_TIMEOUT=20")';
See the same file for the other ALTER DDL examples.
Revisiting the Example
If you look at the above example carefully, you may wonder why it has two levels of XMLPARSE and processing using XMLQUERY and XMLTABLE. Let's examine the response example:
<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><GetQuoteResponse xmlns="http://www.webserviceX.NET/"><GetQuoteResult><StockQuotes><Stock><Symbol>IBM</Symbol><Last>84.70</Last><Date>1/9/2009</Date><Time>4:00pm</Time><Change>-2.48</Change><Open>87.05</Open><High>87.50</High><Low>84.25</Low><Volume>7440555</Volume><MktCap>113.8B</MktCap><PreviousClose>87.18</PreviousClose><PercentageChange>-2.84%</PercentageChange><AnnRange>69.50 - 130.93</AnnRange><Earns>8.483</Earns><P-E>10.28</P-E><Name>INTL BUSINESS MAC</Name></Stock></StockQuotes></GetQuoteResult></GetQuoteResponse></soap:Body></soap:Envelope>
If you look closely, the soap:Body contains some XML data as text. The GetQuoteResult element contains many escaped entity references. In order to make sense of it, we need to cast it to a string, instead of serializing it, which is achieved by XMLCAST( . as VARCHAR(2000)). After casting, the content looks like the following, the plain XML data:
<StockQuotes> <Stock> <Symbol>IBM</Symbol> <Last>84.70</Last> <Date>1/9/2009</Date><Time>4:00pm</Time> <Change>-2.48</Change> <Open>87.05</Open> <High>87.50</High> <Low>84.25</Low> <Volume>7440555</Volume> <MktCap>113.8B</MktCap> <PreviousClose>87.18</PreviousClose> <PercentageChange>-2.84%</PercentageChange> <AnnRange>69.50 - 130.93</AnnRange> <Earns>8.483</Earns> <P-E>10.28</P-E> <Name>INTL BUSINESS MAC</Name> </Stock></StockQuotes>
Once you reveal the true value, you can perform XMLPARSE and XMLTABLE again, as shown in the above query example, to extract the row out of this data. This example also illustrates the difference between XMLSERIALIZE and XMLCAST.
The above example shows how easy you can invoke and consume web services in the same SQL statement. To see how to make DB2 SQL statements, stored procedures and applications web service providers, look into IBM Data Studio tooling for Data Web Services. You will be surprised how easy they are. Hope you will like the features.
Today is Chinese New Year's day for the year of ox. I wish you a Happy Niu Year! 恭喜发财!
-Guogen (Gene) Zhang (GGZ)[Read More
To make it easier for DB2 for z/OS friends to find pureXML information specific to DB2 for z/OS, we've just created a DB2 for z/OS pureXML wiki page under DB2 XML wiki page to list z/OS-specific information. It may contain cross-references to other areas, and has initial content right now. We expect the content to accumulate over the time.
In the new wiki page, I've just uploaded DB2 9 for z/OS pureXML podcast series, recorded by Guogen Zhang (that's me :-) ) in 2008. This is a 10-part introductory podcast series on pureXML business value and technical knowledge. The first part covers overview and business value, and the follow-on 8 parts cover different aspects of pureXML, from query, schema, to utilities and performance. The last part talks about best practices. Each part is about 10-15 minutes. The podcast is in mp3 audio format together with transcripts in PDF.
Give us feedback on whether this introductory podcast series is useful to you. If so, we will prepare follow-on podcast for more advanced topics of your interests. Thank you.
Check back the DB2 for z/OS pureXML wiki page often. You never know what will pop up there.
-Guogen (Gene) Zhang[Read More
The XMLTABLE() function is probably the most powerful function in manipulating XML data using SQL/XML. Some of its uses include, but not limited to, the following:
- to achieve XQuery capability using SQL/XML with XPath. See SQL/XML queries for XQuery usecases.
- to decompose a large document into smaller documents for storage.
- to generate relational view over stored XML data, to bridge the gap between XML storage and relational application software, and also provide reporting over XML data.
Since there are other articles giving detailed account of XMLTABLE function, for example, Matthias's two-part articles of "XMLTABLE By Examples" Part 1 and Part 2, I will just list the highlight on using the XMLTABLE function here:
- XMLTABLE and XMLCAST were delivered post V9 GA. Apply PK51571, PK51572, and PK51573 before trying the functions.
- In general, XMLTABLE() returns multiple rows of multiple columns. The column types can be non-XML SQL types and XML type.
- The column value of non-XML type has to contain at most one item only. Otherwise, an error sqlcode : -16003 sqlstate: 10507 will be returned. In this case, you need to lower the level of nodes the row expression identifies, and use "." (self) in the column path expression if necessary.
- Use XMLNAMESPACES to declare namespaces that are needed across the row path expression and column path expressions, like this: XMLTABLE(XMLNAMESPACES(), ...). The namespaces declared in the row path expression do not cover column path expressions (according to the SQL standard).
- Avoid using the parent axis ("..") due to its overhead. If you need to lower the level, try to cascade XMLTABLE functions. The first XMLTABLE() produces rows at a high level so that the parent axis can be avoided, and the next XMLTABLE function takes as the input the XML column output of the previous XMLTABLE, and generates other columns.
- If you use XMLTABLE to decompose a document into smaller documents, you need to use XMLDOCUMENT() constructor to wrap the pieces into a document to insert into an XML column.
- You can use XMLTABLE() to extract one row (a tuple) of multiple columns, in particular, multiple heterogeneous pieces from different portions of a document, to be used in constructing a new document.
- You can use XMLTABLE() to iterate through a sequence of homogeneous items that XMLQUERY was not able to separate them, each to be used in constructing a new document. XMLAGG() can be used to aggregate multiple parts back into one value.
- You can use XMLTABLE() to define a relational view so SQL queries can treat XML data as relational data, to bridge the gap between XML and relational. This can also achieve more than XQuery 1.0 can do for reporting and light analytics. For heavy analytics, you may need to materialize XMLTABLE result into tables. MQT support for XMLTABLE() is not available at this point.
- You can use XMLTABLE() and XMLPARSE() to consume Web services directly in SQL by using SOAPHTTPNV or SOAPHTTPNC UDFs in DB2 (more to come).
If you have any questions, don't hesitate to contact me or post questions at DB2 pureXML forum.
-Guogen (Gene) Zhang (GGZ)
In this first posting of the new year, I'm going to share something non-technical. We are looking forward to the uptake of the pureXML projects in many of our clients and ISVs. As for any new technology, pureXML adoption takes leadership and courage. The reward will be ample, especially in this unprecedented, tough global economy. With pureXML, you could support new products and services in reduced time to market, achieve more with the same resources, and improve the bottom-line of the businesses.
Having read the book "Primal Leadership - learning to lead with emotional intelligence" by Daniel Goleman, et al. before, I recently came across a series of three Harvard Business Review (HBR) articles that are pretty much short versions of the book. "Leadership that gets results" focuses on six leadership styles.
The resonant styles are:
- the visionary/authoritative styles,
- the coaching style,
- the affiliative style, and
- the democratic style.
And the dissonant styles are:
- the pacesetting style and
- the command/coercive style.
"The best leaders don’t know just one style of leadership — they’re skilled at several, and have the flexibility to switch between styles as the circumstances dictate."
"What makes a leader" focuses on the emotional intelligence that is necessary to be effective leaders. The four emotional intelligence domains/skills are:
- social awareness, and
- relationship management(empathy and motivation).
"Primal leadership - the hidden driver of great performance" focuses on a process of five steps to achieve greater emotional intelligence:
- What do you want to be?
- Who are you now?
- How do you get from here to there?
- How do you make change stick?
- Who can help you (in the above process)?
If you are not familiar with this research result, I highly recommend you search on the web to find some summary in more detail, or get the book or articles.
Another book that I read over ten years ago is also one of the best-sellers many know well: "The seven habits of highly effective people" by Stephen Covey. The seven habits are:
- be proactive;
- begin with the end in mind;
- put first things first;
- think win/win;
- seek first to understand, then to be understood;
- sharpen the saw.
"The key to success in business is action. But in most companies, people are rewarded for talking - and the longer, louder, and more confusingly, the better." (from HBR article "The smart-talk trap") Talking is probably better than deceiving and bad politics. And trustworthiness and integrity are essential in a leader's character. (I don't know why I'm writing these sentences.)
I wish you fulfill your new year resolution! And make pureXML one of your technology priorities this year.
- Guogen (Gene) Zhang (GGZ)[Read More