In IOD Global Conference 2008 at Las Vegas held at the end of October, there were some discussions on what computing era we are in. The computing era means the (new) dominant computing architecture(s) at the time. We probably had the mainframe era, PC era, the client-server era, the networking/internet era, and I thought currently we are in the cloud computing era!
The question was really related to the new major business application usage. There were days of batch data processing, OLTP, office automation, MIS, decision support systems/OLAP, business performance optimization, etc. There was no simple phrase to describe what business applications are used for today - it's going up in the food chain.
I'm more interested in a database era here today. Looks to me we have entered the era of XML databases in the database history. The first 15-20 years of database history was the era of hierarchical and network databases, when IMS was one of the most popular databases. The following 20 years is the era of relational databases, with quite a few popular products, including DB2. Then there was a period of time that object-oriented databases attempted to enter the mainstream, but never succeeded for whatever reasons (well - OODBs are really mainly for CAD type of applications).
Given the customer demand and real value of XML databases to the business for its flexibility and productivity uplifting, the era of XML databases is real, and it will last probably 20 years as a major database technology development and market growth. IMS and relational databases are not going away. In fact, they continue to grow. What I call the era is due to the heavy investment in the technology development, and major adoption and market growth.
To me, this year marked the 1st anniversary of the beginning of the XML database era as DB2 9 for z/OS with native XML support was GAed last year. Many vendors and customers share the same view. Last year, we had ISVs and cutting edge customers telling us they felt that XML was going to be big and looked same as the introduction of relational databases. This year, more customers said to me that they better learn more about XML and prepare for the inevitable adoption of DB2 XML database technology.
Do you share the same view? If you have doubt, do some investigation, or wait for history to tell if it's true or not. But I'm a total believer!
-Guogen (Gene) Zhang (GGZ)[Read More
We all know the benefits of Web services and Service-Oriented Architecture (SOA). One of the benefits is to revitalize and enable existing applications or functionality as components to participate in new applications, extending the life and value of "legacy" applications. To make DB2 applications a natural part of this architecture, DB2 provides two SOAPHTTP functions as UDFs - SOAPHTTPNV and SOAPHTTPNC (replacing SOAPHTTPV and SOAPHTTPC) to invoke web services. With SQL XML functions that handle XML data, an SQL/XML statement can consume XML responses from SOAP Web services directly.
An SQL/XML Example
Here is an example to get stock quote from a free public web service and to return a row:
SELECT *FROM XMLTABLE('/StockQuotes/Stock' PASSING XMLPARSE(DOCUMENT XMLCAST( XMLQUERY('declare namespace soap="http://schemas.xmlsoap.org/soap/envelope/"; declare default element namespace "http://www.webserviceX.NET/";/soap:Envelope/soap:Body/GetQuoteResponse/GetQuoteResult'PASSING XMLPARSE(DOCUMENTDB2XML.SOAPHTTPNC( 'http://www.webservicex.net/stockquote.asmx', 'http://www.webserviceX.NET/GetQuote', '<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"> <soap:Body> <GetQuote xmlns="http://www.webserviceX.NET/"> <symbol>IBM</symbol> </GetQuote> </soap:Body></soap:Envelope>') ) ) as VARCHAR(2000)) ) COLUMNS "Symbol" VARCHAR(4), "Last" DECIMAL(6,2), "Date" VARCHAR(10), "Time" VARCHAR(8), "Change" VARCHAR(8), "Open" VARCHAR(8), "High" VARCHAR(8), "Low" VARCHAR(8), "Volume" VARCHAR(12) ) XT#
It returns a row like the following:
Symbol Last Date Time Change Open High Low VolumeIBM 84.70 1/9/2009 4:00pm -2.48 87.05 87.50 84.25 7440555 1 record(s) selected
You can combine the result with any complex joins to other data you can imagine without application programming in lower-level languages to consume the web services. It not only provides convenience with improved productivity, but also better performance in consuming web services, especially when you need to combine web service result with other DB2 data.
Setup of the SOAPHTTP functions
The two SOAP web service consumer UDFs SOAPHTTPNV and SOAPHTTPNC were provided through APAR PK48773, which had an APAR PK73324 PTF UK42369 for DB2 9. Also there is an APAR PK70932 PTF UK40600 for SQL XMLTABLE function. Apply the PTFs and then use the SQL DDLs like the following to create the UDFs with your WLM environment (WLMENV8 used below) and timeout limit (10 seconds below). For the complete four CREATE statements, see this file (CreateAlterSOAPUDFs.txt).
CREATE FUNCTION DB2XML.SOAPHTTPNV ( ENDPOINT_URL VARCHAR(256), SOAP_ACTION VARCHAR(256), SOAP_INPUT VARCHAR(32672)) RETURNS VARCHAR(32672)LANGUAGE CSPECIFIC SOAPHTTPNVIVOEXTERNAL NAME DSNWSCVVPARAMETER STYLE DB2SQLPARAMETER CCSID UNICODEPARAMETER VARCHAR STRUCTURESCRATCHPADFINAL CALLFENCEDNOT DETERMINISTICCALLED ON NULL INPUTNO SQLEXTERNAL ACTION RUN OPTIONS 'POSIX(ON),XPLINK(ON),ENVAR("DB2SOAP_TIMEOUT=10")'WLM ENVIRONMENT WLMENV8DBINFO;
After creation, if you need to modify the timeout limit, you can execute an ALTER statement like the following to increase the timeout to 20 seconds, for example:
ALTER SPECIFIC FUNCTION DB2XML.SOAPHTTPNVIVO RUN OPTIONS 'POSIX(ON),XPLINK(ON),ENVAR("DB2SOAP_TIMEOUT=20")';
See the same file for the other ALTER DDL examples.
Revisiting the Example
If you look at the above example carefully, you may wonder why it has two levels of XMLPARSE and processing using XMLQUERY and XMLTABLE. Let's examine the response example:
<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><GetQuoteResponse xmlns="http://www.webserviceX.NET/"><GetQuoteResult><StockQuotes><Stock><Symbol>IBM</Symbol><Last>84.70</Last><Date>1/9/2009</Date><Time>4:00pm</Time><Change>-2.48</Change><Open>87.05</Open><High>87.50</High><Low>84.25</Low><Volume>7440555</Volume><MktCap>113.8B</MktCap><PreviousClose>87.18</PreviousClose><PercentageChange>-2.84%</PercentageChange><AnnRange>69.50 - 130.93</AnnRange><Earns>8.483</Earns><P-E>10.28</P-E><Name>INTL BUSINESS MAC</Name></Stock></StockQuotes></GetQuoteResult></GetQuoteResponse></soap:Body></soap:Envelope>
If you look closely, the soap:Body contains some XML data as text. The GetQuoteResult element contains many escaped entity references. In order to make sense of it, we need to cast it to a string, instead of serializing it, which is achieved by XMLCAST( . as VARCHAR(2000)). After casting, the content looks like the following, the plain XML data:
<StockQuotes> <Stock> <Symbol>IBM</Symbol> <Last>84.70</Last> <Date>1/9/2009</Date><Time>4:00pm</Time> <Change>-2.48</Change> <Open>87.05</Open> <High>87.50</High> <Low>84.25</Low> <Volume>7440555</Volume> <MktCap>113.8B</MktCap> <PreviousClose>87.18</PreviousClose> <PercentageChange>-2.84%</PercentageChange> <AnnRange>69.50 - 130.93</AnnRange> <Earns>8.483</Earns> <P-E>10.28</P-E> <Name>INTL BUSINESS MAC</Name> </Stock></StockQuotes>
Once you reveal the true value, you can perform XMLPARSE and XMLTABLE again, as shown in the above query example, to extract the row out of this data. This example also illustrates the difference between XMLSERIALIZE and XMLCAST.
The above example shows how easy you can invoke and consume web services in the same SQL statement. To see how to make DB2 SQL statements, stored procedures and applications web service providers, look into IBM Data Studio tooling for Data Web Services. You will be surprised how easy they are. Hope you will like the features.
Today is Chinese New Year's day for the year of ox. I wish you a Happy Niu Year! 恭喜发财!
-Guogen (Gene) Zhang (GGZ)[Read More
DB2 for z/OS clients are very CPU-conscious. So are we the developers in DB2, and performance has been one of the most critical factors during our design and development for XML support. We probably have one of the most efficient XML parsers implemented in software in the world, co-designed and custom-made by our z/OS friends; and we have a high-performance XML schema validating parser also. As a result, DB2 for z/OS has very efficient insert, load, and update. (Deletion and retrieval of XML documents perform very well too but do not need XML parsing).
To reduce the CPU cost, DB2 workloads from DRDA can benefit from zIIP specialty engines, and XML parsing can now be redirected to zIIP or zAAP. I hope to provide some details here.
As part of DB2 workload from DRDA, XML processing is eligible for zIIP redirection, except for XML validation (that is a UDF not eligible for zIIP for the time being). In addition, recent enhancements made the z/OS XML parsing system service 100% eligible for redirection to zIIP or zAAP, depending on the mode under which it is running:
- For the z/OS XML parsing system service invoked from the TCB mode, it's 100% eligible for zAAP redirection.
- And for the z/OS XML parsing system service invoked from the enclave SRB mode, it's 100% eligible for zIIP redirection.
Before redirection of the z/OS XML parsing system service, DB2 9 XML workload (including XML parsing system service) from DRDA via TCPIP in the following scenarios run under the SRB mode is zIIP eligible:
- WebSphere for z/OS (JCC T4)
- WebSphere distributed (JCC T4)
- Distributed transactions using DB2 Connect
- Native SQL Stored Procedure
In addition, Index Build phase of LOAD is specially executed under SRB mode to make it eligible for zIIP also (applicable to XML indexes, but no XML parsing involved).
Now with the redirection of z/OS XML parsing system service, the XML parsing invoked for the above scenarios may qualify for additional redirection at 100% rate. For example, a workload that used to have 49.6% redirected to zIIP can now have 56.60% redirected to zIIP.
For DB2 9 XML workload in the following scenarios not from DRDA run under the TCB mode, z/OS XML parsing system service is 100% eligible for zAAP redirection:
- WebSphere® for z/OS (JCC T2)
- Stored Procedure/UDF/Trigger
- Native SQL PL stored procedure
- Call Attach (CAF)
- RRS Attach
- LOAD utility
In addition, XML parsing system service invoked by a stored procedure/UDF/trigger from DRDA via TCPIP run under the TCB mode is now also 100% eligible for zAAP. For example, we've observed up to 36% or even 48% of XML LOAD CPU redirected to zAAP. This is the first time that part of a DB2 workload can be redirected to a zAAP (through the system services).
DB2 9 APAR PK50575 adds support for accounting zAAP redirection in addition to that already provided for zIIP, and records CPU time on zAAP and zIIP engines when they exist.
In order to estimate redirection, RMF workload activity report will show specialty engine eligible CPU through APPL% AAPCP and APPL% IIPCP, when setting PROJECTCPU = YES in IEAOPTxx member in SYS1.PARMLIB.
Relevant z/OS levels and APARs to enable XML parsing system services 100% eligible for specialty engines are the following:
- zAAP - z/OS V1.9, APAR OA20308 for z/OS V1.8, V1.7.
- zIIP - z/OS V1.10, APAR OA23828 for z/OS V1.8, V1.9.
For more details about zIIP redirection and some XML performance numbers, see this whitepaper: DB2 9 and z/OS XML System Services Synergy Update.
At the moment, DB2 9 for z/OS XML schema validation does not invoke z/OS system services available in z/OS V1.10. As a future direction, when DB2 XML validation is enhanced to invoke the system services, it will be 100% eligible for zIIP and zAAP redirection also.
In summary, on z/OS, XML parsing is extremely efficient, and can be redirected 100% to zIIP or zAAP. DB2 XML processing takes advantage of the redirection to lower CPU cost.
To make it easier for DB2 for z/OS friends to find pureXML information specific to DB2 for z/OS, we've just created a DB2 for z/OS pureXML wiki page under DB2 XML wiki page to list z/OS-specific information. It may contain cross-references to other areas, and has initial content right now. We expect the content to accumulate over the time.
In the new wiki page, I've just uploaded DB2 9 for z/OS pureXML podcast series, recorded by Guogen Zhang (that's me :-) ) in 2008. This is a 10-part introductory podcast series on pureXML business value and technical knowledge. The first part covers overview and business value, and the follow-on 8 parts cover different aspects of pureXML, from query, schema, to utilities and performance. The last part talks about best practices. Each part is about 10-15 minutes. The podcast is in mp3 audio format together with transcripts in PDF.
Give us feedback on whether this introductory podcast series is useful to you. If so, we will prepare follow-on podcast for more advanced topics of your interests. Thank you.
Check back the DB2 for z/OS pureXML wiki page often. You never know what will pop up there.
-Guogen (Gene) Zhang[Read More
Since DB2 9 for z/OS beta time in early 2006, I have been frequently asked by customers the question of the value of pureXML in DB2. Although more and more folks now realize the value of XML support in DB2, I'd like to touch upon this topic here.
XML is ubiquitous and is being used to represent all kinds of data in industries, government, and academics etc. If you are an application developer in the enterprise environment dealing with vastly increasing XML data asset and ever-changing business demand and are dissatisfied with the limitations of traditional relational databases, DB2 9 for z/OS provides XML management capability as well as relational data management. Unlike XML-enabled databases and other database offerings that rely on Large Object storage or transform XML into relational data, the pureXML technology is specifically designed and optimized for XML hierarchically structured data, and provides proven enterprise-class reliability, availability, scalability, performance and security for XML data that you have come to expect from DB2 for z/OS.
Since the XML data model is hierarchical, the same as IMS. Quite often, people ask what the differences are between XML and IMS. I think there are two: XML data is very flexible while IMS data structures are rigid; and XML has high-level query languages while IMS has procedural DL/I as its native language (although our IMS friends provided SQL and XQuery interface). SQL/XML with XPath provided in DB2 9 for z/OS makes application processing XML much easier.
We've seen the following XML usage scenarios in DB2, with more and more interesting application scenarios popping up. So this list won't be complete. Let us know if you have interesting scenarios to share.
- The first and most direct case is to process XML data, including industry standard XML format (such as FIXML, FpML, ACORD, UNIFI, MISMO etc.), forms and reports (such as XBRL), you can store it, and retrieve it, just like relational data. If you use COBOL or PL/I, to make up the weak XML processing capability in these languages, you can have applications to invoke SQL to process XML data. If you need to connect to the existing back-end systems, you can use the XMLTABLE function to convert XML data into relational views, while you can develop new applications on the XML data.
- Second, pureXML can help develop applications that handle versatile schemas that change frequently, and also help develop end-user customizable applications, which is particularly important for ISVs and IT service departments in large enterprises, where you frequently need to adjust the applications for end-users for diverse information, such as product specification and customer information.
- The third application scenario is to process sparse attribute values, such as medical records, or forms, where there are many fields overall, but only a few of them are applicable for each case. If you use relational approach, you would have to use a table very wide, with many many columns, but most of the column data are null. And you can use one XML column to handle that because XML can hold as many items as you want in a single document.
- The next usage is for object persistence. If you want to persist objects in an application, if you use a relational database, you would usually normalize objects into multiple tables, it's like you disassemble a car when you park in your garage overnight, certainly not convenient. Now with XML, it's much more flexible, you can use a single column to contain at least hiearchical data. More importantly, you can create indexes on these persistent objects. Unlike LOBs, you have no way to do that. So that makes XML for persistent data much more efficient when you need to search. Do you ever have an experience that you wish you could store an array into a single column yet you still can search it? Now you can achieve that using XML. In an extreme case, you could design all the tables with a primary key and a single XML column.
- Yet another application scenario is to migrate from legacy data models, such as network or hieararchical data model. If you migrate a hiearchical data model to a relational model, you need to introduce artificial keys, but with XML you don't need that, it's straight-forward, and you will have benefit of a high-level declarative query language.
- Next application scenario is to generate Web pages because you can use XHTML for web pages that can be generated directly from an SQL/XML statement.
- Last but not least, you can develop web services using DB2 applications to provide or consume web services directly because the web services use XML data. XML support in DB2 enables end-to-end XML solutions in an SOA environment.
In summary, DB2 pureXML makes XML data consumable, and provides the following business value:
- First, it will accelerate application development, reduce system complexity, and improve developer productivity. This will lead to improved time-to-market and reduce IT backlog.
- Second, it increases business agility, it will be able to help develop end-user customizable applications, easily accommodate the changes to data and schemas, and update applications rapidly and reduce the maintenance cost.
- Third, it can improve the business insight, help you develop applications that access information in otherwise unexploited documents, including for business intelligence and business monitoring.
- Last, you no longer need to store XML in a separate system, you can consolidate system resources onto System z, to reduce floor spaces, lower the energy consumption and people cost, and also use specialty engines for XML processing with low CPU cost, and increase security of critical XML data and simplify regulatory compliance.
If you have projects that you feel will benefit from pureXML, I urge youto consider starting with a pureXML PoT I mentioned last week.
The views expressed here are mine and do not necessarily reflect IBM's official position. Contact me at gzhang at us.ibm.com if you'd like a private conversation on the topic.
Migration of DB2 for z/OS systems to a new release is a very important task that requires careful planning and systematic execution, which takes time. Yet pureXML in DB2 9 for z/OS is very attractive to application developers, but it requires DB2 9 NFM. If you have a need to start an application development using pureXML and cannot seem to wait for the DB2 systems to get to DB2 9 NFM, you may want to consider the following option (which to many is not news): developing the new application using DB2 Express-C using the pureXML features and IBM Data Studio on a Windows or Linux platform (both are no charge). You may get in sync with the migration to DB2 9 NFM for application system testing, performance testing, and deployment. Here is some more info.
DB2 Express-C is a no-charge community edition of the DB2 for LUW data server. It can be downloaded from DB2 Express-C download. Before you start, you may want to look at the Getting Started book.
You also need to download the right version of DB2 Express-C for your platform. If you have a Windows platform, look at this page to determine if your Windows is 32-bit or 64-bit. If the properties of your computer does not say 64-bit, then it's likely to be 32-bit.
You also need an IBM ID to download for license agreement. Getting an IBM ID is technically very easy. Use IBM Installation Manager to download and install, which is faster. After download, you need to unzip it and run the setup.exe or something like that. It should be fine to use the typical installation type. Extremely easy :-).
IBM Data Studio Developer
To develop using SQL/XML, Java, stored procedures, and pureQuery, you can start with the Data Studio Developer no-charge features. After you install DB2 Express-C and launch it, you can download Data Studio Developer and install it. The link is listed on the DB2 Express-C download page also. Again, it's straight-forward.
Using Data Studio Developer to explore SQL/XML queries
This is my Windows XP experience. Launch the Data Studio Developer, you will see the SAMPLE database already listed in the Data Source Explorer, if you launched DB2. Right click on the SAMPLE DB, and choose "connect". Use your Windows username and password to connect. Once it's connected, you can look into "Schemas", and under your username, find "Tables". There are a few tables in the SAMPLE DB that contains an XML column: CUSTOMER, PRODUCT, and PURCHASEORDER that contains some sample data. You can list the table content by right click on the tablename -> "Data" -> "Returns all rows" or "Sample content", to display table content.
From SAMPLE DB name, right click on it -> "New SQL Script", you can copy and paste an SQL/XML query into the window. Then Choose main menu "Run" -> "Run SQL". The status and result will show. Try this query:
SELECT *FROM PURCHASEORDERWHERE XMLEXISTS('declare default element namespace "http://posample.org"; $po/PurchaseOrder[@Status="Unshipped"]' PASSING PORDER AS "po")
Remember, in order to move the queries and applications to z/OS easily, you need to stick to the syntax and scope that is supported by z/OS pureXML (and also by LUW). One of the key detailed differences is that in LUW, you must use a variable, such as $po shown above, as the starting context item for an XPath path expression, which is optional for z/OS. This is consistent for XMLQUERY, XMLEXISTS, and XMLTABLE functions.
Once you see the XML column data, you can click on the icon, and an XML editor window will open.
You could open a new Data Project, from main menu "File" -> "New" -> "Data development Project". Say you created Project1. Within Data Project Explorer, under Project1, right clock on "SQL Scripts" -> "New" -> "SQL or XQuery Script", you can get a SQL Script window also.
If you need to have a Java project, you need to create a project with the corresponding type: from main menu "File" -> "New" -> "Project", a dialogue window will open...
I hope you will like the combination of Data Studio with pureXML, if you haven't tried yet. There are much more features to explore in IBM Data Studio.
The XMLTABLE() function can be said to be the most powerful and versatile function in SQL/XML. In one of the previous postings, I listed some highlights on using the XMLTABLE function. Here I give some simple examples, especially cascading XMLTABLE functions in an SQL query.
For illustration, assume that we have the following table and have inserted one document below, which is very typical in the real world with header and body in a form.
CREATE TABLE MYTABLE (XDOC XML);
INSERT INTO MYTABLE VALUES('<form> <formheader id="1234"> <head1>head1</head1> <head2>head2</head2> </formheader> <formdata> <data1 did="d1234"> <data11>data11</data11> <data12>data12</data12> </data1> <data2 did="d2345"> <data21>data21</data21> <data22>data22</data22> </data2> <data3 did="d3456"> <data31>data31</data31> <data32>data32</data32> <data33>data33</data33> </data3> </formdata></form>');
Assume that we want to present the data in this document as a table, with data of level 1 under "formdata" as the rows, combined with some data from the header. The key to the XMLTABLE function is that the row-expression identifies nodes for the rows. The data from the header will be repeated for each row. The first XMLTABLE query using direct navigational paths looks like the following:
SELECT TX.*FROM MYTABLE, XMLTABLE('/form/formdata/*' PASSING XDOC COLUMNS ID VARCHAR(5) PATH '../../formheader/@id', HEAD1 VARCHAR(5) PATH '../../formheader/head1', NAME VARCHAR(8) PATH 'fn:local-name()', DATA XML PATH '.') TX;
Here is the result:
ID HEAD1 NAME DATA1234 head1 data1 <data1 did="d1234"><data11>data11</data11><data12>data12</data12></data1>1234 head1 data2 <data2 did="d2345"><data21>data21</data21><data22>data22</data22></data2>1234 head1 data3 <data3 did="d3456"><data31>data31</data31><data32>data32</data32><data33>data33</data33></data3>
To avoid repeated evaluations of column expressions for the header data, the following XMLTABLE query avoids the parent axis using two cascading XMLTABLE functions:
SELECT TX1.ID, TX1.HEAD1, TX.*FROM MYTABLE, XMLTABLE('/form' PASSING XDOC COLUMNS ID VARCHAR(5) PATH 'formheader/@id', HEAD1 VARCHAR(5) PATH 'formheader/head1', FORMDATA XML PATH 'formdata') TX1, XMLTABLE('$fd/*' PASSING TX1.FORMDATA as "fd" COLUMNS NAME VARCHAR(8) PATH 'fn:local-name()', Data XML PATH '.') TX;
It results in the same result as above.
Now let's assume we want to lower the level for the rows to the level 2 data under "formdata". We only need to get to the next level for the row expression by appending "/*" to the row path expression. Note that since each element at the data level has a unique name, the queries here illustrate the legitimate use of "*" - the wild card name test.
SELECT TX1.ID, TX1.HEAD1, TX.*FROM MYTABLE, XMLTABLE('/form' PASSING XDOC COLUMNS ID VARCHAR(5) PATH 'formheader/@id', HEAD1 VARCHAR(5) PATH 'formheader/head1', FORMDATA XML PATH 'formdata') TX1, XMLTABLE('$fd/*/*' PASSING TX1.FORMDATA as "fd" COLUMNS NAME VARCHAR(8) PATH 'fn:local-name()', Data XML PATH '.') TX;
The result looks like the following:
ID HEAD1 NAME DATA1234 head1 data11 <data11>data11</data11>1234 head1 data12 <data12>data12</data12>1234 head1 data21 <data21>data21</data21>1234 head1 data22 <data22>data22</data22>1234 head1 data31 <data31>data31</data31>1234 head1 data32 <data32>data32</data32>1234 head1 data33 <data33>data33</data33>
What if we want to get "@did" from the level 1 data as a column of the rows? We can certainly add "../@did" as a column expression. But we can avoid the parent axis by introducing another XMLTABLE function. Can you try to write the query yourself?
-Guogen (Gene) Zhang (GGZ)[Read More
In the XMLQuickDemo, I illustrated some simple DDL statements and SQL/XML queries. In the second CLP file (2ListObjects.CLP), I tried to list some database objects related to an XML column. I was a bit lazy for the first query used to find the tables created, as I knew the implicitly created XML table name is "X"+<base table name> for the first XML column (this is only true for the first XML column of the table and the table name is shorter than 18 bytes). So I used this:
WHERE NAME LIKE '%PUR%' AND CREATOR = CURRENT SQLID;
as the WHERE predicate to find the tables just created, including the base table PURCHASEORDERS and internal XML table XPURCHASEORDERS. However, the right query should use the catalog table SYSIBM.SYSXMLRELS to find the XML table for an XML column. Let's assume I have the following two DDL statements to create a table and an XML index, and my SQL ID is DDS0075:
CREATE TABLE MYTABLE(XDOC XML);CREATE INDEX MYINDEX ON MYTABLE(XDOC) GENERATE KEYS USING XMLPATTERN '/form/formdata/data1/*' AS SQL VARCHAR(20);
I should use the following query to find the XML table corresponding to the XML column(s):
SELECT SUBSTR(XMLTBOWNER, 1, 8) as CREATOR , SUBSTR(XMLTBNAME,1,20) as NAME , T.TYPE , T.DBNAME , T.TSNAME , T.COLCOUNTFROM SYSIBM.SYSXMLRELS XT , SYSIBM.SYSTABLES TWHERE XT.TBNAME = 'MYTABLE' -- table name AND XT.TBOWNER = CURRENT SQLID -- creator name AND T.NAME = XT.XMLTBNAME AND T.CREATOR = XT.XMLTBOWNER;
The SUBSTR function is used to shorten the result length for display. The result looks like the following:
CREATOR NAME TYPE DBNAME TSNAME COLCOUNTDDS0075 XMYTABLE P DSN01018 XMYT0000 3 1 record(s) selected
In DB2 9, by default, each table will be created in its own database (when the tablespace is not specified also, see the table below). The XML tables related to the base table will be always in the same database as the base table. It may not be the best to use this default feature, but for simplicity, I use this for the example. Here are more query examples on the catalog tables to find information about other XML objects and their sample results. They are self-explanatory, except for the characters for the types:
- 'P' for both XML table and XML table space;
- 'N' for NodeID index, and
- 'V' for XML Value index.
- (DOCID index is a regular index on the base table.)
ECHO *****************************************************************;ECHO * GET INFO ABOUT XML TABLE SPACES AND STOGROUPS *;ECHO *****************************************************************;SELECT TP.DBNAME , TP.TSNAME , TP.STORNAME , TP.PQTY , TP.SQTYFROM SYSIBM.SYSXMLRELS XT , SYSIBM.SYSTABLES TB , SYSIBM.SYSTABLEPART TPWHERE XT.TBNAME = 'MYTABLE' -- table name AND XT.TBOWNER = CURRENT SQLID -- table creator AND TB.NAME = XT.XMLTBNAME AND TB.CREATOR = XT.XMLTBOWNER AND TP.TSNAME = TB.TSNAME AND TP.DBNAME = TB.DBNAME;DBNAME TSNAME STORNAME PQTY SQTYDSN01018 XMYT0000 SYSDEFLT -1 -1 1 record(s) selected
ECHO ********************************************************************;ECHO * GET INFO ABOUT INDEXES, INDEX SPACES AND STOGROUPS ON BASE TABLE *;ECHO ********************************************************************;SELECT SUBSTR(IX.TBNAME,1,20) as TBNAME , SUBSTR(IX.NAME,1,26) as IXNAME , IX.IX_EXTENSION_TYPE as IXTYPE , IX.DBNAME , IX.INDEXSPACE , IP.STORNAMEFROM SYSIBM.SYSINDEXES IX , SYSIBM.SYSINDEXPART IPWHERE IX.TBNAME = 'MYTABLE' -- base table name AND IX.TBCREATOR = CURRENT SQLID -- table creator AND IX.NAME = IP.IXNAME AND IX.TBCREATOR = IP.IXCREATOR;-- INCLUDES DOCID INDEX, BUT NOT THE NODEID INDEX:TBNAME IXNAME IXTYPE DBNAME INDEXSPACE STORNAMEMYTABLE I_DOCIDMYTABLE DSN01018 IRDOCIDM SYSDEFLTMYTABLE MYINDEX V DSN01018 MYINDEX SYSDEFLT 2 record(s) selected
ECHO ********************************************************************;ECHO * GET INFO ABOUT INDEXES, INDEX SPACES AND STOGROUPS ON XML TABLE *;ECHO ********************************************************************;SELECT SUBSTR(IX.TBNAME,1,20) as TBNAME , SUBSTR(IX.NAME,1,26) as IXNAME , IX.IX_EXTENSION_TYPE as IXTYPE , IX.DBNAME , IX.INDEXSPACE , IP.STORNAMEFROM SYSIBM.SYSXMLRELS XT , SYSIBM.SYSINDEXES IX , SYSIBM.SYSINDEXPART IPWHERE XT.TBNAME = 'MYTABLE' -- base table name AND XT.TBOWNER = CURRENT SQLID -- table creator AND IX.TBNAME = XT.XMLTBNAME AND IX.TBCREATOR = XT.XMLTBOWNER AND IX.NAME = IP.IXNAME AND IX.TBCREATOR = IP.IXCREATOR;-- NODEID INDEX ONLYTBNAME IXNAME IXTYPE DBNAME INDEXSPACE STORNAMEXMYTABLE I_NODEIDXMYTABLE N DSN01018 IRNODEID SYSDEFLT 1 record(s) selected
-- ALL THE TABLES IN A DATABASE:SELECT SUBSTR(CREATOR, 1, 8) as CREATOR , SUBSTR(NAME,1,20) as NAME , T.TYPE , T.DBNAME , T.TSNAME , T.COLCOUNTFROM SYSIBM.SYSTABLES TWHERE DBNAME = 'DSN01018';
CREATOR NAME TYPE DBNAME TSNAME COLCOUNTDDS0075 MYTABLE T DSN01018 MYTABLE 2DDS0075 XMYTABLE P DSN01018 XMYT0000 3 2 record(s) selected
-- ALL THE INDEXES IN A DATABASE:SELECT SUBSTR(IX.TBNAME,1,20) as TBNAME , SUBSTR(IX.NAME,1,26) as IXNAME , IX.IX_EXTENSION_TYPE as IXTYPE , IX.DBNAME , IX.INDEXSPACE , IP.STORNAMEFROM SYSIBM.SYSINDEXES IX , SYSIBM.SYSINDEXPART IPWHERE IX.DBNAME = 'DSN01018' AND IX.NAME = IP.IXNAME AND IX.TBCREATOR = IP.IXCREATOR;
TBNAME IXNAME IXTYPE DBNAME INDEXSPACE STORNAMEMYTABLE I_DOCIDMYTABLE DSN01018 IRDOCIDM SYSDEFLTXMYTABLE I_NODEIDXMYTABLE N DSN01018 IRNODEID SYSDEFLTMYTABLE MYINDEX V DSN01018 MYINDEX SYSDEFLT 3 record(s) selected
Please note that, as you may well know, for utilities, there is a better way to define the relevant objects using LISTDEF. You can use REPORT TABLESPACESET to list relevant objects too. The above queries can help you to find specific information. If you need to find the storage space usage, you can run the STOSPACE utility on STOGROUPs, and then add the space used by XML objects. Or better yet, if you group tables with database as units, you can calculate the space based on DBNAME to include XML object space automatically. For example, the following query displays the number of table spaces and also the corresponding total storage space, including XML table spaces, for each of the databases:
SELECT DBNAME, COUNT(*), SUM(SPACE) -- use SPACEF if necessaryFROM SYSIBM.SYSTABLESPACEGROUP BY DBNAME;
If you just want to find XML table space information, use this:
SELECT DBNAME, COUNT(*), SUM(SPACE)FROM SYSIBM.SYSTABLESPACEWHERE TYPE = 'P'GROUP BY DBNAME;
To help you understand how implicitly created XML objects impact your naming convention for physical database design and administration, the following table lists some of the attributes related to XML objects under different CREATE TABLE options (probably not exhaustive). The short notations used in the table are the following:
NM: Name, TS: Table Space, ST: Stogroup, BP: Buffer pool, TB: Table, IS: Index Space, G: Generated. Deflt: Default from subsystem parameters.
From this table, we can see that to control the storage allocation with the benefit of database grouping, better use the first two forms of the CREATE TABLE statements, which is probably your normal cases. To avoid concurrency issues related to DBD (Database Descriptor), a database should not include too many objects too. Also note that the XML indexes are associated with the base table in the catalog, even though they are associated with the XML table internally (reflected in OBD - Object Descriptor).
DB2-created XML objects is somewhat a departure from the previous practices. The main reason is for convenience of ISV packaged applications, and a direction for DB2 DDL to focus more on tables, less on table spaces and stogroups. I hope that this new convention does not introduce too much inconvenience for you, once you understand the relationships. Share your experiences with us.
On this past Monday and Tuesday, I attended the first DB2 9 for z/OS pureXML Proof of Technology (PoT) event in St. Louis, organized by Paul Bartak, one of our Senior Executive IT Specialists in the field, at the IBM Customer Briefing Center facility near St. Louis International Airport.
I stayed there for two days. The first day is for a group of IBMers (with one customer who couldn't come the second day). We listened to Paul's presentation and exercised through the labs, learning pureXML and testing the PoT. I had fun setting up the network for ThinkPad's with VMWare, and testing connections to the DEMOnet DB2 9 for z/OS on a native z machine at Austin in the morning. I was amazed that a bunch of software guys could handle the hardware and network wiring without too much trouble (well, with some remote assistance). We even identified a malfunctioning 5-port Ethernet switch.
On the second day, there were 12 participants, from nearby three DB2 for z/OS clients, including a university from Illinois. They are DBAs, application architects and developers, and data architects. Paul gave a presentation, about 75 minutes, covering the basics of XML and pureXML features, to prepare for the labs. Then everyone went through the labs on one's own pace. I was impressed how fast the participants did. It took me about 3.5 hours to finish the lab on my first-day trial. But I saw many of these customers finished the lab in a shorter time than I did! One DBA told me that she really liked it. She had the concept and now with the lab, it's really flowing. Oh, she thanked me for delivering XML capabilities in DB2 for z/OS.
I had chance to chat with some of the participants on their intended use of pureXML. They were thinking to use XML to keep logs for auditing, store diverse system configuration information, or store purchase orders, or even for displaying data structures during debugging, etc. And I also learned some features that would be nice to have in the future.
This pureXML PoT uses a real DB2 9 for z/OS system, and is unique in the following aspects:
- It covers all the pureXML main features, including DBA tasks, SQL/XML queries with XPath, schema validation and decomposition, with real examples.
- It uses sample tables to generate XML data so there is no need to prepare XML data from other places.
- It uses the familiar tools, SPUFI, SDSF, CLP and also IBM Data Studio for schemas, and OSC for query plans and tuning.
- It includes all the DDL, Schemas, and Queries so you can work further on your own after the PoT.
We are lucky to have someone like Paul, who knows what customers need, understands the technology, and put lot of efforts to deliver a PoT that people can really learn to understand the technology. I'd like to thank Mark Wilson for his contribution to this PoT also.
This pureXML PoT will be offered in different areas. If you are interested in (having your folks) attending such pureXML PoT, please contact your local DB2 advisor, or contact me or Paul. It will also be offered as a Hands-on Lab during IOD 2008 Conference at Las Vegas,
Session: HOL-2584A DB2 9 for z/OS pureXML for DBAs
Time: Wed, 29/Oct, 10:00 AM - 01:00 PM
Location: Mandalay Bay South Convention Center - Lagoon D
Take note if you are attending IOD and interested in playing pureXML on DB2 9 for z/OS there.
--Guogen (Gene) Zhang (GGZ)[Read More
On this last day of year 2008, I'd like to share with you on how easy it is to generate XML from relational data using SQL in DB2 V8 and DB2 9 for z/OS.
To start with, I'd like to show a simple example of SQL functions that generates the following XML:
<name> <firstname>John</firstname> <lastname>Doe</lastname></name>
The SQL functions to construct this XML is the following:
XMLELEMENT(NAME "name", XMLELEMENT(NAME "firstname", 'John'), XMLELEMENT(NAME "lastname", 'Doe') )
As you can see, the SQL functions nest exactly corresponding to the XML data nesting structure. These functions are standard SQL XML constructors or sometimes called XML publishing functions. Notice that the "NAME" keyword is required by the standard to indicate that an SQL ID ("name") is used as a name for the element, as opposed to using the value of a column ("name") in other SQL functions. Double quotes are necessary to keep the name case-sensitive.
In the next example, I'd like to show how you can group a set of elements into one XML value and nest under an element using XMLAGG. This query assumes you have an employee table:
+---------+--------------+------------+--------+ | dept | firstname | lastname | ... | +---------+--------------+------------+--------+ Shipping White Smith ... Shipping Paul Taylor ... Shipping John Doe ... ...
And you want to group employees under a department element like the following:
<Department name="Shipping"> <Emp> <firstname>John</firstname> <lastname>Doe</lastname> </Emp> <Emp> <firstname>White</firstname> <lastname>Smith</lastname> </Emp> <Emp> <firstname>Paul</firstname> <lastname>Taylor</lastname> </Emp></Department>...
The query to generate this XML data is as follows:
SELECT XML2CLOB( XMLELEMENT(NAME "Department", XMLATTRIBUTES (e.dept AS "name" ), XMLAGG(XMLELEMENT(NAME "Emp", XMLELEMENT(NAME "firstname", e.firstname), XMLELEMENT(NAME "lastname", e.lastname) ) ORDER BY e.lastname ) ) ) AS "dept_list"FROM employees e GROUP BY dept;
This is what people typically call "de-normalization" (of relational data). Notice that XMLAGG takes ORDER BY clause to specify the order of elements within a group. In this example, it's to order the elements based on the lastname. This query should work on both V8 and V9. In DB2 9, additional constructors are provided to make them complete. V8 queries continue to work in DB2 9.
There are many advantages of using SQL to generate XML from relational data. Here are a few of them:
- It is easy to extend existing COBOL or PL/I applications to generate XML by using SQL. It works for non-Unicode data.
- It is much easier to generate XML than using DOM in programming languages like Java and C/C++, enabling higher productivity.
- It makes applications light-weight, with high-performance. It delivers the XML value in one chunk, instead of piece-by-piece.
What is a QName?
Before listing some additional resources, I'd like to trickle in some terminology related to XML for beginners (please skip if you already know). An element or attribute has a name, officially called a QName, for Qualified Name, such as "hr:name", where "hr" is a prefix and "name" is a local name. Both prefix and local name are NCNames, for No-Colon Names (names without a colon in them). The prefix is for the namespace, and needs to be declared with a namespace declaration like the following:
<hr:name xmlns:hr="http://www.example.com/humanresource"> ... </hr:name>
Where "xmlns" signifies a namespace declaration, and "hr" is the namespace prefix being declared. The value "http://www.example.com/humanresource" is a namespace name or namespace URI. The reasons that XML namespace is designed this way are the following:
- To avoid conflict (the purpose of a namespace), it has to be some unique identifiers across the Web. That will be the URI (Uniform Resource Identifier).
- A URI cannot be used directly in an element name or attribute name because it does not follow the syntax rules for element/attribute names in general.
- A prefix is used instead to make it follow the name syntax rules, and it is shorter usually.
An element or attribute is said to be in the scope of a namespace if it is between the start tag (inclusive) containing the namespace declaration and the corresponding end tag. And an element has a set of in-scope namespaces. A namespace declaration is said to be superfluous if the namespace being declared is already in the scope.
An NCName becomes a QName if the prefix is empty, that is, either there is no namespace (No Namespace is used), or a default namespace is used (a namespace is bound to the empty prefix). A colon is not used in these cases. The following example illustrates a default namespace declaration:
<name xmlns="http://www.example.com/humanresource"> ... </name>
For more SQL/XML constructor details and examples on generating XML from relational data, please look at this "old" presentation of mine from IM Tech Conference '05: Generating XML from Relational Data Using SQL with Ease. Some of the examples are included in the XMLQuickDemo 7XMLGEN1.CLP and 8XMLGEN2.CLP. I've also made XML4TABLE package public. This package contains two SQL PL stored procedures for DB2 9 for z/OS: XML4TABLE and XML4ENTITY1, which can generate XML for an entire table or a given entity from multiple RI-related tables. Read the readme.txt file in the package for details. Let me know if you have any feedback.
Also, check IBM Data Studio and Rational Data Architect for tool support in generating SQL/XML queries to generate XML from relational data. And don't forget to check SQL Reference and XML Guide, and other online resources.
I wish you a very happy and prosperous new year 2009!
-Guogen (Gene) Zhang[Read More
Yes, you almost need nothing to set up before you try out pureXML in DB2 9 for z/OS if you are in DB2 9 NFM and have SPUFI for DB2 9. Here is how: download XMLQuickDemo I just posted in my previous blog entry. And log on to TSO and get to SPUFI, and copy the SQL statements from the download, and run from there.
Within the downloaded zip file, statements from the files with a name starting with "1" to "8" can be tried out in sequence from SPUFI without WLM/Java/XSR setup. "ADropTable.CLP" is used to clean up after yourself.
Chances are that you will run successfully. Possible problems you may encounter:
- Authorization problem: you need the privilege to create table and use bufferpool BP16K0 for XML column data.
- SPUFI abend, most likely you need the correct SPUFI version for V9.
- SQL error complaining z/OS XML system services, most likely you are using z/OS R1.7 that requires APAR OA16303 for XML system services.
- DB2 complains syntax on XPath, most likely you need to set Editor case-sensitive, also pay attention to statement termination character as semi-colon may be used within XPath so # is used.
- Also set output record length long enough to see the output. Otherwise there will be truncation of XML result. Check continue after warning to get all rows after truncation.
When you try your own XML documents and queries, especially with large XML documents or results, you may encounter sqlcode: -904 sqlstate: 57011sqlerr Message: UNSUCCESSFUL EXECUTION CAUSED BY AN UNAVAILABLE RESOURCE. REASON 00C900D1, TYPE OF RESOURCE 00000907, AND RESOURCE NAME... which means you need to increase LOBVALA zparm setting, default was 1024KB. DB2 uses LOB functionality for XML data bind-in and bind-out, but no LOB used for XML database storage.
DB2 9 XML column type and built-in functions on XML type do not require XML schemas. Therefore, if you don't use XML schemas, you don't need XSR, which requires Java stored procedure setup and WLM setup. Only when you need XML schemas for schema validation or annotated-schema decomposition, you need the additional setup.
Once you go through these, you can try to embed SQL/XML queries in your COBOL or PL/I applications. You can just use LOB or VARCHAR host variables to hold XML data.
Happy playing with pureXML!
The XMLTABLE() function is probably the most powerful function in manipulating XML data using SQL/XML. Some of its uses include, but not limited to, the following:
- to achieve XQuery capability using SQL/XML with XPath. See SQL/XML queries for XQuery usecases.
- to decompose a large document into smaller documents for storage.
- to generate relational view over stored XML data, to bridge the gap between XML storage and relational application software, and also provide reporting over XML data.
Since there are other articles giving detailed account of XMLTABLE function, for example, Matthias's two-part articles of "XMLTABLE By Examples" Part 1 and Part 2, I will just list the highlight on using the XMLTABLE function here:
- XMLTABLE and XMLCAST were delivered post V9 GA. Apply PK51571, PK51572, and PK51573 before trying the functions.
- In general, XMLTABLE() returns multiple rows of multiple columns. The column types can be non-XML SQL types and XML type.
- The column value of non-XML type has to contain at most one item only. Otherwise, an error sqlcode : -16003 sqlstate: 10507 will be returned. In this case, you need to lower the level of nodes the row expression identifies, and use "." (self) in the column path expression if necessary.
- Use XMLNAMESPACES to declare namespaces that are needed across the row path expression and column path expressions, like this: XMLTABLE(XMLNAMESPACES(), ...). The namespaces declared in the row path expression do not cover column path expressions (according to the SQL standard).
- Avoid using the parent axis ("..") due to its overhead. If you need to lower the level, try to cascade XMLTABLE functions. The first XMLTABLE() produces rows at a high level so that the parent axis can be avoided, and the next XMLTABLE function takes as the input the XML column output of the previous XMLTABLE, and generates other columns.
- If you use XMLTABLE to decompose a document into smaller documents, you need to use XMLDOCUMENT() constructor to wrap the pieces into a document to insert into an XML column.
- You can use XMLTABLE() to extract one row (a tuple) of multiple columns, in particular, multiple heterogeneous pieces from different portions of a document, to be used in constructing a new document.
- You can use XMLTABLE() to iterate through a sequence of homogeneous items that XMLQUERY was not able to separate them, each to be used in constructing a new document. XMLAGG() can be used to aggregate multiple parts back into one value.
- You can use XMLTABLE() to define a relational view so SQL queries can treat XML data as relational data, to bridge the gap between XML and relational. This can also achieve more than XQuery 1.0 can do for reporting and light analytics. For heavy analytics, you may need to materialize XMLTABLE result into tables. MQT support for XMLTABLE() is not available at this point.
- You can use XMLTABLE() and XMLPARSE() to consume Web services directly in SQL by using SOAPHTTPNV or SOAPHTTPNC UDFs in DB2 (more to come).
If you have any questions, don't hesitate to contact me or post questions at DB2 pureXML forum.
-Guogen (Gene) Zhang (GGZ)
Last Friday (October 17) was a long day for me as I returned to San Jose, California from Warsaw, Poland, after attending IDUG Europe 2008 Conference, held in Hilton Warsaw from October 13 to 16 (October 17 is for one-day seminars).
Here are some important impressions from my personal experiences at the conference.
- Motivating opening keynote presentation by Marc Woods, World Champion Paralympic Athlete. Inspiring, yet humorous stories on winning gold medals, especially the one for team melody swimming, emphasizing the importance of team work and leadership.
- Ever-increasing interests in pureXML.There were quite a few sessions about pureXML.Matthias presented two sessions about SQL/XML and best practices, and I presented one for DB2 for z/OS DBAs. I have some slight update on my presentation, available here:DB2 9 for z/OS pureXML - Survival Guide for DBAs.These three sessions were completely full, some had to be turned away. Next time, the planners should probably arrange larger rooms for XML sessions to accommodate more audience.
- We had one Special Interest Group (SIG) session on XML. Participants were Phil Grainger of CA, Jim Dee of BMC Software, Matthias Nicola of DB2 for LUW, and myself. Moderated by Hans Miseur. There were very interesting discussions.
- There were threads for application development that may be for z/OS or LUW or both. Not many attendees realized there was another session on the Java driver for XML, by Bilung (Brandon) Lee, in the application development thread. I hinted Brandon to include something like z/OS or XML in his title to attract more people.
- There was a presentation on MQTs on z/OS also. The room was full also.
- There were many other excellent sessions to attend, including Curt's closing keynote.
In one of Matthias' session, one of z/OS customers asked about pureXML samples. I have put IOD 2007 DB2 9 for z/OS XML Hands-on Lab material for download. This package contains the lab material used in IOD 2007 XML lab, including DDL, Java samples, query samples, examples for schema registration, validation, and decomposition. This material is for educational use only without any warrant, express or implied, as usual.
Some fun fringes follow.
The DB2 25th Anniversary Party on Wednesday October 15 night was held in the middle of nowhere. The bus drivers got lost, on very narrow country road, and the bus I was on took about 1.5 hours to get to the party place. It was amazing that the driver was able to turn the bus around in the dark on an extremely narrow road in the field. The Polish food was good. There was traffic congestion near 11pm on Warsaw streets on the way back to the hotel.
Jim Dee of BMC software was the moderator for my session. He made fun of me in his introduction - saying that I was a genius, and knew everything about XML in the world. That definitely helped my session evaluation! :-)
We were recommended by a US attendee (forgot his name) a restaurant near Hilton Hotel called The Inn under the Red Hog, which used to cater forcommunist dignitaries like Lenin, Castro, and Chairman Mao etc. Four of us (Chinese attendees) tried it out for Thursday dinner. We shared three main dishes: roasted pork ribs, a pork shank, and Mao's chicken (just like orange chicken). They were Polish dishes, heavy flavor (need beer to dilute), big portion, enough for us. Compared with Sekret in the old town, recommended by Hilton Hotel, the Red Hog seems to be better and less expensive. Reservation is recommended as we were rushed out by next reservation.
Looking ahead, next week, IOD Global Conference will be on at Las Vegas. We have many pureXML presentations and Hands-on Labs. If you'd like to try the hands-on labs, I urge you to come even if the enrollment was full. We will admit stand-byers after 5-10 minutes in case of no-shows.
In this first posting of the new year, I'm going to share something non-technical. We are looking forward to the uptake of the pureXML projects in many of our clients and ISVs. As for any new technology, pureXML adoption takes leadership and courage. The reward will be ample, especially in this unprecedented, tough global economy. With pureXML, you could support new products and services in reduced time to market, achieve more with the same resources, and improve the bottom-line of the businesses.
Having read the book "Primal Leadership - learning to lead with emotional intelligence" by Daniel Goleman, et al. before, I recently came across a series of three Harvard Business Review (HBR) articles that are pretty much short versions of the book. "Leadership that gets results" focuses on six leadership styles.
The resonant styles are:
- the visionary/authoritative styles,
- the coaching style,
- the affiliative style, and
- the democratic style.
And the dissonant styles are:
- the pacesetting style and
- the command/coercive style.
"The best leaders don’t know just one style of leadership — they’re skilled at several, and have the flexibility to switch between styles as the circumstances dictate."
"What makes a leader" focuses on the emotional intelligence that is necessary to be effective leaders. The four emotional intelligence domains/skills are:
- social awareness, and
- relationship management(empathy and motivation).
"Primal leadership - the hidden driver of great performance" focuses on a process of five steps to achieve greater emotional intelligence:
- What do you want to be?
- Who are you now?
- How do you get from here to there?
- How do you make change stick?
- Who can help you (in the above process)?
If you are not familiar with this research result, I highly recommend you search on the web to find some summary in more detail, or get the book or articles.
Another book that I read over ten years ago is also one of the best-sellers many know well: "The seven habits of highly effective people" by Stephen Covey. The seven habits are:
- be proactive;
- begin with the end in mind;
- put first things first;
- think win/win;
- seek first to understand, then to be understood;
- sharpen the saw.
"The key to success in business is action. But in most companies, people are rewarded for talking - and the longer, louder, and more confusingly, the better." (from HBR article "The smart-talk trap") Talking is probably better than deceiving and bad politics. And trustworthiness and integrity are essential in a leader's character. (I don't know why I'm writing these sentences.)
I wish you fulfill your new year resolution! And make pureXML one of your technology priorities this year.
- Guogen (Gene) Zhang (GGZ)[Read More
You may have heard and seen quite some value claims about pureXML in DB2 9, and a lot of details about the pureXML features. You may wonder how they connect in the real world. I will try to present the scenarios and how pureXML features can play an important role, abstracted from real-world experiences, at the teleconference to be held on April 21, 2009. Here is the invitation in PDF. I will be speaking and answering your questions live. The presentation focuses on the scenarios that pureXML can deliver the values and how you can use pureXML to:
- Accelerate application development,
- Achieve greater business agility, and
- Improve business insights.
In addition, System z has its well-known advantages of reliability, availability, scalability, security, and savings in space, power consumption, and people cost. DB2 9 for z/OS pureXML and other XML capabilities on System z make your applications less reliant on additional systems to handle XML, reducing system complexity and improving performance.
In today's extremely competitive environment, time-to-market may determine life or death of a business, or to a less extent, profit or loss. In that kind of situation, using feasible technologies to deliver a service quickly becomes critical. And pureXML is one element of this type of technologies.
Please invite your clients, colleagues, and friends to register for this teleconference: ibm.com/software/systemz/telecon/apr21. See you then.
Last week, December 8 and 9, 2008, I visited Kansas City and Topeka with the help of Jeff Mucher, a DB2 advisor from IBM at Dallas. On the 12/8 afternoon, I presented Introduction to pureXML in DB2 9 for z/OS to the Heart of America DB2 Regional User Group at a hotel in Overland Park, with an audience of over 30 people. In the middle, I also did a live demo using CLP (Command Line Processor), connecting to the DB2 9 on a DemoNet native machine at Austin. I have put the CLP scripts and their output capture called XMLQuickDemo online so you can download to try out yourself. The presentation was well received with quite some interesting questions and discussions. At least one friend from DST told me right after the presentation that the DB2 for z/OS XML features are very impressive. I was happy to hear that!
Here are a few words I've heard from customers describing XML in DB2 9 for z/OS or the developers (including me :-) ): quite impressed/very impressive, brilliant, genius, clever. Impressive, huh?!
On the 12/9, Jeff and I visited a DB2 client at Topeka, KS. We met with about a dozen developers, DBAs and System Programmers to introduce DB2 XML features and discuss their application scenarios. We saw DB2 XML well fit in three scenarios for their applications:
- XML for object persistence.
- XML for flexible change representations.
- XML for event log for auditing/regulatory compliance.
The common theme in these usage scenarios is flexibility, flexibility, and flexibility. We saw more and more of these XML application patterns.
Beth wanted me to confirm if her description of XML is correct: it's more like a delivery truck, it could be UPS, FedEX, or any other truck, we don't care. We handle it based on its content. YES!
This trip had special meaning for me. I was attending KU for 2 years over 14 years ago, and this was the first time I got back to Kansas after so many years. And I was able to meet with one of my classmates not seen for over 14 years, and also one of our former interns, now a CS faculty member at KU. Also I seldom experienced snowy weather these years, and it was snowing on 12/9! What a colorful trip!
I'd like to describe one performance result at our lab of a workload that simulates an auditing application. It uses one XML column to store all kind of events in small XML documents. There are 210 XML indexes created on the XML column. For each XML document, there are about 10 indexes that will have keys generated, and the rest does not have hit. This large number of indexes enable efficient diverse queries on the event log.
The result is that the overhead of 210 XML indexes caused only 40% degradation compared with only 10 XML indexes that generates XML keys always. This is a pretty good result! We have an APAR PK75613 (overriding PK66218) to improve XML index keygen performance. Applying this will help reduce XML value index overhead in general.
By the way, if you'd like to know what APARs to apply for XML features in DB2 9 for z/OS, look into the info APAR II14426.
(Updated on April 22, 2009 PST)
I'd like to share with you the following noteworthy courses, articles, and news related to DB2 pureXML:
No matter how familiar one is with SQL, DB2, and relational data model, the first time getting exposed to XML and XPath, she would likely feel overwhelmed. Getting into the XML world is not a small step. But here I'd like to demystify XPath for SQL people who is just getting started with XML and XPath. I will give you a very gentle introduction to XPath and how you can use it to query XML data in SQL/XML.
It is actually very easy to start: we need to understand thatXML data model is a natural but significant extension to the relational model, it features two very powerful structural capabilities - nesting and repeating. First, it allows nesting, to any level (DB2 allows 128 levels of nesting maximum). Nesting is very common in data structures. For example, you want to separate a name into first name and last name, then you have two levels. You can view a table to have three levels: table level, row level, and column level. When you reference a column with tablename.columnname, you actually use tablename to reference each row in the table, then use the column name to get to a column in the row. In most programming languages, you use dot (.) to separate the field names at each level, such as name.firstname. You can have an XML document as simple as name with two sub-elements - firstname and lastname in an XML column like this:
<name> <firstname>Guogen</firstname> <lastname>Zhang</lastname></name>
Now how can you get the firstname and lastname in the document? XPath is the only solution in SQL: You can use /name/firstname to get the firstname, and /name/lastname to get the lastname. So what's the difference of XPath /name/firstname from name.firstname? You start with a slash (/), and replace the dot with slash. Each slash gets you to the next level. You can use that to get any field or structure in XML data.
The second feature is repeating, like an array, but very flexible. Take a simple example, you can list multiple phone numbers in one XML column, and also associate an attribute for each phone number element to tell it's a wired phone or cell phone, or work phone or home phone:
<phones name="Guogen Zhang"> <!-- fake numbers --> <phonenumber type="home">408-555-1234</phonenumber> <phonenumber type="cell">408-555-2345</phonenumber> <phonenumber type="work">408-463-2012</phonenumber></phones>
To get a specific phone number, you can use array index notation:/phones/phonenumber.  is a short-hand for [fn:position()=2]. You can generalize this predicate to other search condition: /phones/phonenumber[@type ="cell"]. This will get the second phone number also. Here @type is to refer to attribute "type". Attributes are those things in the start tag of an element (between < and >). And you use quotes around "cell" to mean a string comparison. In contrast, for a numeric comparison you would use numeric values without quotes. For example, size = 5.
Two more commonly seen "operators" are "//" and ".." (abbreviated axes). "//" is to look at any level under a certain level (roughly descendant-or-self). For example, //phonenubmer is looking for phonenumber in the entire document. ".." is going back up one level (parent axis). Now you should know why it's called XPath, as it's like file paths in a hierarchical file system.
Now let's look at where XPath is used in SQL/XML. XPath is used in the XMLEXISTS predicate, and the XMLQUERY and XMLTABLE functions. For example, the following query retrieves the quantities of "Baby Monitor" items from purchase order documents:
SELECT XMLQUERY('declare namespace ipo="http://www.example.com/IPO";/ipo:purchaseOrder/items/item[productName = "Baby Monitor"]/quantity' PASSING XMLPO)FROM PURCHASEORDERSWHERE XMLEXISTS('declare namespace ipo="http://www.example.com/IPO";/ipo:purchaseOrder/items/item[productName = "Baby Monitor"]' PASSING XMLPO)#
In the above, we assume we have a table PURCHASEORDERS containing XMLPO column of type XML. In XMLQUERY and XMLEXISTS, PASSING is the keyword to pass the XML column from the SQL world into the XML (XPath/XQuery) world. The XMLEXISTS in the above example is to search for purchaseOder with an item whose productName is "Baby Monitor", while XMLQUERY is a scalar function to extract quantity for the baby monitor items. XMLEXISTS is the only main predicate on XML data in the SQL world (the other predicate on XML is IS [NOT] NULL). You cannot use other SQL comparisons on an XML value, but you get many comparisons within the XML (XPath) world. Also in the above query example, I illustrated the use of XML namespaces. To see purchase order document examples and more query examples, see the XMLQuickDemo I referenced in the previous two blog entries.
Isn't XPath easy to start with? For more details and tutorials on XPath, you can Google "XPath tutorial" to get very good web sites on the top of the list, such as w3schools.com and zvon.com. Try some examples, you will soon become comfortable with XML and XPath! (well - you can pick up the terminology gradually.)
Last week, just before the Thanksgiving holiday in the US, Cindy Saracco from our team, along with Tad Worley, one of our Infosphere Warehouse experts, published a new paper on developerWorks titled: "Create business reports for XML data with Cognos 8 BI and DB2 pureXML: Two techniques to help you get started." We have been getting more and more questions about using Cognos with DB2 pureXML and thought it would be good to get some information out there on the subject. Because XML messages frequently contain important business data, companies are increasingly interested in querying and reporting on this data. The paper takes you through two different methods, step by step, showing examples along the way. Cindy is one of the best technical writers that I know of (inside and outside IBM), so if this topic is of interest, be sure to check out the article.
And in January, we will have a related article coming out titled "Reporting on pureXML data with QMF/DataQuant" so look for that as well.
Kate Riley Tennant
XML is a unique data type in SQL. Understanding its rules is important in getting it right in applications. Luckily, many of the rules are straight-forward. I will cover the following three aspects here:
- XML comparison rules;
- XML assignment rules;
- XML data encoding.
At the end, you will see the application interface for XML is really simple, except for truncation errors. You can use existing string host variable types for XML data or use the new XML host variables. The best practice is to use binary host variable types (such as VARBINARY, BLOB, BLOB_FILE, XML AS BLOB, or XML AS BLOB_FILE) and UTF-8 encoding for XML data in your embedded applications. Similar for Java interface
XML comparison rules
The SQL XML type cannot be compared with any other type, including the XML type itself, using SQL comparison operators. XML can only be tested using one of the following two predicates:
- xml-expr IS NULL, or xml-expr IS NOT NULL
- theXMLEXISTS predicate, you can use XPath operators within XMLEXISTS.
If you need to compare an XML atomic value extracted from an XML document using XPath with an SQL value, you need to use XMLCAST to cast to an SQL type. XMLCAST can be used to cast a value from XML to SQL, or from SQL to XML. Casting from XML to SQL, such as XMLCAST(XMLQUERY('/purchaseOrder/@PONum' PASSING X) as INTEGER), is its main usage, as casting from SQL to XML is done by DB2 implicitly in most cases, such as in parameter passing of XMLQUERY, XMLEXISTS, and XMLTABLE.
XML assignment rules
In DB2 9, XML is not compatible with any other type for assignment. For convenience, an XML column can be assigned a string by implicit XMLPARSE. In addition, the assignment rules for bind-in and bind-out of XML data are quite flexible. XML assignment rules should be considered with two directions: storage assignment and retrieval assignment, similar to string assignment. Here is a summary of the XML assignment rules:
- XML host variables are supported in many host languages (seeapplication programming language support for XML for details). The XML host variables can only be assigned to XML columns and only accept XML values from SQL. That is, XML host variable usage is very restrictive.
- Any host variable of string types (character or binary: CHAR, VARCHAR, GRAPHIC, VARBINARY, BLOB, etc.) can be assigned to XML columns, by implicit XMLPARSE. For example, INSERT INTO MYTABLE VALUES (1, :VARCHARhv), where the second column is of XML type.
- Any host variable of string types (character or binary, except LOB locators) can accept XML values, by implicit XMLSERIALIZE. For example, SELECT XMLELEMENT(NAME "test", 'a') INTO :VARBINhv FROM SYSIBM.SYSDUMMYU;
In other words, you can hold XML values using non-XML string host variables, including file reference variables. Therefore, you can get XML data in and out without using XML host variables in languages like COBOL, PL/I, C/C++, etc. For example, you can simply fetch a SELECT result containing an XML value into a VARCHAR host variable. You can use regular Java interface for XML data also.
One thing that XML is different from other string types is that truncation is serious so it will result in an error (SQLCODE -433), instead of a warning. To avoid truncation, use a large enough buffer, or use FETCH WITH CONTINUE.
You would need the explicit conversion, i.e. XMLPARSE(), from string to XML in cases where the XML data held in a non-XML host var from an application is not directly into an XML column, or the XML data are in a non-XML column. Here are some simple examples for illustration:
INSERT INTO MYTABLE (C1, C2)SELECT XT.*FROM XMLTABLE('row-expr' PASSING XMLPARSE(DOCUMENT CAST(? AS BLOB) ) COLUMNS C1 VARCHAR(10) PATH 'column-expr1', C2 XML PATH 'column-expr2') XT;
UPDATE MYTABLE SET XMLC = XMLPARSE(DOCUMENT VC1);
-- this one shows the value of XML host varINSERT INTO MYTABLE (C1, C2)SELECT XT.*FROM XMLTABLE('row-expr' PASSING :xmlhv COLUMNS C1 VARCHAR(10) PATH 'column-expr1', C2 XML PATH 'column-expr2') XT;
On the other hand, you would need conversion, explicit XMLSERIALIZE(), from XML to string (a LOB) in cases where an XML value is not to bind out to a host var. For example, to insert XML data into a BLOB column:
INSERT INTO MYTABLE2SELECT C1, XMLSERIALIZE(XMLC as BLOB(512K))FROM MYTABLE;
XML data encoding
There are two ways for DB2 to recognize XML data encoding:
- External encoding: if you hold XML data in a character string host var or column, the CCSID associated with the host var or column determines encoding of XML data. DB2 ignores encoding information contained in the XML data if it exists.
- Internal encoding: if you hold XML data in a binary string host var or column, there is no CCSID, and DB2 determines encoding based on the information (BOM - Byte Order Marker, or encoding declaration) contained within the XML data. By default, the encoding is UTF-8.
DB2 uses UTF-8 encoding to process XML data internally, and stores character data in UTF-8. It will convert other encodings into UTF-8 for processing. To avoid any encoding conversion overhead, try to use UTF-8 for XML data whenever possible. For more XML encoding details, see XML data encoding.
Again, if you have any questions, please post to the DB2 pureXML Forum, or contact us at db2zxml at us.ibm.com.
-Guogen (Gene) Zhang (GGZ)[Read More