Since the DB2 9 for z/OS beta in early 2006, customers have frequently asked me about the value of pureXML in DB2. Although more and more people now recognize the value of XML support in DB2, I'd like to touch on the topic here.
XML is ubiquitous and is used to represent all kinds of data in industry, government, and academia. If you are an application developer in an enterprise environment, dealing with rapidly growing XML data assets and ever-changing business demands, and you are dissatisfied with the limitations of traditional relational databases, DB2 9 for z/OS provides XML management capability alongside relational data management. Unlike XML-enabled databases and other database offerings that rely on Large Object storage or transform XML into relational data, the pureXML technology is specifically designed and optimized for hierarchically structured XML data. It provides the proven enterprise-class reliability, availability, scalability, performance, and security for XML data that you have come to expect from DB2 for z/OS.
Because the XML data model is hierarchical, like that of IMS, people quite often ask what the differences between XML and IMS are. I think there are two: XML data is very flexible, while IMS data structures are rigid; and XML has high-level query languages, while IMS has procedural DL/I as its native language (although our IMS friends have provided SQL and XQuery interfaces). SQL/XML with XPath, provided in DB2 9 for z/OS, makes it much easier for applications to process XML.
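To make the procedural-versus-declarative point concrete, here is a minimal sketch. It is not DB2 or DL/I code: it uses Python's standard xml.etree.ElementTree module as a stand-in, and the customer document, element names, and function names are made up for illustration. The first function walks the hierarchy step by step; the second states the same request as a single path expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this illustration.
DOC = """
<customer id="42">
  <name>Ada</name>
  <order status="open"><item sku="A1"/><item sku="B2"/></order>
  <order status="closed"><item sku="C3"/></order>
</customer>
"""

root = ET.fromstring(DOC)


def open_skus_procedural(node):
    """Navigate the hierarchy by hand, one level at a time."""
    skus = []
    for child in node:
        if child.tag == "order" and child.get("status") == "open":
            for item in child:
                if item.tag == "item":
                    skus.append(item.get("sku"))
    return skus


def open_skus_declarative(node):
    """Describe what is wanted with one path expression."""
    path = "./order[@status='open']/item"
    return [item.get("sku") for item in node.findall(path)]


print(open_skus_procedural(root))   # ['A1', 'B2']
print(open_skus_declarative(root))  # ['A1', 'B2']
```

Both functions return the same result. The difference is that the path expression keeps working as a single line when the question changes, while the hand-written navigation has to be rewritten loop by loop.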
We've seen the following XML usage scenarios in DB2. More and more interesting application scenarios keep popping up, so this list won't be complete. Let us know if you have interesting scenarios to share.
- The first and most direct case is processing XML data, including industry-standard XML formats (such as FIXML, FpML, ACORD, UNIFI, and MISMO) and forms and reports (such as XBRL). You can store and retrieve it just like relational data. If you use COBOL or PL/I, you can have your applications invoke SQL to process XML data, which makes up for the weak XML processing capability of these languages. If you need to connect to existing back-end systems, you can use the XMLTABLE function to present XML data as relational views, while developing new applications directly on the XML data.
- Second, pureXML can help you develop applications that handle versatile, frequently changing schemas, as well as end-user customizable applications. This is particularly important for ISVs and for IT service departments in large enterprises, where you frequently need to adjust applications so that end users can work with diverse information, such as product specifications and customer information.
- The third application scenario is processing sparse attribute values, as in medical records or forms, where there are many fields overall but only a few of them apply in each case. With a relational approach, you would need a very wide table with a great many columns, most of which are null. Instead, you can use one XML column, because XML can hold as many items as you want in a single document.
- The next usage is object persistence. If you persist application objects in a relational database, you usually normalize them into multiple tables. It's like disassembling your car when you park it in the garage overnight: certainly not convenient. With XML it's much more flexible, because a single column can hold hierarchical data. More importantly, you can create indexes on these persistent objects, which you have no way to do with LOBs. That makes XML much more efficient for persistent data when you need to search. Have you ever wished you could store an array in a single column and still be able to search it? Now you can, using XML. In an extreme case, you could design all your tables with just a primary key and a single XML column.
- Yet another application scenario is migrating from legacy data models, such as the network or hierarchical data model. If you migrate a hierarchical data model to a relational model, you need to introduce artificial keys. With XML you don't: the mapping is straightforward, and you get the benefit of a high-level declarative query language.
- The next application scenario is generating web pages: because web pages can be written in XHTML, they can be generated directly from an SQL/XML statement.
- Last but not least, because web services use XML data, DB2 applications can provide or consume web services directly. XML support in DB2 enables end-to-end XML solutions in an SOA environment.
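Two of the scenarios above, sparse attributes and exposing XML as relational rows for existing back ends, can be sketched in a few lines. This is only an illustration of the idea, not DB2 code: it uses Python's standard xml.etree.ElementTree module, an in-memory dictionary standing in for a table with a key and one XML column, and made-up record contents. The shred function is a hypothetical helper that loosely mimics what an XMLTABLE-style mapping does, turning a path per column into one row per document.

```python
import xml.etree.ElementTree as ET

# Hypothetical "table": a key plus one XML document per row.
# Each record carries only the fields that apply to it.
RECORDS = {
    1: "<record><patient>P-001</patient><allergy>penicillin</allergy></record>",
    2: "<record><patient>P-002</patient><bloodType>O+</bloodType>"
       "<implant>pacemaker</implant></record>",
    3: "<record><patient>P-003</patient></record>",
}


def shred(records, columns):
    """Map each document to a flat row; absent fields become None."""
    rows = []
    for key, xml_text in records.items():
        doc = ET.fromstring(xml_text)
        row = {"id": key}
        for name, path in columns.items():
            row[name] = doc.findtext(path)  # None when the path is absent
        rows.append(row)
    return rows


def find_ids(records, path, value):
    """Return keys of documents whose element at `path` equals `value`."""
    return [
        key
        for key, xml_text in records.items()
        if ET.fromstring(xml_text).findtext(path) == value
    ]


rows = shred(RECORDS, {"patient": "patient", "allergy": "allergy"})
for row in rows:
    print(row)

print(find_ids(RECORDS, "implant", "pacemaker"))  # [2]
```

No record needs a placeholder for fields it does not use, yet any field can still be projected into a relational-looking row or searched. Note that this sketch parses every document on each search; in a database that is the work an index on the XML data would avoid.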
In summary, DB2 pureXML makes XML data consumable, and provides the following business value:
- First, it accelerates application development, reduces system complexity, and improves developer productivity. This leads to faster time-to-market and a smaller IT backlog.
- Second, it increases business agility. It helps you develop end-user customizable applications, accommodate changes to data and schemas easily, update applications rapidly, and reduce maintenance costs.
- Third, it can improve business insight by helping you develop applications that access information in otherwise unexploited documents, including applications for business intelligence and business monitoring.
- Last, you no longer need to store XML in a separate system. You can consolidate system resources onto System z to reduce floor space and lower energy consumption and staffing costs. You can also use specialty engines for XML processing at low CPU cost, increase the security of critical XML data, and simplify regulatory compliance.
If you have projects that you feel will benefit from pureXML, I urge you to consider starting with the pureXML PoT I mentioned last week.
The views expressed here are mine and do not necessarily reflect IBM's official position. Contact me at gzhang at us.ibm.com if you'd like a private conversation on the topic.