You've been able to store XML data in DB2 for quite some time. Of course, you can store the object as a CLOB and, with the XML extender; you can shred the document into relational tables that let you efficiently access subcomponents of an XML document with a query. But each of these methods has a disadvantage. Shredded documents can result in loss of document fidelity and make it difficult to change the XML schema. The biggest benefit of XML is that the schema is completely flexible, making it rigid by shredding it into relations is counterproductive. With CLOB you can keep the flexibility but every time you want to read components of the XML you need to parse the CLOB at runtime, so performance is poor.
DB2 9 introduces a completely new XML storage engine where XML data is stored hierarchically. XML is hierarchical in nature, so storing the XML hierarchically in the engine preserves fidelity, allows for flexible schema, and also delivers high-performance sub-document access. This new hierarchical storage engine sits inside the same DB2 data server as the relational engine, so now you can store customer information alongside their XML purchase orders and search all of the information efficiently.
DB2 10.1 continues adding more cool features to pureXML. New XML indices now more closely match your data, functional indices can speed up searches and queries, binary XML format enables faster data transmission, handling of casting errors is now aligned with SQL, performance improved for certain XML queries, XML type is now allowed in global variables and also in compiled SQL functions.
XML is stored inside DB2 in a hierarchical format. XML itself is hierarchical starting from the root tag (or node) and then traversing through the XML string or document. In DB2 the XML is stored inside of data pages in this hierarchical structure. If the XML data is larger than a single data page, the XML tree is broken up into subtrees with each subtree stored on a data page and the pages linked together.
To create a table with XML data simply run the command
create table table_name (col1 data_type, ..., xml_col_name XML).
This allows you to create the table with your relational columns as you would today, and for your XML information, you just assign the column a data type of XML. Now you can store the XML data in that column.
Creating an index is similar to creating a typical index on relational data with the exception that you are not indexing a column but rather a component of the XML schema defined in the above xml_column_name column. The syntax would look something like the code below:
create index index_name on table_name (xml_column_name) generate key using xmlpattern '/po/purchaser/@pname' as sql varchar(50)
Starting with DB2 10.1, you can create functional XML indices using the
functions. Indices created using
speed up case-insensitive searches of XML data. Indices created using
fn:exists can speed up queries that search for specific elements or for the lack of
specific elements. For instance, in our previous example, if you need to speed
up case-insensitive searches over the
attribute, you'd use
fn:upper-case in your index as
create index index_name on table_name (xml_column_name) generate key using xmlpattern '/po/purchaser/@pname/fn:upper-case(.)' as sql varchar(50)
You can now create indices of type DECIMAL and INTEGER over XML data. In situations where your numeric data is of either INTEGER or DECIMAL type, indices created as DECIMAL and INTEGER values can potentially provide faster query response times. In previous releases, DOUBLE was the only supported numeric type for XML indexes. Consider the following example using an index of type integer.
CREATE INDEX intidx on favorite_cds(cdinfo) GENERATE KEYS USING XMLPATTERN '/favoritecds/cd/year' AS SQL INTEGER
The following SQL data types are supported: VARCHAR, DATE, TIMESTAMP, INTEGER, DECIMAL, and DOUBLE.
The new binary XML format provides a faster way to transmit and receive XML data between certain Java™ pureXML applications and a DB2 server Version 10.1. For these Java applications, unnecessary XML parsing costs are eliminated, therefore improving performance.
Binary XML data refers to data that is in the Extensible Dynamic Binary XML DB2
Binary XML Format (XDBX). Only Version 4.9 or later releases
of the IBM Data Server Driver for JDBC and SQLJ can send or retrieve XML data
to/from the data server in XDBX format. You have the full control over
the XML format you use for transmission using the
property either on the data source level or the connection properties.
Binary XML format is most efficient for cases in which the input or output data is in a non-textual representation, such as SAX, StAX, or DOM. For example, these methods retrieve XML data in non-textual representations.