XML support in DB2 utilities
You can use IBM® DB2® for z/OS® utilities on XML objects. The utilities handle XML objects similar to the way that they handle LOB objects. For some utilities, you need to specify certain XML keywords.
- CHECK DATA
In addition to checking LOB relationships, the CHECK DATA utility also checks XML relationships.
CHECK DATA can check the consistency between a base table space and the corresponding XML table spaces.
In addition, if you specify the INCLUDE XML TABLESPACES option, CHECK DATA can check the structural integrity of XML documents. CHECK DATA can verify the following items for XML objects:
- All rows in an XML column exist in the XML table space.
- All documents in the XML table space are structurally valid.
- Each index entry in the node ID index has a corresponding XML document.
- Each XML document in the XML table space has corresponding entries in the node ID index.
- Each entry in the document ID column in the base table space has a corresponding entry in the node ID index over the XML table space.
- Each entry in the node ID index contains a corresponding value in the document ID column.
- Each value in the document ID column has a corresponding entry in the document ID index.
- Each entry in the document ID index has a corresponding value in the document ID column.
- If an XML column has an XML type modifier, all XML documents in the column are valid with respect to at least one XML schema that is associated with the XML type modifier.
If the base table space is not consistent with any related XML table spaces, or a problem is found during any of the previously listed checks, CHECK DATA reports the error.
For XML checking, the default behavior of CHECK DATA is to check only the consistency between each XML column and its node ID index. However, you can modify the scope of checking by specifying combinations of the CHECK DATA SCOPE keyword and the INCLUDE XML TABLESPACES keyword. The following table lists keyword combinations and the types of checks that are performed.Table 1. CHECK DATA SCOPE and INCLUDE XML TABLESPACES keywords that control the scope of XML checking Scope of XML checking SCOPE keyword INCLUDE XML TABLESPACES keyword Consistency of XML base table column and node ID index only SCOPE AUXONLY, SCOPE ALL, or SCOPE PENDING Not specified All XML checking SCOPE ALL or SCOPE PENDING INCLUDE XML TABLESPACES All XML checking except XML schema validation SCOPE ALL or SCOPE PENDING INCLUDE XML TABLESPACES XMLSCHEMA XML schema validation only SCOPE XMLSCHEMAONLY INCLUDE XML TABLESPACES For example, table space DSNXDX1A.DSNXSX1D contains a table named XMLTBL with XML column XMLCOL, which has an XML type modifier. If you specify the following statement, CHECK DATA checks LOB relationships, the base table space, XML relationships, and the structural integrity of XML documents for column XMLCOL, and does XML schema validation on the documents for column XMLCOL:
If you specify the following statement, CHECK DATA checks LOB relationships, the base table space, XML relationships, and the structural integrity of XML documents for column XMLCOL, but does not do XML schema validation on the documents for column XMLCOL:CHECK DATA TABLESPACE DSNXDX1A.DSNXSX1D INCLUDE XML TABLESPACES(TABLE XMLTBL XMLCOLUMN XMLCOL)
CHECK DATA TABLESPACE DSNXDX1A.DSNXSX1D INCLUDE XML TABLESPACES(TABLE XMLTBL XMLCOLUMN XMLCOL) XMLSCHEMA
You can also specify the action that DB2 performs when it finds an error in one of these columns by specifying one of the following appropriate keywords:
Table 2. CHECK DATA error keywords Column in error Action that CHECK DATA takes Keyword XML column Report the error only XMLERROR REPORT Report the error, set the column in error to an invalid status, and delete the invalid documents in the XML table space XMLERROR INVALIDATE LOB column Report the error only LOBERROR REPORT Report the error and set the column in error to an invalid status LOBERROR INVALIDATE XML or LOB column Report the error only AUXERROR REPORT Report the error and set the column in error to an invalid status AUXERROR INVALIDATE For example, the following statement specifies that CHECK DATA is to check XML and LOB relationships. DB2 is to report any LOB errors and XML errors, and set any XML columns in error to an invalid status.CHECK DATA TABLESPACE DSNXDX1A.DSNXSX1D SCOPE AUXONLY LOBERROR REPORT XMLERROR INVALIDATE
- CHECK INDEX
You can use the CHECK INDEX utility to check XML indexes, document ID indexes, and node ID indexes. You do not need to specify any additional keywords.
- COPY
You can use the COPY utility to copy XML objects. You do not need to specify any additional keywords. When you specify that DB2 is to copy a table space with XML columns, DB2 does not automatically copy any related XML table spaces or indexes. You must explicitly specify the XML objects that you want to copy.
For example, the following statement specifies that DB2 is to copy base table space DB1.BASETS1 and the XML table space DB1.XBAS0001.//COPYX EXEC DSNUPROC,SYSTEM=DSN //SYSIN DD * TEMPLATE A DSN(&DB..&SP..COPY1) UNIT CART STACK YES COPY TABLESPACE DB1.BASETS1 COPYDDN(A) TABLESPACE DB1.XBAS0001 COPYDDN(A)
- COPYTOCOPY
You can use the COPYTOCOPY utility to copy existing copies of the XML objects. You do not need to specify any additional keywords.
- EXEC SQL
You cannot declare a cursor that includes XML data. Thus, you cannot use the DB2 UDB family cross-loader function to transfer data in XML columns. However, you can declare a cursor on a table with XML columns if the cursor does not include any XML columns.
For example, suppose that you create the following table with an XML column:
You cannot declare the following cursor, because it includes XML data in the PURCHASE_ORDER column:CREATE TABLE ORDERS (ORDERNO INTEGER, PURCHASE_ORDER XML);
However, you can declare a cursor that includes non-XML columns, as in the following example:EXEC SQL DECLARE C1 CURSOR FOR SELECT * FROM ORDERS ENDEXEC
EXEC SQL DECLARE C2 CURSOR FOR SELECT ORDERNO FROM ORDERS ENDEXEC
- LISTDEF
- When you create object lists with the LISTDEF utility, you can specify whether you want related XML objects to be included or excluded. Use the following keywords to indicate the objects to include or exclude:
- ALL
- Base, LOB, and XML objects (This keyword is the default.)
- BASE
- Base objects only
- LOB
- LOB objects only.
- XML
- XML objects only.
For example, the LISTDEF statements in the following table generate the indicated lists.Table 3. Example LISTDEF statements LISTDEF statement Objects that are included in the list LISTDEF LISTALL INCLUDE TABLESPACES DATABASE ACCOUNTS INCLUDE INDEXSPACES DATABASE ACCOUNTS
- All tables spaces in the ACCOUNTS database, including XML and LOB table spaces
- All index spaces in the ACCOUNTS database
LISTDEF LISTXML INCLUDE TABLESPACES DATABASE ACCOUNTS XML INCLUDE INDEXSPACES DATABASE ACCOUNTS XML
- All XML table spaces in the ACCOUNTS database
- All XML index spaces in the ACCOUNTS database
LISTDEF LIST INCLUDE TABLESPACES DATABASE ACCOUNTS ALL INCLUDE INDEXSPACES DATABASE ACCOUNTS ALL EXCLUDE INDEXSPACES DATABASE ACCOUNTS XML
- All tables spaces in the ACCOUNTS database, including XML and LOB table spaces
- All index spaces the ACCOUNTS database except for XML index spaces
- LOAD
You can use the LOAD utility to load XML data.
The input data can be in the textual XML format or the binary XML format (Extensible Dynamic Binary XML DB2 Client/Server Binary XML Format). Binary XML input data must be in the non-delimited format.
If you load data into an XML column that has an XML type modifier, the LOAD utility validates the input data according to the XML schema that is specified in the XML type modifier. If LOAD detects an XML schema violation for a row, it deletes the row and issues an error message.
The steps for loading XML data are similar to the steps for loading other types of data, except that you need to also perform the following actions:- In the input data set:
- If the data set is in delimited format, ensure that the XML input fields follow the standard LOAD utility delimited format.
- If the data set is not in delimited format, specify the XML input fields similar to the way that you specify VARCHAR input. Specify the length of the field in a 2-byte binary field that precedes the data.
- In the LOAD statement:
- Specify the keyword XML for all input fields of type XML.
- If you want the whitespace to be preserved in the XML data, also
specify the keywords PRESERVE WHITESPACE. By default, LOAD strips
the whitespace.
When data in the binary XML format is loaded into a table, and PRESERVE WHITESPACE is not specified, DB2 strips whitespace only when the input data contains whitespace tags.
For example, the following LOAD statement specifies that DB2 is to load data from the MYSYSREC data set into the PRODUCTS table:
Assume that the MYSYSREC data set logically contains the following data: (This example is a logical representation and is not intended to show how the data actually looks in the input data set.)LOAD DATA INDDN(MYSYSREC) INTO TABLE PRODUCTS (CATEGORY POSITION (1) CHAR(8), PURCHASE_ORDER POSITION (10) XML PRESERVE WHITESPACE)
After loading this data, the PRODUCTS table contains the following information:Shovel 339<product pid="100-100-01" xmlns="http://podemo.org"> <description> <name>Snow Shovel, Basic 22"</name> <details>Basic Snow Shovel, 22" wide, straight handle with D-Grip<details> <price>9.99</price> <weight>1 kg<weight> </description> <product> Shovel 358<product pid="100-101-01" xmlns="http://podemo.org"> <description> <name>Snow Shovel, Deluxe 24"</name> <details>A Deluxe Snow Shovel, 24 inches wide, ergonomic curved handle with D-Grip<details> <price>19.99</price> <weight>2 kg<weight> </description> <product>
Table 4. PRODUCTS table DB2_GENERATED _document ID_FOR_XML1 CATEGORY PURCHASE_ORDER 1 Shovel <product pid="100-100-01" xmlns="http://podemo.org">
<description>
<name>Snow Shovel, Basic 22"</name>
<details>Basic Snow Shovel, 22" wide, straight handle
with D-Grip</details>
<price>9.99</price>
<weight>1 kg<weight>
</description>
<product>2 Shovel <product pid="100-101-01" xmlns="http://podemo.org">
<description>
<name>Snow Shovel, Deluxe 24"</name>
<details>A Deluxe Snow Shovel, 24 inches wide,
ergonomic curved handle with D-Grip</details>
<price>19.99</price>
<weight>2 kg<weight>
</description>
<product>Note:Loading XML data with the LOAD utility has the following restrictions:- DB2 automatically generates the document ID column for each row that is loaded into a table with at least one XML column. The document ID column is partially hidden. It is not included in the result set of a SELECT * statement. However, you can query this column by name and view information about this column and its index in the catalog. Several utilities report information on this column in their output.
- You cannot specify that XML input fields be loaded into non-XML columns, such as CHAR or VARCHAR columns.
- DB2 does not perform any specified compression until the next time that you run the REORG utility on this data.
- DB2 ignores any specified FREEPAGE and PCTFREE values until the next time that you run the REORG utility on this data.
- If you specify PREFORMAT, DB2 preformats the base table space, but not the XML table space.
- You cannot directly load the document ID column of the base table space.
- You cannot specify a default value for an XML column.
- You cannot load XML values that are greater than 32 KB. To load such values, use file reference variables in the LOAD utility, or use applications with SQL XML AS file reference variables.
- In the input data set:
- QUIESCE
When you specify QUIESCE TABLESPACESET, the table space set includes related XML objects. You do not have to specify any additional keywords in the QUIESCE statement.
- REBUILD INDEX
You can use the REBUILD INDEX utility to rebuild XML indexes, document ID indexes, and node ID indexes. You do not need to specify any additional keywords in the REBUILD INDEX statement. REBUILD INDEX with SHRLEVEL CHANGE is not valid for XML indexes.
- RECOVER
You can use the RECOVER utility to recover XML objects. You do not need to specify any additional keywords in the RECOVER statement.
When you recover an XML index or table space to a point in time, you should recover all related objects to the same point in time. Related objects include XML objects, LOB objects, and referentially related objects. If you do not recover all related objects to the same point in time, one or more objects might be placed in a restrictive state.
- REORG INDEX
You can use the REORG utility to reorganize XML indexes. When you specify that you want XML indexes to be reorganized, you must also specify the WORKDDN keyword and provide the specified temporary work file. The default is SYSUT1.
- REORG TABLESPACE
You can use the REORG TABLESPACE utility to reorganize XML objects. You do not need to specify any additional keywords in the REORG statement.
When you specify the name of the base table space in the REORG statement, DB2 reorganizes only that table space and not any related XML objects. If you want DB2 to reorganize the XML objects, you must specify those object names.
When you run REORG on an XML table space that supports XML versions, REORG discards rows for versions of an XML document that are no longer needed.
For XML table spaces and base table spaces with XML columns, you cannot specify the following options in the REORG statement:- DISCARD
- REBALANCE
- UNLOAD EXTERNAL
- REPAIR
You can use the REPAIR utility on XML objects.
You can use the REPAIR utility to:- Set the status of an XML column to invalid.
- Delete a corrupted XML document and its node ID index entries.
The most common use for the REPAIR utility for XML objects is to take corrective action after you run CHECK DATA with SHRLEVEL CHANGE on a table space with XML columns. CHECK DATA with SHRLEVEL CHANGE operates on shadow data sets, so it does not modify XML columns or XML table spaces. Instead, CHECK DATA generates REPAIR statements that you can run to delete invalid XML documents and to mark the corresponding XML columns as invalid.
- REPORT
- When you specify REPORT TABLESPACESET, the output report includes XML objects in the list of members in the table space set. The following sample output shows a table space set for a table that contains a LOB column and an XML column:
TABLESPACE SET REPORT: TABLESPACE : DBDKCX.TS0001 TABLE : SYSADM.DKCTEST INDEXSPACE : DBDKCX.IRdocument IDD INDEX : SYSADM.I_document IDDKCTEST XML TABLESPACE SET REPORT: TABLESPACE : DBDKCX.TS0001 BASE TABLE : SYSADM.DKCTEST COLUMN : COL2 XML TABLESPACE : DBDKCX.XDKC0000 XML TABLE : SYSADM.XDKCTEST XML NODEID INDEXSPACE: DBDKCX.IRNODEID XML NODEID INDEX : SYSADM.I_NODEIDXDKCTEST XML INDEXSPACE : DBDKCX.VALUES XML INDEX : SYSADM.VALUES
- RUNSTATS
You can use the RUNSTATS utility to gather statistics for XML objects.
RUNSTATS ignores the following keywords for XML tables and XML indexes:- COLGROUP
- FREQVAL MOST|LEAST|BOTH
- HISTOGRAM
RUNSTATS INDEX ignores the following keywords for XML indexes:- KEYCARD
- FREQVAL MOST|LEAST|BOTH
- HISTOGRAM
XML indexes are related to XML tables, and not to the associated base tables. If you specify a base table space and an XML index in the same RUNSTATS control statement, DB2 generates an error. When you run RUNSTATS against a base table, RUNSTATS collects statistics only for indexes on the base table, including the document ID index.
- UNLOAD
You can use the UNLOAD utility to unload XML data.
The output data can be in the textual XML format or the binary XML format. Data that is unloaded can be in the delimited or non-delimited format.
When data is unloaded in the binary XML format, UNLOAD does not add whitespace tags.
In the UNLOAD statement, specify the base table space. (You do not have to specify the XML table space.) Also specify the XML keyword in the field specifications for the XML columns.
For example, the following UNLOAD statement specifies that DB2 is to unload data from the XMLSAMP table into the SYSREC data set in delimited format.
Assume that the table contains the data in Table 4. The output, delimited data is://STEP3 EXEC DSNUPROC,UID='JUQBU105.UNLD1', // UTPROC='', // SYSTEM='SSTR' //UTPRINT DD SYSOUT=* //SYSREC DD DSN=JUQBU105.UNLD1.STEP3.TBQB0501,DISP=(MOD,DELETE,CATLG), // UNIT=SYSDA,SPACE=(4000,(20,20),,,ROUND) //SYSPUNCH DD DSN=JUQBU105.UNLD1.STEP3.SYSPUNCH // DISP=(MOD,CATLG,CATLG) // UNIT=SYSDA,SPACE=(4000,(20,20),,,ROUND) //SYSIN DD* UNLOAD TABLESPACE DBQB0501.XMLSAMP DELIMITED CHARDEL X'22' COLDEL X'2C' DECPT X'2E' FROM TABLE ADMF001.BASETBL (CATEGORY POSITION(*) CHAR, PURCHASE_ORDER POSITION(*) XML) UNICODE /*
Shovel,"<product pid=""100-100-01"" xmlns="http://podemo.org""> <description> <name>Snow Shovel, Basic 22""</name> <details>Basic Snow Shovel, 22"" wide, straight handle with D-Grip<details> <price>9.99</price> <weight>1 kg<weight> </description> <product>" Shovel,"<product pid=""100-101-01"" xmlns="http://podemo.org""> <description> <name>Snow Shovel, Deluxe 24""</name> <details>A Deluxe Snow Shovel, 24 inches wide, ergonomic curved handle with D-Grip<details> <price>19.99</price> <weight>2 kg<weight> </description> <product>"
For maximum portability, specify UNICODE in the UNLOAD statement and use Unicode delimiter characters. If XML columns are not being unloaded in UTF-8 CCSID 1208, the unloaded column values are prefixed with a standard XML encoding declaration that specifies the encoding that is used.
Unloading XML data with the UNLOAD utility has the following restrictions:- You cannot specify that the XML data be converted to another data type, such as CHAR or VARCHAR.
- You cannot unload XML data from a copy.
- Inline statistics and inline copies
- When you request that a utility gathers statistics or make a copy inline, the following restrictions apply:
- You cannot take inline copies of XML table spaces.
- When you request inline statistics or inline copies for the base table space, DB2 does not take copies or gather statistics for any related XML table spaces. You must explicitly specify that you want statistics gathered for any XML table spaces.
- When you request that inline statistics be collected for an XML
index, you cannot specify the following statistic keywords:
- HISTOGRAM
- KEYCARD
- FREQVAL NUMCOLS COUNT
- Stand-alone utilities
Stand-alone utilities have no specific options to support XML data. However, you can use stand-alone utilities on XML data.
The DSN1COPY utility has the following restriction on use with XML data:
- Do not use DSN1COPY to copy XML table spaces from one subsystem to another.