XML support in DB2 utilities

You can use IBM® DB2® for z/OS® utilities on XML objects. The utilities handle XML objects similar to the way that they handle LOB objects. For some utilities, you need to specify certain XML keywords.

CHECK DATA

In addition to checking LOB relationships, the CHECK DATA utility also checks XML relationships.

CHECK DATA can check the consistency between a base table space and the corresponding XML table spaces.

Start of changeIn addition, if you specify the INCLUDE XML TABLESPACES option, CHECK DATA can check the structural integrity of XML documents. CHECK DATA can verify the following items for XML objects:End of change

Start of change
  • All rows in an XML column exist in the XML table space.
  • All documents in the XML table space are structurally valid.
  • Each index entry in the node ID index has a corresponding XML document.
  • Each XML document in the XML table space has corresponding entries in the node ID index.
  • Each entry in the document ID column in the base table space has a corresponding entry in the node ID index over the XML table space.
  • Each entry in the node ID index contains a corresponding value in the document ID column.
  • Each value in the document ID column has a corresponding entry in the document ID index.
  • Each entry in the document ID index has a corresponding value in the document ID column.
  • If an XML column has an XML type modifier, all XML documents in the column are valid with respect to at least one XML schema that is associated with the XML type modifier.
End of change

Start of changeIf the base table space is not consistent with any related XML table spaces, or a problem is found during any of the previously listed checks, CHECK DATA reports the error.End of change

Start of changeFor XML checking, the default behavior of CHECK DATA is to check only the consistency between each XML column and its node ID index. However, you can modify the scope of checking by specifying combinations of the CHECK DATA SCOPE keyword and the INCLUDE XML TABLESPACES keyword. The following table lists keyword combinations and the types of checks that are performed.
Table 1. CHECK DATA SCOPE and INCLUDE XML TABLESPACES keywords that control the scope of XML checking
Scope of XML checking SCOPE keyword INCLUDE XML TABLESPACES keyword
Consistency of XML base table column and node ID index only SCOPE AUXONLY, SCOPE ALL, or SCOPE PENDING Not specified
All XML checking SCOPE ALL or SCOPE PENDING INCLUDE XML TABLESPACES
Start of changeAll XML checking except XML schema validationEnd of change Start of changeSCOPE ALL or SCOPE PENDINGEnd of change Start of changeINCLUDE XML TABLESPACES XMLSCHEMAEnd of change
XML schema validation only SCOPE XMLSCHEMAONLY INCLUDE XML TABLESPACES
End of change
For example, table space DSNXDX1A.DSNXSX1D contains a table named XMLTBL with XML column XMLCOL, which has an XML type modifier. If you specify the following statement, CHECK DATA checks LOB relationships, the base table space, XML relationships, and the structural integrity of XML documents for column XMLCOL, and does XML schema validation on the documents for column XMLCOL:
CHECK DATA TABLESPACE DSNXDX1A.DSNXSX1D 
 INCLUDE XML TABLESPACES(TABLE XMLTBL XMLCOLUMN XMLCOL)
If you specify the following statement, CHECK DATA checks LOB relationships, the base table space, XML relationships, and the structural integrity of XML documents for column XMLCOL, but does not do XML schema validation on the documents for column XMLCOL:Start of change
CHECK DATA TABLESPACE DSNXDX1A.DSNXSX1D 
 INCLUDE XML TABLESPACES(TABLE XMLTBL XMLCOLUMN XMLCOL) XMLSCHEMA
End of change

You can also specify the action that DB2 performs when it finds an error in one of these columns by specifying one of the following appropriate keywords:

Table 2. CHECK DATA error keywords
Column in error Action that CHECK DATA takes Keyword
XML column Report the error only XMLERROR REPORT
Report the error, set the column in error to an invalid status, and delete the invalid documents in the XML table space XMLERROR INVALIDATE
LOB column Report the error only LOBERROR REPORT
Report the error and set the column in error to an invalid status LOBERROR INVALIDATE
XML or LOB column Report the error only AUXERROR REPORT
Report the error and set the column in error to an invalid status AUXERROR INVALIDATE
For example, the following statement specifies that CHECK DATA is to check XML and LOB relationships. DB2 is to report any LOB errors and XML errors, and set any XML columns in error to an invalid status.
CHECK DATA TABLESPACE DSNXDX1A.DSNXSX1D
                 SCOPE AUXONLY  
      LOBERROR REPORT
      XMLERROR INVALIDATE
CHECK INDEX

You can use the CHECK INDEX utility to check XML indexes, document ID indexes, and node ID indexes. You do not need to specify any additional keywords.

COPY

You can use the COPY utility to copy XML objects. You do not need to specify any additional keywords. When you specify that DB2 is to copy a table space with XML columns, DB2 does not automatically copy any related XML table spaces or indexes. You must explicitly specify the XML objects that you want to copy.

For example, the following statement specifies that DB2 is to copy base table space DB1.BASETS1 and the XML table space DB1.XBAS0001.
//COPYX    EXEC DSNUPROC,SYSTEM=DSN 
//SYSIN    DD *   
           TEMPLATE A DSN(&DB..&SP..COPY1) UNIT CART STACK YES
COPY      TABLESPACE DB1.BASETS1 COPYDDN(A)       
          TABLESPACE DB1.XBAS0001 COPYDDN(A)       
COPYTOCOPY

You can use the COPYTOCOPY utility to copy existing copies of the XML objects. You do not need to specify any additional keywords.

EXEC SQL

You cannot declare a cursor that includes XML data. Thus, you cannot use the DB2 UDB family cross-loader function to transfer data in XML columns. However, you can declare a cursor on a table with XML columns if the cursor does not include any XML columns.

For example, suppose that you create the following table with an XML column:
CREATE TABLE ORDERS 
  (ORDERNO INTEGER,
   PURCHASE_ORDER XML);
You cannot declare the following cursor, because it includes XML data in the PURCHASE_ORDER column:
EXEC SQL
  DECLARE C1 CURSOR FOR SELECT * FROM ORDERS
ENDEXEC
However, you can declare a cursor that includes non-XML columns, as in the following example:
EXEC SQL
  DECLARE C2 CURSOR FOR SELECT ORDERNO FROM ORDERS
ENDEXEC
LISTDEF
When you create object lists with the LISTDEF utility, you can specify whether you want related XML objects to be included or excluded. Use the following keywords to indicate the objects to include or exclude:
ALL
Base, LOB, and XML objects (This keyword is the default.)
BASE
Base objects only
LOB
LOB objects only.
XML
XML objects only.
For example, the LISTDEF statements in the following table generate the indicated lists.
Table 3. Example LISTDEF statements
LISTDEF statement Objects that are included in the list
LISTDEF LISTALL INCLUDE TABLESPACES DATABASE ACCOUNTS
                INCLUDE INDEXSPACES DATABASE ACCOUNTS
  • All tables spaces in the ACCOUNTS database, including XML and LOB table spaces
  • All index spaces in the ACCOUNTS database
LISTDEF LISTXML INCLUDE TABLESPACES DATABASE ACCOUNTS XML
                INCLUDE INDEXSPACES DATABASE ACCOUNTS XML
  • All XML table spaces in the ACCOUNTS database
  • All XML index spaces in the ACCOUNTS database
LISTDEF LIST   INCLUDE TABLESPACES DATABASE ACCOUNTS ALL
               INCLUDE INDEXSPACES DATABASE ACCOUNTS ALL
               EXCLUDE INDEXSPACES DATABASE ACCOUNTS XML
  • All tables spaces in the ACCOUNTS database, including XML and LOB table spaces
  • All index spaces the ACCOUNTS database except for XML index spaces
LOAD

You can use the LOAD utility to load XML data.

Start of changeThe input data can be in the textual XML format or the binary XML format (Extensible Dynamic Binary XML DB2 Client/Server Binary XML Format). Binary XML input data must be in the non-delimited format.End of change

Start of changeIf you load data into an XML column that has an XML type modifier, the LOAD utility validates the input data according to the XML schema that is specified in the XML type modifier. If LOAD detects an XML schema violation for a row, it deletes the row and issues an error message.End of change

The steps for loading XML data are similar to the steps for loading other types of data, except that you need to also perform the following actions:
  • In the input data set:
    • If the data set is in delimited format, ensure that the XML input fields follow the standard LOAD utility delimited format.
    • If the data set is not in delimited format, specify the XML input fields similar to the way that you specify VARCHAR input. Specify the length of the field in a 2-byte binary field that precedes the data.
  • In the LOAD statement:
    • Specify the keyword XML for all input fields of type XML.
    • If you want the whitespace to be preserved in the XML data, also specify the keywords PRESERVE WHITESPACE. By default, LOAD strips the whitespace.

      Start of changeWhen data in the binary XML format is loaded into a table, and PRESERVE WHITESPACE is not specified, DB2 strips whitespace only when the input data contains whitespace tags.End of change

For example, the following LOAD statement specifies that DB2 is to load data from the MYSYSREC data set into the PRODUCTS table:
LOAD DATA INDDN(MYSYSREC)
 INTO TABLE PRODUCTS
  (CATEGORY       POSITION  (1) CHAR(8),
   PURCHASE_ORDER POSITION  (10) XML PRESERVE WHITESPACE)
Assume that the MYSYSREC data set logically contains the following data: (This example is a logical representation and is not intended to show how the data actually looks in the input data set.)
Shovel 339<product pid="100-100-01" xmlns="http://podemo.org"> 
            <description> 
                 <name>Snow Shovel, Basic 22"</name>
                 <details>Basic Snow Shovel, 22" wide, straight handle with D-Grip<details>
                 <price>9.99</price>
                 <weight>1 kg<weight>
            </description>
         <product>

Shovel  358<product pid="100-101-01" xmlns="http://podemo.org"> 
            <description> 
                 <name>Snow Shovel, Deluxe 24"</name>
                 <details>A Deluxe Snow Shovel, 24 inches wide, ergonomic curved handle with 
                 D-Grip<details>
                 <price>19.99</price>
                 <weight>2 kg<weight>
            </description>
         <product>
After loading this data, the PRODUCTS table contains the following information:
Table 4. PRODUCTS table
DB2_GENERATED _document ID_FOR_XML1 CATEGORY PURCHASE_ORDER
1 Shovel
<product pid="100-100-01" xmlns="http://podemo.org">
   <description>
       <name>Snow Shovel, Basic 22"</name>
      <details>Basic Snow Shovel, 22" wide, straight handle
          with D-Grip</details>
       <price>9.99</price>
       <weight>1 kg<weight>
   </description>
<product>
2 Shovel
<product pid="100-101-01" xmlns="http://podemo.org">
   <description>
       <name>Snow Shovel, Deluxe 24"</name>
      <details>A Deluxe Snow Shovel, 24 inches wide,  
        ergonomic curved handle with D-Grip</details>
       <price>19.99</price>
       <weight>2 kg<weight>
   </description>
<product>
Note:
  1. DB2 automatically generates the document ID column for each row that is loaded into a table with at least one XML column. The document ID column is partially hidden. It is not included in the result set of a SELECT * statement. However, you can query this column by name and view information about this column and its index in the catalog. Several utilities report information on this column in their output.
Loading XML data with the LOAD utility has the following restrictions:
  • You cannot specify that XML input fields be loaded into non-XML columns, such as CHAR or VARCHAR columns.
  • DB2 does not perform any specified compression until the next time that you run the REORG utility on this data.
  • DB2 ignores any specified FREEPAGE and PCTFREE values until the next time that you run the REORG utility on this data.
  • If you specify PREFORMAT, DB2 preformats the base table space, but not the XML table space.
  • You cannot directly load the document ID column of the base table space.
  • You cannot specify a default value for an XML column.
  • You cannot load XML values that are greater than 32 KB. To load such values, use file reference variables in the LOAD utility, or use applications with SQL XML AS file reference variables.
QUIESCE

When you specify QUIESCE TABLESPACESET, the table space set includes related XML objects. You do not have to specify any additional keywords in the QUIESCE statement.

REBUILD INDEX

You can use the REBUILD INDEX utility to rebuild XML indexes, document ID indexes, and node ID indexes. You do not need to specify any additional keywords in the REBUILD INDEX statement. REBUILD INDEX with SHRLEVEL CHANGE is not valid for XML indexes.

RECOVER

You can use the RECOVER utility to recover XML objects. You do not need to specify any additional keywords in the RECOVER statement.

When you recover an XML index or table space to a point in time, you should recover all related objects to the same point in time. Related objects include XML objects, LOB objects, and referentially related objects. If you do not recover all related objects to the same point in time, one or more objects might be placed in a restrictive state.

REORG INDEX

You can use the REORG utility to reorganize XML indexes. When you specify that you want XML indexes to be reorganized, you must also specify the WORKDDN keyword and provide the specified temporary work file. The default is SYSUT1.

REORG TABLESPACE

You can use the REORG TABLESPACE utility to reorganize XML objects. You do not need to specify any additional keywords in the REORG statement.

When you specify the name of the base table space in the REORG statement, DB2 reorganizes only that table space and not any related XML objects. If you want DB2 to reorganize the XML objects, you must specify those object names.

Start of changeWhen you run REORG on an XML table space that supports XML versions, REORG discards rows for versions of an XML document that are no longer needed.End of change

For XML table spaces and base table spaces with XML columns, you cannot specify the following options in the REORG statement:
  • DISCARD
  • REBALANCE
  • UNLOAD EXTERNAL
REPAIR

You can use the REPAIR utility on XML objects.

Start of changeYou can use the REPAIR utility to:
  • Set the status of an XML column to invalid.
  • Delete a corrupted XML document and its node ID index entries.
End of change

Start of changeThe most common use for the REPAIR utility for XML objects is to take corrective action after you run CHECK DATA with SHRLEVEL CHANGE on a table space with XML columns. CHECK DATA with SHRLEVEL CHANGE operates on shadow data sets, so it does not modify XML columns or XML table spaces. Instead, CHECK DATA generates REPAIR statements that you can run to delete invalid XML documents and to mark the corresponding XML columns as invalid.End of change

REPORT
When you specify REPORT TABLESPACESET, the output report includes XML objects in the list of members in the table space set. The following sample output shows a table space set for a table that contains a LOB column and an XML column:
TABLESPACE SET REPORT:                              
                                                    
TABLESPACE         : DBDKCX.TS0001                  
  TABLE            : SYSADM.DKCTEST                 
  INDEXSPACE       : DBDKCX.IRdocument IDD                
  INDEX            : SYSADM.I_document IDDKCTEST          
                                                    
                                                    
XML TABLESPACE SET REPORT:                          
                                                    
TABLESPACE                 : DBDKCX.TS0001          
                                                    
  BASE TABLE               : SYSADM.DKCTEST         
  COLUMN                   : COL2                   
    XML TABLESPACE         : DBDKCX.XDKC0000        
      XML TABLE            : SYSADM.XDKCTEST        
      XML NODEID INDEXSPACE: DBDKCX.IRNODEID        
      XML NODEID INDEX     : SYSADM.I_NODEIDXDKCTEST
      XML INDEXSPACE       : DBDKCX.VALUES          
      XML INDEX            : SYSADM.VALUES          
RUNSTATS

You can use the RUNSTATS utility to gather statistics for XML objects.

RUNSTATS ignores the following keywords for XML tables and XML indexes:
  • COLGROUP
  • FREQVAL MOST|LEAST|BOTH
  • HISTOGRAM
RUNSTATS INDEX ignores the following keywords for XML indexes:
  • KEYCARD
  • FREQVAL MOST|LEAST|BOTH
  • HISTOGRAM

XML indexes are related to XML tables, and not to the associated base tables. If you specify a base table space and an XML index in the same RUNSTATS control statement, DB2 generates an error. When you run RUNSTATS against a base table, RUNSTATS collects statistics only for indexes on the base table, including the document ID index.

UNLOAD

You can use the UNLOAD utility to unload XML data.

Start of changeThe output data can be in the textual XML format or the binary XML format. Data that is unloaded can be in the delimited or non-delimited format.End of change

Start of changeWhen data is unloaded in the binary XML format, UNLOAD does not add whitespace tags.End of change

In the UNLOAD statement, specify the base table space. (You do not have to specify the XML table space.) Also specify the XML keyword in the field specifications for the XML columns.

For example, the following UNLOAD statement specifies that DB2 is to unload data from the XMLSAMP table into the SYSREC data set in delimited format.
//STEP3    EXEC DSNUPROC,UID='JUQBU105.UNLD1',
//         UTPROC='',
//         SYSTEM='SSTR'
//UTPRINT  DD SYSOUT=*          
//SYSREC   DD DSN=JUQBU105.UNLD1.STEP3.TBQB0501,DISP=(MOD,DELETE,CATLG),
//         UNIT=SYSDA,SPACE=(4000,(20,20),,,ROUND) 
//SYSPUNCH DD DSN=JUQBU105.UNLD1.STEP3.SYSPUNCH
//         DISP=(MOD,CATLG,CATLG)
//         UNIT=SYSDA,SPACE=(4000,(20,20),,,ROUND)
//SYSIN    DD*          
  UNLOAD TABLESPACE DBQB0501.XMLSAMP       
         DELIMITED CHARDEL X'22' COLDEL X'2C' DECPT X'2E'
          FROM TABLE ADMF001.BASETBL
          (CATEGORY          POSITION(*) CHAR, 
           PURCHASE_ORDER    POSITION(*) XML) 
         UNICODE
 /*
Assume that the table contains the data in Table 4. The output, delimited data is:
Shovel,"<product pid=""100-100-01"" xmlns="http://podemo.org""> 
            <description> 
                 <name>Snow Shovel, Basic 22""</name>
                 <details>Basic Snow Shovel, 22"" wide, straight handle with D-Grip<details>
                 <price>9.99</price>
                 <weight>1 kg<weight>
            </description>
         <product>"

Shovel,"<product pid=""100-101-01"" xmlns="http://podemo.org""> 
            <description> 
                 <name>Snow Shovel, Deluxe 24""</name>
                 <details>A Deluxe Snow Shovel, 24 inches wide, ergonomic curved handle with 
                 D-Grip<details>
                 <price>19.99</price>
                 <weight>2 kg<weight>
            </description>
         <product>"

For maximum portability, specify UNICODE in the UNLOAD statement and use Unicode delimiter characters. If XML columns are not being unloaded in UTF-8 CCSID 1208, the unloaded column values are prefixed with a standard XML encoding declaration that specifies the encoding that is used.

Unloading XML data with the UNLOAD utility has the following restrictions:
  • You cannot specify that the XML data be converted to another data type, such as CHAR or VARCHAR.
  • You cannot unload XML data from a copy.
Inline statistics and inline copies
When you request that a utility gathers statistics or make a copy inline, the following restrictions apply:
  • You cannot take inline copies of XML table spaces.
  • When you request inline statistics or inline copies for the base table space, DB2 does not take copies or gather statistics for any related XML table spaces. You must explicitly specify that you want statistics gathered for any XML table spaces.
  • When you request that inline statistics be collected for an XML index, you cannot specify the following statistic keywords:
    • HISTOGRAM
    • KEYCARD
    • FREQVAL NUMCOLS COUNT
Stand-alone utilities

Stand-alone utilities have no specific options to support XML data. However, you can use stand-alone utilities on XML data.

The DSN1COPY utility has the following restriction on use with XML data:

  • Do not use DSN1COPY to copy XML table spaces from one subsystem to another.