Resolving indexing errors when loading XML data

Load operations that fail due to indexing errors can be resolved using the db2diag log file and the import utility together to identify and correct problem values in the XML data.

About this task

If a load operation returns the error message SQL20305N (sqlcode -20305), this indicates that one or more XML node values could not be indexed. The error message will output the reason code for the error. Enter ? SQL20305N in the command line processor to look up the explanation and user response for the corresponding reason code.

For indexing problems during insert operations, a generated XQuery statement is output to the db2diag log file to help locate the failing XML node values within the document. See Common XML indexing issues for details about how to use the XQuery statement to locate the failing XML node values.

For indexing problems during load operations, however, the generated XQuery statements are not output to the db2diag log file. To generate these XQuery statements the import utility must be run on the failing rows that were not loaded. Because the rejected rows do not exist in the table, the XQuery statements cannot be run on the failing documents. To solve this problem, a new table with the same definition must be created without any indexes. The failing rows can then be loaded into the new table, and the XQuery statements can then be run on the new table to locate the failing XML node values within the documents.

Perform the following steps to resolve the indexing errors:

Procedure

  1. Determine which rows were rejected during the load operation using the record numbers in the output information.
  2. Create a .del file containing only the rejected rows.
  3. Create a new table (for example, T2) with the same columns as the original table (T1). Do not create any indexes on the new table.
  4. Load the rejected rows into the new table T2.
  5. For each rejected row in the original table T1:
    1. Import the rejected rows to T1 to get the SQL20305N message. The import will stop on the first error that it encounters.
    2. Look in the db2diag log file and get the generated XQuery statement. To find the failing node values in the input document, search for the string 'SQL20305N' in the db2diag log file and match the reason code number. Following the reason code, you will find a set of instructions and then a generated XQuery statement that you can use to locate the problem value in the document that caused the error.
    3. Modify the XQuery statement to use the new table T2.
    4. Run the XQuery statement on T2 to locate the problem value in the document.
    5. Fix the problem value in the .xml file containing the document.
    6. Return to Step a and import the rejected rows to T1 again. The row that caused the import to stop should now be inserted successfully. If there is another rejected row in the .del file, the import utility will stop on the next error and output another SQL20305N message. Continue these steps until the import runs successfully.

Example

In the following example, the index BirthdateIndex has been created on the date data type. The REJECT INVALID VALUES option is specified, so the XML pattern values for /Person/Confidential/Birthdate must all be valid for the date data type. If any XML pattern value cannot be cast to this data type, an error is returned.

Using the following XML documents, five rows are supposed to be loaded but the first and the fourth rows will be rejected because the Birthdate values cannot be indexed. In the file person1.xml, the value March 16, 2002 is not in the correct date format. In the file person4.xml, the value 20000-12-09 has an extra zero for the year, so it is a valid XML date value but it is outside of the range that Db2® allows for a year (0001 to 9999). Some of the sample output has been edited to make the example more concise.

The five XML files to load are as follows:

person1.xml (Birthdate value is not valid)
<?xml version="1.0"?>
<Person gender="Male">
  <Name>
    <Last>Cool</Last>
    <First>Joe</First>
  </Name>
  <Confidential>
    <Age unit="years">5</Age>
    <Birthdate>March 16, 2002</Birthdate>
    <SS>111-22-3333</SS> 
  </Confidential>
  <Address>5224 Rose St. San Jose, CA 95123</Address>  
</Person>
person2.xml (Birthdate value is valid)
<?xml version="1.0"?>
<Person gender="Male">
  <Name>
    <Last>Cool</Last>
    <First>Joe</First>
  </Name>
  <Confidential>
    <Age unit="years">5</Age>
    <Birthdate>2002-03-16</Birthdate>
    <SS>111-22-3333</SS> 
  </Confidential>
  <Address>5224 Rose St. San Jose, CA 95123</Address>
</Person>
person3.xml (Birthdate value is valid)
<?xml version="1.0"?>
<Person gender="Female">
  <Name>
    <Last>McCarthy</Last>
    <First>Laura</First>
  </Name>
  <Confidential>
    <Age unit="years">6</Age>
    <Birthdate>2001-03-12</Birthdate>
    <SS>444-55-6666</SS> 
  </Confidential>
  <Address>5960 Daffodil Lane, San Jose, CA 95120</Address>
</Person>
person4.xml (Birthdate value is not valid)
<?xml version="1.0"?>
<Person gender="Female">
  <Name>
    <Last>Wong</Last>
    <First>Teresa</First>
  </Name>
  <Confidential>
    <Age unit="years">7</Age>
    <Birthdate>20000-12-09</Birthdate>
    <SS>555-66-7777</SS> 
  </Confidential>
  <Address>5960 Tulip Court, San Jose, CA 95120</Address>
</Person>
person5.xml (Birthdate value is valid)
<?xml version="1.0"?>
<Person gender="Male">
  <Name>
    <Last>Smith</Last>
    <First>Chris</First>
  </Name>
  <Confidential>
    <Age unit="years">10</Age>
    <Birthdate>1997-04-23</Birthdate>
    <SS>666-77-8888</SS> 
  </Confidential>
  <Address>5960 Dahlia Street, San Jose, CA 95120</Address>
</Person>
The input file person.del contains:
1, <XDS FIL='person1.xml'/>
2, <XDS FIL='person2.xml'/>
3, <XDS FIL='person3.xml'/>
4, <XDS FIL='person4.xml'/>
5, <XDS FIL='person5.xml'/>
The DDL and LOAD statements are as follows:
CREATE TABLE T1 (docID INT, XMLDoc XML);

CREATE INDEX BirthdateIndex ON T1(xmlDoc)
   GENERATE KEY USING XMLPATTERN '/Person/Confidential/Birthdate' AS SQL DATE
   REJECT INVALID VALUES;

LOAD FROM person.del OF DEL INSERT INTO T1
To resolve the indexing errors that would occur when you attempt to load the set of XML files listed previously, you would perform the following steps:
  1. Determine which rows were rejected during the load operation using the record numbers in the output information. In the following output, record number 1 and record number 4 were rejected.
    SQL20305N  An XML value cannot be inserted or updated because of an error 
    detected when inserting or updating the index identified by "IID = 3" on table 
    "LEECM.T1".  Reason code = "5".  For reason codes related to an XML schema the 
    XML schema identifier = "*N" and XML schema data type = "*N".  SQLSTATE=23525
    
    SQL3185W  The previous error occurred while processing data from row "F0-1" of 
    the input file.
    
    SQL20305N  An XML value cannot be inserted or updated because of an error 
    detected when inserting or updating the index identified by "IID = 3" on table 
    "LEECM.T1".  Reason code = "4".  For reason codes related to an XML schema the 
    XML schema identifier = "*N" and XML schema data type = "*N".  SQLSTATE=23525
    
    SQL3185W  The previous error occurred while processing data from row "F0-4" of 
    the input file.
    
    SQL3227W  Record token "F0-1" refers to user record number "1".
    
    SQL3227W  Record token "F0-4" refers to user record number "4".
    
    SQL3107W  There is at least one warning message in the message file.
    
    Number of rows read         = 5
    Number of rows skipped      = 0
    Number of rows loaded       = 3
    Number of rows rejected     = 2
    Number of rows deleted      = 0
    Number of rows committed    = 5
  2. Create a new file reject.del with the rejected rows.
    1, <XDS FIL='person1.xml'/>
    4, <XDS FIL='person4.xml'/>
  3. Create a new table T2 with the same columns as the original table T1. Do not create any indexes on the new table.
    CREATE TABLE T2 LIKE T1
  4. Load the rejected rows into the new table T2.
    LOAD FROM reject.del OF DEL INSERT INTO T2;
  5. For rejected row 1 in the original table T1:
    1. Import the rejected rows to T1 to get the -20305 message
      IMPORT FROM reject.del OF DEL INSERT INTO T1
      SQL3109N  The utility is beginning to load data from file "reject.del".
      
      SQL3306N  An SQL error "-20305" occurred while inserting a row into the 
                table.
      
      SQL20305N  An XML value cannot be inserted or updated because of an error 
      detected when inserting or updating the index identified by "IID = 3" on 
      table "LEECM.T1".  Reason code = "5".  For reason codes related to an XML 
      schema the XML schema identifier = "*N" and XML schema data type = "*N".  
      SQLSTATE=23525
      
      SQL3110N  The utility has completed processing.  "1" rows were read from
      the input file.
    2. Look in the db2diag log file and get the generated XQuery statement.
      FUNCTION: DB2 UDB, Xml Storage and Index Manager, xmlsDumpXQuery, probe:608
      DATA #1 : String, 36 bytes
      SQL Code: SQL20305N ; Reason Code: 5
      DATA #2 : String, 265 bytes
      To locate the value in the document that caused the error, create a 
      table with one XML column and insert the failing document in the table. 
      Replace the table and column name in the query below with the created 
      table and column name and execute the following XQuery.
      DATA #3 : String, 247 bytes
      xquery for $i in db2-fn:xmlcolumn(
        "LEECM.T1.XMLDOC")[/*:Person/*:Confidential/*:Birthdate="March 16, 2002"] 
      return 
      <Result>
       <ProblemDocument> {$i} </ProblemDocument>
       <ProblemValue>{$i/*:Person/*:Confidential/*:Birthdate/..} </ProblemValue>
      </Result>;
    3. Modify the XQuery statement to use the new table T2.
      xquery for $i in db2-fn:xmlcolumn(
        "LEECM.T2.XMLDOC")[/*:Person/*:Confidential/*:Birthdate="March 16, 2002"] 
      return 
      <Result>
       <ProblemDocument> {$i} </ProblemDocument>
       <ProblemValue>{$i/*:Person/*:Confidential/*:Birthdate/..} </ProblemValue>
      </Result>;
    4. Run the XQuery statement on table T2 to locate the problem value in the document.
      <Result><ProblemDocument><Person gender="Male">
        <Name>
          <Last>Cool</Last>
          <First>Joe</First>
        </Name>
        <Confidential>
          <Age unit="years">5</Age>
          <Birthdate>March 16, 2002</Birthdate>
          <SS>111-22-3333</SS> 
        </Confidential>
        <Address>5224 Rose St. San Jose, CA 95123</Address>  
      </Person></ProblemDocument><ProblemValue><Confidential>
          <Age unit="years">5</Age>
          <Birthdate>March 16, 2002</Birthdate>
          <SS>111-22-3333</SS> 
        </Confidential></ProblemValue></Result> 
    5. Fix the problem value in the file person1.xml containing the document. March 16, 2002 is not in the correct date format so it is changed to 2002-03-16.
      <?xml version="1.0"?>
      <Person gender="Male">
        <Name>
          <Last>Cool</Last>
          <First>Joe</First>
        </Name>
        <Confidential>
          <Age unit="years">5</Age>
          <Birthdate>2002-03-16</Birthdate>
          <SS>111-22-3333</SS> 
        </Confidential>
        <Address>5224 Rose St. San Jose, CA 95123</Address>  
      </Person>
    6. Go back to step a. to import the rejected rows to table T1 again.
  6. (First repetition of Step 5)
    1. Import the rejected rows to table T1. The first row is now imported successfully because two rows were read from the import file. A new error occurs on the second row.
      IMPORT FROM reject.del OF DEL INSERT INTO T1
      SQL3109N  The utility is beginning to load data from file "reject.del".
      
      SQL3306N  An SQL error "-20305" occurred while inserting a row into the 
                table.
      
      SQL20305N  An XML value cannot be inserted or updated because of an error 
      detected when inserting or updating the index identified by "IID = 3" on 
      table "LEECM.T1".  Reason code = "4".  For reason codes related to an XML 
      schema the XML schema identifier = "*N" and XML schema data type = "*N".  
      SQLSTATE=23525
      
      SQL3110N  The utility has completed processing.  "2" rows were read from 
      the input file.
    2. Look in the db2diag log file and get the generated XQuery statement.
      FUNCTION: DB2 UDB, Xml Storage and Index Manager, xmlsDumpXQuery, probe:608
      DATA #1 : String, 36 bytes
      SQL Code: SQL20305N ; Reason Code: 4
      DATA #2 : String, 265 bytes
      To locate the value in the document that caused the error, create a 
      table with one XML column and insert the failing document in the table. 
      Replace the table and column name in the query below with the created 
      table and column name and execute the following XQuery.
      DATA #3 : String, 244 bytes
      xquery for $i in db2-fn:xmlcolumn("LEECM.T1.XMLDOC")
         [/*:Person/*:Confidential/*:Birthdate="20000-12-09"] 
      return 
      <Result>
       <ProblemDocument> {$i} </ProblemDocument>
       <ProblemValue>{$i/*:Person/*:Confidential/*:Birthdate/..} </ProblemValue>
      </Result>;  
    3. Modify the XQuery statement to use table T2.
      xquery for $i in db2-fn:xmlcolumn("LEECM.T2.XMLDOC")
         [/*:Person/*:Confidential/*:Birthdate="20000-12-09"] 
      return 
      <Result>
       <ProblemDocument> {$i} </ProblemDocument>
       <ProblemValue>{$i/*:Person/*:Confidential/*:Birthdate/..} </ProblemValue>
      </Result>;
    4. Run the XQuery statement to locate the problem value in the document.
      <Result><ProblemDocument><Person gender="Female">
        <Name>
          <Last>Wong</Last>
          <First>Teresa</First>
        </Name>
        <Confidential>
          <Age unit="years">7</Age>
          <Birthdate>20000-12-09</Birthdate>
          <SS>555-66-7777</SS> 
        </Confidential>
        <Address>5960 Tulip Court, San Jose, CA 95120</Address>
      </Person></ProblemDocument><ProblemValue><Confidential>
          <Age unit="years">7</Age>
          <Birthdate>20000-12-09</Birthdate>
          <SS>555-66-7777</SS> 
        </Confidential></ProblemValue></Result>
    5. Fix the problem value in the file person4.xml containing the document. The value 20000-12-09 has an extra zero for the year so it is outside of the range that Db2 allows for a year (0001 to 9999).. The value is changed to 2000-12-09.
      <?xml version="1.0"?>
      <Person gender="Female">
        <Name>
          <Last>Wong</Last>
          <First>Teresa</First>
        </Name>
        <Confidential>
          <Age unit="years">7</Age>
          <Birthdate>2000-12-09</Birthdate>
          <SS>555-66-7777</SS> 
        </Confidential>
        <Address>5960 Tulip Court, San Jose, CA 95120</Address>
      </Person>
    6. Go back to step a to import the rejected rows to T1 again.
  7. (Second repetition of Step 5)
    1. Import the rejected rows to T1.
      IMPORT FROM reject.del OF DEL INSERT INTO T1
      SQL3109N  The utility is beginning to load data from file "reject.del".
      
      SQL3110N  The utility has completed processing.  "2" rows were read from 
      the input file.
      
      SQL3221W  ...Begin COMMIT WORK. Input Record Count = "2".
      
      SQL3222W  ...COMMIT of any database changes was successful.
      
      SQL3149N  "2" rows were processed from the input file.  "2" rows were 
      successfully inserted into the table.  "0" rows were rejected.
      
      Number of rows read         = 2
      Number of rows skipped      = 0
      Number of rows inserted     = 2
      Number of rows updated      = 0
      Number of rows rejected     = 0
      Number of rows committed    = 2
    The problem is now resolved. All of the rows of person.del are successfully inserted into table T1.