Resolving indexing errors when loading XML data
Load operations that fail due to indexing errors can be resolved using the db2diag log file and the import utility together to identify and correct problem values in the XML data.
About this task
If a load operation returns the error message SQL20305N
(sqlcode -20305), this indicates that one or more XML node values
could not be indexed. The error message will output the reason code
for the error. Enter ? SQL20305N
in the command line
processor to look up the explanation and user response for the corresponding
reason code.
For indexing problems during insert operations, a generated XQuery statement is output to the db2diag log file to help locate the failing XML node values within the document. See Common XML indexing issues for details about how to use the XQuery statement to locate the failing XML node values.
For indexing problems during load operations, however, the generated XQuery statements are not output to the db2diag log file. To generate these XQuery statements the import utility must be run on the failing rows that were not loaded. Because the rejected rows do not exist in the table, the XQuery statements cannot be run on the failing documents. To solve this problem, a new table with the same definition must be created without any indexes. The failing rows can then be loaded into the new table, and the XQuery statements can then be run on the new table to locate the failing XML node values within the documents.
Perform the following steps to resolve the indexing errors:
Procedure
Example
In the following example, the index BirthdateIndex has been created on the date data type. The REJECT INVALID VALUES option is specified, so the XML pattern values for /Person/Confidential/Birthdate must all be valid for the date data type. If any XML pattern value cannot be cast to this data type, an error is returned.
Using the following XML documents, five rows are supposed to be loaded but the first and the
fourth rows will be rejected because the Birthdate values cannot be indexed. In the file
person1.xml, the value March 16, 2002
is not in the correct
date format. In the file person4.xml, the value 20000-12-09
has an extra zero for the year, so it is a valid XML date value but it is outside of the range that
Db2® allows for a
year (0001 to 9999). Some of the sample output has been edited to make the example more concise.
The five XML files to load are as follows:
<?xml version="1.0"?>
<Person gender="Male">
<Name>
<Last>Cool</Last>
<First>Joe</First>
</Name>
<Confidential>
<Age unit="years">5</Age>
<Birthdate>March 16, 2002</Birthdate>
<SS>111-22-3333</SS>
</Confidential>
<Address>5224 Rose St. San Jose, CA 95123</Address>
</Person>
<?xml version="1.0"?>
<Person gender="Male">
<Name>
<Last>Cool</Last>
<First>Joe</First>
</Name>
<Confidential>
<Age unit="years">5</Age>
<Birthdate>2002-03-16</Birthdate>
<SS>111-22-3333</SS>
</Confidential>
<Address>5224 Rose St. San Jose, CA 95123</Address>
</Person>
<?xml version="1.0"?>
<Person gender="Female">
<Name>
<Last>McCarthy</Last>
<First>Laura</First>
</Name>
<Confidential>
<Age unit="years">6</Age>
<Birthdate>2001-03-12</Birthdate>
<SS>444-55-6666</SS>
</Confidential>
<Address>5960 Daffodil Lane, San Jose, CA 95120</Address>
</Person>
<?xml version="1.0"?>
<Person gender="Female">
<Name>
<Last>Wong</Last>
<First>Teresa</First>
</Name>
<Confidential>
<Age unit="years">7</Age>
<Birthdate>20000-12-09</Birthdate>
<SS>555-66-7777</SS>
</Confidential>
<Address>5960 Tulip Court, San Jose, CA 95120</Address>
</Person>
<?xml version="1.0"?>
<Person gender="Male">
<Name>
<Last>Smith</Last>
<First>Chris</First>
</Name>
<Confidential>
<Age unit="years">10</Age>
<Birthdate>1997-04-23</Birthdate>
<SS>666-77-8888</SS>
</Confidential>
<Address>5960 Dahlia Street, San Jose, CA 95120</Address>
</Person>
1, <XDS FIL='person1.xml'/>
2, <XDS FIL='person2.xml'/>
3, <XDS FIL='person3.xml'/>
4, <XDS FIL='person4.xml'/>
5, <XDS FIL='person5.xml'/>
CREATE TABLE T1 (docID INT, XMLDoc XML);
CREATE INDEX BirthdateIndex ON T1(xmlDoc)
GENERATE KEY USING XMLPATTERN '/Person/Confidential/Birthdate' AS SQL DATE
REJECT INVALID VALUES;
LOAD FROM person.del OF DEL INSERT INTO T1
- Determine which rows were rejected during the load operation using
the record numbers in the output information. In the following output,
record number 1 and record number 4 were rejected.
SQL20305N An XML value cannot be inserted or updated because of an error detected when inserting or updating the index identified by "IID = 3" on table "LEECM.T1". Reason code = "5". For reason codes related to an XML schema the XML schema identifier = "*N" and XML schema data type = "*N". SQLSTATE=23525 SQL3185W The previous error occurred while processing data from row "F0-1" of the input file. SQL20305N An XML value cannot be inserted or updated because of an error detected when inserting or updating the index identified by "IID = 3" on table "LEECM.T1". Reason code = "4". For reason codes related to an XML schema the XML schema identifier = "*N" and XML schema data type = "*N". SQLSTATE=23525 SQL3185W The previous error occurred while processing data from row "F0-4" of the input file. SQL3227W Record token "F0-1" refers to user record number "1". SQL3227W Record token "F0-4" refers to user record number "4". SQL3107W There is at least one warning message in the message file. Number of rows read = 5 Number of rows skipped = 0 Number of rows loaded = 3 Number of rows rejected = 2 Number of rows deleted = 0 Number of rows committed = 5
- Create a new file reject.del with the rejected
rows.
1, <XDS FIL='person1.xml'/> 4, <XDS FIL='person4.xml'/>
- Create a new table T2 with the same columns as the original table
T1. Do not create any indexes on the new table.
CREATE TABLE T2 LIKE T1
- Load the rejected rows into the new table T2.
LOAD FROM reject.del OF DEL INSERT INTO T2;
- For rejected row 1 in the original table T1:
- Import the rejected rows to T1 to get the -20305 message
IMPORT FROM reject.del OF DEL INSERT INTO T1 SQL3109N The utility is beginning to load data from file "reject.del". SQL3306N An SQL error "-20305" occurred while inserting a row into the table. SQL20305N An XML value cannot be inserted or updated because of an error detected when inserting or updating the index identified by "IID = 3" on table "LEECM.T1". Reason code = "5". For reason codes related to an XML schema the XML schema identifier = "*N" and XML schema data type = "*N". SQLSTATE=23525 SQL3110N The utility has completed processing. "1" rows were read from the input file.
- Look in the db2diag log file
and get the generated XQuery statement.
FUNCTION: DB2 UDB, Xml Storage and Index Manager, xmlsDumpXQuery, probe:608 DATA #1 : String, 36 bytes SQL Code: SQL20305N ; Reason Code: 5 DATA #2 : String, 265 bytes To locate the value in the document that caused the error, create a table with one XML column and insert the failing document in the table. Replace the table and column name in the query below with the created table and column name and execute the following XQuery. DATA #3 : String, 247 bytes xquery for $i in db2-fn:xmlcolumn( "LEECM.T1.XMLDOC")[/*:Person/*:Confidential/*:Birthdate="March 16, 2002"] return <Result> <ProblemDocument> {$i} </ProblemDocument> <ProblemValue>{$i/*:Person/*:Confidential/*:Birthdate/..} </ProblemValue> </Result>;
- Modify the XQuery statement to use the new table T2.
xquery for $i in db2-fn:xmlcolumn( "LEECM.T2.XMLDOC")[/*:Person/*:Confidential/*:Birthdate="March 16, 2002"] return <Result> <ProblemDocument> {$i} </ProblemDocument> <ProblemValue>{$i/*:Person/*:Confidential/*:Birthdate/..} </ProblemValue> </Result>;
- Run the XQuery statement on table T2 to locate the problem value
in the document.
<Result><ProblemDocument><Person gender="Male"> <Name> <Last>Cool</Last> <First>Joe</First> </Name> <Confidential> <Age unit="years">5</Age> <Birthdate>March 16, 2002</Birthdate> <SS>111-22-3333</SS> </Confidential> <Address>5224 Rose St. San Jose, CA 95123</Address> </Person></ProblemDocument><ProblemValue><Confidential> <Age unit="years">5</Age> <Birthdate>March 16, 2002</Birthdate> <SS>111-22-3333</SS> </Confidential></ProblemValue></Result>
- Fix the problem value in the file person1.xml containing
the document.
March 16, 2002
is not in the correct date format so it is changed to2002-03-16
.<?xml version="1.0"?> <Person gender="Male"> <Name> <Last>Cool</Last> <First>Joe</First> </Name> <Confidential> <Age unit="years">5</Age> <Birthdate>2002-03-16</Birthdate> <SS>111-22-3333</SS> </Confidential> <Address>5224 Rose St. San Jose, CA 95123</Address> </Person>
- Go back to step a. to import the rejected rows to table T1 again.
- Import the rejected rows to T1 to get the -20305 message
- (First repetition of Step 5)
- Import the rejected rows to table T1. The first row is now imported
successfully because two rows were read from the import file. A new
error occurs on the second row.
IMPORT FROM reject.del OF DEL INSERT INTO T1 SQL3109N The utility is beginning to load data from file "reject.del". SQL3306N An SQL error "-20305" occurred while inserting a row into the table. SQL20305N An XML value cannot be inserted or updated because of an error detected when inserting or updating the index identified by "IID = 3" on table "LEECM.T1". Reason code = "4". For reason codes related to an XML schema the XML schema identifier = "*N" and XML schema data type = "*N". SQLSTATE=23525 SQL3110N The utility has completed processing. "2" rows were read from the input file.
- Look in the db2diag log file
and get the generated XQuery statement.
FUNCTION: DB2 UDB, Xml Storage and Index Manager, xmlsDumpXQuery, probe:608 DATA #1 : String, 36 bytes SQL Code: SQL20305N ; Reason Code: 4 DATA #2 : String, 265 bytes To locate the value in the document that caused the error, create a table with one XML column and insert the failing document in the table. Replace the table and column name in the query below with the created table and column name and execute the following XQuery. DATA #3 : String, 244 bytes xquery for $i in db2-fn:xmlcolumn("LEECM.T1.XMLDOC") [/*:Person/*:Confidential/*:Birthdate="20000-12-09"] return <Result> <ProblemDocument> {$i} </ProblemDocument> <ProblemValue>{$i/*:Person/*:Confidential/*:Birthdate/..} </ProblemValue> </Result>;
- Modify the XQuery statement to use table T2.
xquery for $i in db2-fn:xmlcolumn("LEECM.T2.XMLDOC") [/*:Person/*:Confidential/*:Birthdate="20000-12-09"] return <Result> <ProblemDocument> {$i} </ProblemDocument> <ProblemValue>{$i/*:Person/*:Confidential/*:Birthdate/..} </ProblemValue> </Result>;
- Run the XQuery statement to locate the problem value in the document.
<Result><ProblemDocument><Person gender="Female"> <Name> <Last>Wong</Last> <First>Teresa</First> </Name> <Confidential> <Age unit="years">7</Age> <Birthdate>20000-12-09</Birthdate> <SS>555-66-7777</SS> </Confidential> <Address>5960 Tulip Court, San Jose, CA 95120</Address> </Person></ProblemDocument><ProblemValue><Confidential> <Age unit="years">7</Age> <Birthdate>20000-12-09</Birthdate> <SS>555-66-7777</SS> </Confidential></ProblemValue></Result>
- Fix the problem value in the file person4.xml containing the document. The
value
20000-12-09
has an extra zero for the year so it is outside of the range that Db2 allows for a year (0001 to 9999).. The value is changed to2000-12-09
.<?xml version="1.0"?> <Person gender="Female"> <Name> <Last>Wong</Last> <First>Teresa</First> </Name> <Confidential> <Age unit="years">7</Age> <Birthdate>2000-12-09</Birthdate> <SS>555-66-7777</SS> </Confidential> <Address>5960 Tulip Court, San Jose, CA 95120</Address> </Person>
- Go back to step a to import the rejected rows to T1 again.
- Import the rejected rows to table T1. The first row is now imported
successfully because two rows were read from the import file. A new
error occurs on the second row.
- (Second repetition of Step 5)
- Import the rejected rows to T1.
IMPORT FROM reject.del OF DEL INSERT INTO T1 SQL3109N The utility is beginning to load data from file "reject.del". SQL3110N The utility has completed processing. "2" rows were read from the input file. SQL3221W ...Begin COMMIT WORK. Input Record Count = "2". SQL3222W ...COMMIT of any database changes was successful. SQL3149N "2" rows were processed from the input file. "2" rows were successfully inserted into the table. "0" rows were rejected. Number of rows read = 2 Number of rows skipped = 0 Number of rows inserted = 2 Number of rows updated = 0 Number of rows rejected = 0 Number of rows committed = 2
- Import the rejected rows to T1.