One of the most significant new features in IBM® DB2® 9.5 for Linux, Unix and Windows is the XML update functionality. The previous version, DB2 9, introduced pureXML™ support for storing and indexing of XML data and querying it with the SQL/XML and XQuery languages. Modifications to an XML document were performed outside of the database server followed by an update of the full document in DB2. Now DB2 9.5 introduces the XQuery Update Facility, a standardized extension to XQuery that allows you to modify, insert, or delete individual elements and attributes within an XML document. This makes updating XML data easier and provides higher performance. This article describes the new XML update functionality, presents examples of typical XML update operations, and discusses how to avoid common pitfalls.

Matthias Nicola (mnicola@us.ibm.com), Senior Software Engineer, IBM Silicon Valley Laboratory

Author photo: Matthias NicolaDr. Nicola is the technical lead for XML database performance at IBM's Silicon Valley Lab. His work focuses on all aspects of XML performance in DB2, including XQuery, SQL/XML, and all native XML features in DB2. Dr. Nicola works closely with the DB2 XML development teams as well as with customers and business partners who are using XML, assisting them in the design, implementation, and optimization of XML solutions. Prior to joining IBM, Dr. Nicola worked on data warehousing performance for Informix Software. He also worked for four years in research and industry projects on distributed and replicated databases. He received his doctorate in computer science in 1999 from the Technical University of Aachen, Germany.


developerWorks Contributing author
        level

Uttam Jain (uttam@us.ibm.com), Advisory Software Engineer, IBM

Photo: Uttam JainUttam Jain is an Advisory Software Engineer and the technical lead for DB2 pureXML storage. He joined the IBM Silicon Valley Laboratory in 2001 and has worked on the implementation of object-relational constructs in DB2 prior to joining the DB2 pureXML team. He received a Masters degree in Electrical and Computer Engineering from the University of Florida in 2000. His current interests include XQuery, SQL/XML, and distributed XML databases.



11 October 2007

Also available in Japanese

Introduction and background

In many respects, DB2 9 pureXML was ahead of the industry in terms of database management support for XML. Indeed, when IBM was developing pureXML technology, the World Wide Web Consortium (W3C), the body that defines and maintains the XML and XQuery standards, had only just begun defining a standard update mechanism for XML. Due to this lack of an XML update standard, DB2 9 supports full document replacement through SQL UPDATE statements as shown in the following example:

create table xmlcustomer (cid bigint, info XML);

update xmlcustomer 
set info =  '<customerinfo><name>John Doe</name>...</customerinfo>' 
where cid = 1783;

The set clause of the UPDATE statement replaces the existing document in a given row with a new document. Here the document is provided as a literal value, but it could also be passed to the database through a parameter marker or host variable. With DB2 9, applications that need to modify individual elements or attributes in an XML document usually do so in the application itself. This typically leads to the following sequence of processing, as illustrated in Figure 1.

  1. The application submits an SQL or XQuery statement to read an XML document.
  2. The document is retrieved from the database and serialized to text format.
  3. The document is transmitted to the client.
  4. The application parses the XML document, typically using the document object model (DOM).
  5. The application modifies the document using the DOM API.
  6. The document is re-serialized in the client application.
  7. The application submits an SQL UPDATE statement and transmits the updated document to the database server.
  8. The database server parses the updated XML document and replaces the old document.
Figure 1. Updating an XML document in DB2 9
Updating an XML document in DB2 9

This process of retrieving, parsing, and changing an XML document can also be encapsulated in a stored procedure that runs on the database server. The processing steps are the same as shown in Figure 1 except that the document is not shipped to the client and back.

The new update functionality in DB2 9.5 reduces this process to a single step. The application simply sends an SQL UPDATE statement with an embedded XQuery transform expression to the DB2 server (Figure 2). This expression is part of the emerging XQuery standard for updating XML data.

Figure 2. Updating an XML document in DB2 9.5
Updating an XML document in DB2 9.5

When DB2 9.5 executes the UPDATE statement, it locates the relevant document(s) and modifies the specified elements or attributes. This happens within the DB2 storage layer, that is the document stays in DB2's internal hierarchical XML format the entire time, without any parsing or serialization. Concurrency control and logging happens on the level of full documents. Overall, this new update process can often be 2x to 4x faster than the process in Figure 1.

The new update functionality allows you to perform node-level changes in XML documents. In most cases, the nodes that you modify are elements or attributes. You can:

  • Replace the value of a node
  • Replace a node with a new one
  • Insert a new node (even at a specific location, such as before or after a given node)
  • Delete a node
  • Rename a node
  • Modify multiple nodes in a document in a single UPDATE statement
  • Update multiple documents in a single UPDATE statement

The remainder of this article describes how to perform such XML updates with XQuery transform expressions. You'll see how to embed a transform in UPDATE statements to permanently change data on disk, and in queries, to modify XML data "on the fly" while reading it out without permanently changing it. The latter can be useful if applications need to receive an XML format that's different from the one in the database.


Sample data

To illustrate the new DB2 9.5 XML update functionality, the sample table and data defined by the following CREATE TABLE and INSERT statements are used. The table xmlcustomer has two columns of types bigint and XML, respectively, and contains two rows of data. The rows can be identified by cid values 1000 and 1001, respectively. The sample data and all updates shown in this article are available for download as a text file in the Download section.

create table xmlcustomer (cid bigint, info XML);
insert into xmlcustomer values (1000, '
<customerinfo>
        <name>John Smith</name>
         <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>');
insert into xmlcustomer values (1001, '
<customerinfo>
        <name>David Patterson</name>
        <addr country="Canada">
                <street>Fifth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">222-222-2222</phone>
        <phone type="cell">333-333-3333</phone>
</customerinfo>');

The hierarchical structure of the sample documents is shown in Figure 3. It is useful to have this tree structure in mind when designing XML queries or updates, since ultimately they always navigate the tree.

Figure 3. Tree structure of the sample documents
Tree structure of the sample documents

Update XML data -- A simple example

Start with Update 1 as a simple example of an XML update. It changes the phone number of a certain customer. At a high level, you see the familiar "update… set…. where…" syntax of an SQL UPDATE statement. You also see that the "set" clause assigns a new value to the XML column "info", and that this new value is the result of an XMLQUERY function.

You probably remember the XMLQUERY function from querying XML data with SQL/XML. XMLQUERY is often used in the SELECT clause of an SQL query to extract or construct XML values based on the XML document in a given row of the result set. To achieve this, the XMLQUERY function allows you to embed any XQuery expression in an SQL statement. In this example, the XMLQUERY function is used to modify the given XML document. The XQuery expression in Update 1 is a transform expression. The transform expression is at the core of the XQuery Update Facility, an emerging standard to extend XQuery to modify XML documents.

A transform expression starts with the optional "transform" keyword followed by a COPY, MODIFY, and a RETURN clause. The intuitive idea of the transform expression is that the COPY clause assigns the input document from the XML column (info) to a variable (in our case: $new), then the MODIFY clause applies one or more update operations to that variable, and finally the RETURN clause produces the result of the transform expression.

Update 1
update xmlcustomer
set info = xmlquery( 'transform 
                   copy $new := $INFO
                   modify do replace value of $new/customerinfo/phone with "905-477-9011"
                   return  $new') 
where cid = 1000;

The XML document before update:

<customerinfo>
        <name>John Smith</name>
        <addr country=“Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>

The XML document after update:

<customerinfo>
        <name>John Smith</name>
        <addr country=“Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>90111</zipcode>
        </addr>
        <phone type="work">905-477-9011</phone>
</customerinfo>

The MODIFY clause is of particular interest since it defines the actual update operations that take place. Look at various MODIFY examples in this article. Here, the value of the "phone" element is simply replaced with a new value (905-477-9011). In Update 1, and many other typical use cases, the RETURN clause simply returns the variable that holds the modified document. However, the return could be a much more complex expression, including element construction and FLWOR expression. Similarly, the right-hand side of the COPY clause is often simply the variable that holds the input document, in our case $INFO, but it could be more complex than that. The right-hand side of the COPY clause must evaluate to a single node, meaning it cannot be an empty sequence or a sequence of more than one item. This single node can have descendants, which means it can be (and often is) the root of a full XML document.

Since the "transform" keyword is optional, it is omitted from here on.


Update a document in an SQL query

What sounds like a contradiction at first is actually real and very easy. A transform expression can be used in a query to modify an XML document as it is read from the database and passed to the application. Update 2 is an SQL SELECT statement with the same XMLQUERY function, transform expression, and WHERE clause, as in Update 1. While Update 1 makes a permanent and logged change to a document on disk, Update 2 leaves the original document in the table untouched and only modifies it "on the fly" as part of the query processing. Here the update is used to blacken out all but the area code of the phone number.

Update 2
select xmlquery('copy $new := $INFO
                modify do replace value of $new/customerinfo/phone with "905-xxx-xxxx"
                return $new ') 
from xmlcustomer
where cid = 1000;

Most of the examples in this article show the transform expression in an SQL UPDATE statement. However, they could just as well appear in queries as shown in Update 2. Later examples in this article show that TRANSFROM expressions in queries can also be useful to remove certain elements or attributes from the document, or to expose elements under a different name. Also, to explore and test the new update capabilities initially, you may find it convenient to embed the XMLQUERY function with the transform in a SELECT statement rather than an UPDATE. This way you can immediately see and verify the effect of the update and won't ruin the data in your table if an update does something unexpected.


Updates with parameter markers

In Update 1 and Update 2 above, the new value for the "phone" element is hard-coded in the statement text. Often you will want to prepare and compile the UPDATE statement only once, and then pass in a new value every time you execute the update. This avoids recompiling the statement in the database server for each execution. Update 3 allows you to do just that by using an SQL-style parameter marker ("?") in the XMLQUERY function. The parameter marker is passed as a variable ($z) into the XQuery expression. Update 3 also uses a second parameter marker in the WHERE clause to select the row that is updated.

Update 3
update xmlcustomer
set info = xmlquery('copy $new := $INFO
              modify do replace value of $new/customerinfo/phone with $z
              return  $new '  passing  cast(? as varchar(15)) as "z")
where cid = ?

The new value that you use to replace an existing value can also be computed. For example, to add an extension to an existing phone number, your modify clause could be: modify do replace value of $new/customerinfo/phone with concat($new/customerinfo/phone, "x301"). Or, if the customer had a balance element, you can increase his balance like this: modify do replace value of $new/customerinfo/balance with $new/customerinfo/balance + 1000.


Updates with schema validation

When you modify an XML document you may wish to verify whether it is still compliant with a given XML schema. Update 4 uses the XMLVALIDATE function to check whether the updated document is valid according to the XML schema with the identifier "cust.custschema." The UPDATE statement fails if the modified document is not valid.

Update 4
update xmlcustomer
set info = xmlvalidate(xmlquery('
              copy $new := $INFO
              modify do replace value of $new/customerinfo/phone 
              with "905-477-9011"
              return  $new ')
              according to xmlschema id "cust.custschema")
where cid = 1000;

XML updates by example

The examples you have seen so far use the "replace value of" operation to change the value of an element node. In the following examples, you will see how to perform other operations, such as deleting, inserting, and renaming of element and attribute nodes in XML documents.

Delete nodes

Update 5 shows how to delete the element "phone" from an XML document using the "delete" operation in the MODIFY clause. You simply specify the path to the node(s) that you want to remove from the document. Note that everything except the MODIFY clause is identical to previous updates that were discussed. Indeed, in the subsequent examples you'll see that only the MODIFY clause is changed to express different kinds of updates.

Update 5
update xmlcustomer
set info = xmlquery('copy $new := $INFO
              modify do delete $new/customerinfo/phone
              return  $new')
where cid = 1000;

The XML document before update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>

The XML document after update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
</customerinfo>

All of the node-level updates can be applied to attributes as well. Update 6 is an example of deleting an attribute, such as the "type" attribute in the "phone" element.

Update 6
update xmlcustomer
set info = xmlquery('copy $new := $INFO
              modify do delete $new/customerinfo/phone/@type
              return $new' )
where cid = 1000;

The XML document before update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>

The XML document after update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone>963-289-4136</phone>
</customerinfo>

Rename nodes

You can use the "rename" operation to give an element, attribute, or a processing-instruction node a different name. Other nodes, such as text nodes or XML comments, do not have a node name and therefore cannot be renamed. A rename example is shown in Update 7, which renames the "addr" element to "address." This update does not use an SQL WHERE clause to select a specific row. Therefore, all rows (documents) in the table are modified. Obviously, this can take longer than updating just a single document.

Update 7
update xmlcustomer
set info = xmlquery('copy $new := $INFO
              modify do rename $new/customerinfo/addr as "address"
              return  $new' ) ;

The XML documents before update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>

The XML documents after update:

<customerinfo>
        <name>John Smith</name>
        <address country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </address>
        <phone type="work">963-289-4136</phone>
</customerinfo>
<customerinfo>
        <name>David Patterson</name>
        <addr country="Canada">
                <street>Fifth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">222-222-2222</phone>
        <phone type="cell">333-333-3333</phone>
</customerinfo>
<customerinfo>
        <name>David Patterson</name>
        <address country="Canada">
                <street>Fifth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </address>
        <phone type="work">222-222-2222</phone>
        <phone type="cell">333-333-3333</phone>
</customerinfo>

Replace nodes

Update 8 illustrates how you can use the "replace" operation to replace the existing phone element with a new element. The "replace" operation works differently than the "replace value of" operation. The former replaces the whole node (the old node is deleted), whereas the latter replaces only the content of a node. You see in Update 8 that a whole new "phone" element is provided in the MODIFY clause of the update.

Update 8
update xmlcustomer
set info = xmlquery('copy $new := $INFO
                     modify do replace $new/customerinfo/phone with
                        <phone type="home">416-123-4567</phone> 
                     return  $new' )
where cid = 1000;

The XML document before update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>

The XML document after update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="home">416-123-4567</phone>
</customerinfo>

Replace the value of multiple nodes

You can certainly make multiple modifications to the same document in a single update statement. As you see in Update 9, the MODIFY clause contains multiple update operations. They are enclosed in parentheses and separated by a comma. This allows you to include two or more update operations of the same or different kinds. In Update 9, both update operations are of the type "replace value of." They replace the value of the phone element and the value of the @type attribute, respectively. Interestingly, the net effect of these two operations is the same as in Update 8.

Update 9
update xmlcustomer
set info = xmlquery(' 
              copy $new := $INFO
              modify (do replace value of $new/customerinfo/phone with "416-123-4567",
                      do replace value of $new/customerinfo/phone/@type with "home")
              return  $new' ) 
where cid = 1000;

The XML document before update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>

The XML document after update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="home">416-123-4567</phone>
</customerinfo>

Insert nodes

The "insert" operation allows you to add new nodes, such as elements or attributes, to an XML document. There are five ways in which you can specify the position of the new node in the document. Now assume you want to insert a new element into the document. Then you have the following insert options:

  • insert element1 into element2
    This means that element1 will be a child of element2. The position of element1 among the existing children of element2 is nondeterministic.
  • insert element1 as last into element2
    Means that element1 will be the last child of element2.
  • insert element1 as first into element2
    Means that element1 will be the first child of element2.
  • insert element1 after element2
    Means that element1 will be a sibling of element2 and will appear immediately after element2.
  • insert element1 before element2
    Means that element1 will be a sibling of element2 and will appear immediately before element2.

If you insert a new attribute instead of element1, then the operations "intoelement2 ", "as last into element2 " and "as first into element2 " all have the same effect, meaning the attribute will be added to element2 . Note that the XML data model does not define a positional order among the attributes of an element. Hence, last, first, before, and after do not affect the ordering of attributes. If you insert an attribute before or after element2 , the attribute will be added to the parent of element2.

In the options listed above, element1 could also be an expression that evaluates to a sequence of nodes. In that case, all of these nodes are inserted at the specified position. Also, element1 could be the root of an XML fragment with nested elements in case you wish to insert that entire fragment.

Now, look at some examples. Update 10 inserts a second phone number into the sample document. The "customerinfo" element is explicitly specified as the parent of the new "phone" element by saying "into $new/customerinfo".

Update 10
update xmlcustomer
set info = xmlquery('copy $new := $INFO
                     modify do insert <phone type="cell">777-555-3333</phone> 
                       into $new/customerinfo
                     return  $new' )
where cid = 1000;

The XML document before update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>

The XML document after update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">416-123-4567</phone>
        <phone type="cell">777-555-3333</phone>
</customerinfo>

In the result of Update 10, you see that the new phone "element" turned out to be the last child of "customerinfo." However, this position is guaranteed only if you explicitly specify "as last into", which is shown in Update 11.

Update 11
update xmlcustomer
set info = xmlquery('copy $new := $INFO
                     modify do insert<phone type="cell">777-555-3333</phone>
                             as last into $new/customerinfo
                     return  $new' )
where cid = 1000;

If you want the new cell phone to be the first of the phone elements in the document, you can explicitly request it to be inserted right before the first occurrence of any existing phone elements. This is done in Update 12.

Update 12
update xmlcustomer
set info = xmlquery( 'copy $new := $INFO
                      modify do insert <phone type="cell">777-555-3333</phone> 
                        before $new/customerinfo/phone[1]
                      return  $new' ) 
where cid = 1000;

The XML document before update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>

The XML document after update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="cell">777-555-3333</phone>
        <phone type="work">416-123-4567</phone>
</customerinfo>

Similarly, you could insert the new phone element right after the customer name, as shown in Update 13.

Update 13
update xmlcustomer
set info = xmlquery( 'copy $new := $INFO
                      modify do insert <phone type="cell">777-555-3333</phone> 
                        after $new/customerinfo/name
                      return  $new' ) 
where cid = 1000;

The XML document before update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>

The XML document after update:

<customerinfo>
        <name>John Smith</name>
        <phone type="cell">777-555-3333</phone>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">416-123-4567</phone>
</customerinfo>

Update 14 uses the relational column value "CID" in the transform expression to produce a new element "customerid" as the first element under the root element "customerinfo."

Update 14
update xmlcustomer
set info = xmlquery( 'copy $new := $INFO
                      modify do insert <customerid>{ $CID } </customerid> 
                        as first into $new/customerinfo
                      return  $new' ) 
where cid = 1000;

The XML document before update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>

The XML document after update:

<customerinfo>
        <customerid>1000</customerid>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">416-123-4567</phone>
</customerinfo>

If you prefer the "customerid" to be an attribute of the "customerinfo" element, then the insert expression needs to contain a so-called computed attribute constructor. It consists of the keyword attribute followed by the desired attribute name, plus an expression for the attribute value. You see this in Update 15.

Update 15
update xmlcustomer
set info = xmlquery(' 
                 copy $new := $INFO
                 modify do insert
                            attribute customerid { $CID }
                   into $new/customerinfo
                 return  $new' )
where cid = 1000;

The XML document before update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>

The XML document after update:

<customerinfo customerid="1000">
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">416-123-4567</phone>
</customerinfo>

Common pitfalls and their solutions

The update examples discussed above are all straight forward and intuitive. As you start to craft more complex updates, you may make some mistakes and encounter situations where DB2 rejects your update expressions. The following sections discuss common pitfalls and offer simple solutions.

Insert elements into the document node

The XML data model requires a parsed, well-formed XML document to have one document node that is the parent of the root element (customerinfo) of the document. This document node is not visible in the textual (serialized) representation of an XML document.

An easy mistake to make is to insert a new element into the document node instead of the root element. Update 16 is almost identical to Update 10 except that the insert is specified as "into $new" instead of "into $new/customerinfo." This means that Update 16 inserts the new phone element as a sibling and not as a child of the "customerinfo" element. The result is an XML sequence of two elements (customerinfo and phone) which is not a well-formed document. Since XML columns in DB2 can only contain well-formed XML documents, the update fails.

Update 16
update xmlcustomer 
set info = xmlquery( ' 
                 copy $new := $INFO
                 modify do insert <phone type="cell">777-555-3333</phone> into $new
                 return  $new' )
where cid = 1000;
Error Message:
SQL20345N The XML value is not a well-formed document with a single root element. 
SQLSTATE=2200L

Original XML document:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>

Rejected XML value:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>
<phone type="cell">777-555-3333</phone>

Modify repeating nodes

If a single XPath expression identifies multiple nodes in a single document, they are called repeating nodes. For example, the second document in this sample table contains two "phone" elements. Therefore, the expression /customerinfo/phone represents a sequence of two elements. The "delete" operation is the only operation that can act on a sequence of nodes — it deletes all of them. This is illustrated in Update 17.

Update 17
update xmlcustomer
set info = xmlquery( 'copy $new := $INFO
                      modify do delete $new/customerinfo/phone
                      return  $new' ) 
where cid = 1001;

The XML document before update:

<customerinfo>
        <name>David Patterson</name>
        <addr country="Canada">
                <street>Fifth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">222-222-2222</phone>
        <phone type="cell">333-333-3333</phone>
</customerinfo>

The XML document after update:

<customerinfo>
        <name>David Patterson</name>
        <addr country="Canada">
                <street>Fifth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
</customerinfo>

To delete only one of the phone elements, you can use a predicate in the MODIFY clause, such as modify do delete $new/customerinfo/phone[@type = "work"].

Update 18, attempts to replace the value of the phone elements instead of deleting them. But, at run time DB2 detects that there is more than one phone element and returns error SQL16085N. If you type ? SQL16085N at the DB2 command prompt you see a list of possible reasons for this error. The reason code "XUTY0008" shown in the error message quickly points you to the fact that "the target node of a replace expression is not a single node".

Update 18
update xmlcustomer
set info = xmlquery('copy $new := $INFO
                   modify do replace value of $new/customerinfo/phone with "444-444-4444"
                   return  $new')
where cid = 1001;
Error message:
SQL16085N  The target node of an XQuery "replace value of" expression is not valid. 
Error QName=err:XUTY0008.  SQLSTATE=10703

This error prevents you from updating all the phone elements with the same number, which probably wouldn't make sense. If you do need to update multiple nodes in the same manner, you can use an XQuery "for" expression to iterate over these nodes, as shown in Update 19.

Update 19
update xmlcustomer
set info = xmlquery( 'copy $new := $INFO
                      modify for $j in $new/customerinfo/phone return
                               do replace value of $j with "444-444-4444"
                      return  $new' ) 
where cid = 1001;

The XML document before update:

<customerinfo>
        <name>David Patterson</name>
        <addr country="Canada">
                <street>Fifth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">222-222-2222</phone>
        <phone type="cell">333-333-3333</phone>
</customerinfo>

The XML document after update:

<customerinfo>
        <name>David Patterson</name>
        <addr country="Canada">
                <street>Fifth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">444-444-4444</telephone>
        <phone type="cell">444-444-4444</telephone>
</customerinfo>

However, in many cases you will want to modify just one of multiple nodes. Update 20 uses a predicate in the MODIFY clause to only change the cell phone number.

Update 20
update xmlcustomer
set info = xmlquery( 'copy $new := $INFO
                      modify do replace value of $new/customerinfo/phone[@type = "cell"] 
                        with "444-444-4444"
                      return  $new' ) 
where cid = 1001;

The XML document before update:

<customerinfo>
        <name>David Patterson</name>
        <addr country="Canada">
                <street>Fifth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">222-222-2222</phone>
        <phone type="cell">333-333-3333</phone>
</customerinfo>

The XML document after update:

<customerinfo>
        <name>David Patterson</name>
        <addr country="Canada">
                <street>Fifth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">222-222-2222</phone>
        <phone type="cell">444-444-4444</phone>
</customerinfo>

Try to modify non-existing nodes

In the previous section, you learned that only the delete operation can be applied to a sequence of nodes, such as repeating elements. Other operations, such as rename and replace, need a "for" expression to iterate over the affected nodes, as shown in Update 19. The same is true if the target of an update operation is empty. For example, Update 21 tries to replace the value of a phone element where @type = "home". But, the sample document does not contain any home phone numbers, and the expression $new/customerinfo/phone[@type = "home"] produces an empty result. Hence, Update 21 fails. A "delete" operation that tries to remove the non-existent home phone would succeed but have no effect on the document.

Update 21
update xmlcustomer
set info = xmlquery('copy $new := $INFO
                     modify do replace value of $new/customerinfo/phone[@type = "home"]
                       with "777-777-7777"
                     return  $new') 
where cid = 1001;
SQL16085N  The target node of an XQuery "replace value of" expression is not 
valid. Error QName=err:XUTY0008.  SQLSTATE=10703

Again, reason code "XUTY0008" tells you that "the target node of a replace expression is not a single node." For Update 18, "not a single node" means that there was more than one node. Here, for Update 21, "not a single node" means that there is less than one node, meaning none. Again, a "for" iteration can solve this. You see this in Update 22, which succeeds but has no effect on the document.

Update 22
update xmlcustomer
set info = xmlquery( ' 
                    copy $new := $INFO
                    modify  for $j in $new/customerinfo/phone[@type = "home"] return
                               do replace value of $j with "777-777-7777"
                    return  $new' ) 
where cid = 1001;

The XML document before update:

<customerinfo>
        <name>David Patterson</name>
        <addr country="Canada">
                <street>Fifth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">222-222-2222</phone>
        <phone type="cell">333-333-3333</phone>
</customerinfo>

The XML document after update (same as before):

<customerinfo>
        <name>David Patterson</name>
        <addr country="Canada">
                <street>Fifth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">222-222-2222</phone>
        <phone type="cell">333-333-3333</phone>
</customerinfo>

Alternatively, you could use an XQuery if-then-else expression to conditionally update the home phone number or insert a new one if it doesn't exist yet. This is illustrated in Update 23.

Update 23
update xmlcustomer
set info = xmlquery(' 
                    copy $new := $INFO
                    modify  if ($new/customerinfo/phone[@type="home"]) then
                              do replace value of $new/customerinfo/phone[@type="home"] 
                                with "777-777-7777"
                            else do insert <phone type="home">777-777-7777</phone> 
                              as last into $new/customerinfo
                    return  $new' ) 
where cid = 1001;

The XML document before update:

<customerinfo>
        <name>David Patterson</name>
        <addr country="Canada">
                <street>Fifth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">222-222-2222</phone>
        <phone type="cell">333-333-3333</phone>
</customerinfo>

The XML document after update:

<customerinfo>
    <name>David Patterson</name>
    <addr country="Canada">
          <street>Fifth</street>
          <city>Calgary</city>
          <state>Alberta</state>
          <zipcode>M1T 2A9</zipcode>
   </addr>
   <phone type="work">222-222-2222</phone>
   <phone type="cell">333-333-3333</phone>
   <phone type="home">777-777-7777</phone>
</customerinfo>

Modify the same node multiple times

You have seen in Update 9 that you can include multiple update operations for the same document in the MODIFY clause of a single update statement. However, you cannot rename, replace, or change the value of the same node more than once. Typically this doesn't make sense anyway. An example is shown in Update 24. This update fails at runtime when DB2 detects that both "rename" operations are for the same phone element.

Update 24
update xmlcustomer
set info = xmlquery('copy $new := $INFO
                   modify(
                     do rename $new/customerinfo/phone[@type="work"] as "telephone",
                     do rename $new/customerinfo/phone[. ="222-222-2222"] as "telephone" )
                   return  $new')
where cid = 1001;
Error Message:
SQL16083N  Incompatible "rename" expressions exist in the modify clause of a 
Transform expression. Error QName=err:XUDY0015.  SQLSTATE=10704

You need to make sure that the predicates in the update operations select distinct nodes. This is illustrated in Update 25 which has two "replace value of" expressions on $new/customerinfo/phone. One of them applies to work phones, the other one to home phones, and this avoids error SQL16083N. "For" iterations are used similar to Update 22 so that the statement won't fail if a work phone or a home phone does not exist. This pattern is generally useful to deal with optional elements and schema diversity.

Update 25
update xmlcustomer
set info = xmlquery('
                 copy $new := $INFO
                 modify (for $x in $new/customerinfo/phone[@type="work"]
                           return do replace value of $x with "444-444-4444" ,
                         for $y in $new/customerinfo/phone[@type="home"]
                           return do replace value of $y with "555-555-5555")
                 return  $new') 
where cid = 1001;

The XML document before update:

<customerinfo>
        <name>David Patterson</name>
        <addr country="Canada">
                <street>Fifth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">222-222-2222</phone>
        <phone type="cell">333-333-3333</phone>
</customerinfo>

The XML document after update:

<customerinfo>
        <name>David Patterson</name>
        <addr country="Canada">
                <street>Fifth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">444-444-4444</phone>
        <phone type="cell">333-333-3333</phone>
</customerinfo>

Select nodes in the copy clause

So far, all the update examples in this article have used the same "copy" clause, that is copy $new := $INFO. You may wonder why you need to carry this piece of syntax around if it never changes. There can be situations where it's really handy to have the flexibility of the copy clause. Assume an application needs to retrieve information for a certain customer, but for privacy reasons it needs to exclude the customer's phone number. Update 26 and Update 27 do exactly that. While Update 26 uses an SQL predicate to select the document, Update 27 uses an XQuery predicate and avoids the use of SQL completely. These queries transform the document "on the fly" during query processing and do not change the document in the table.

Update 26
                            xquery
 copy $new := db2-fn:sqlquery("select info from xmlcustomer where cid = 1000")
modify do delete $new/customerinfo/phone
return $new;

Query Result:

<customerinfo>
  <name>John Smith</name>
  <addr country="Canada">
    <street>Fourth</street>
    <city>Calgary</city>
    <state>Alberta</state>
    <zipcode>M1T 2A9</zipcode>
  </addr>
</customerinfo>
Update 27
                            xquery
copy $new := db2-fn:xmlcolumn("XMLCUSTOMER.INFO")[/customerinfo/name="John Smith"]
modify do delete $new/customerinfo/phone
return $new;

Query Result:

<customerinfo>
  <name>John Smith</name>
  <addr country="Canada">
    <street>Fourth</street>
    <city>Calgary</city>
    <state>Alberta</state>
    <zipcode>M1T 2A9</zipcode>
  </addr>
</customerinfo>

The right-hand side of a COPY clause needs to produce exactly one item, such as one document. If there are multiple or zero customers by the name of "John Smith", then DB2 will raise the following error:

SQL16084N An assigned value in the copy clause of a transform expression is not a
sequence with exactly one item that is a node. Error QName=err:XUTY0013. SQLSTATE=10705

This error is avoided in Update 28 by iterating over the customers:

Update 28
                            xquery
for $i in db2-fn:xmlcolumn("XMLCUSTOMER.INFO")[/customerinfo/name="John Smith"]
return
        copy $new := $i
        modify do delete $new/customerinfo/phone
        return  $new;

Query Result:

<customerinfo>
  <name>John Smith</name>
  <addr country="Canada">
    <street>Fourth</street>
    <city>Calgary</city>
    <state>Alberta</state>
    <zipcode>M1T 2A9</zipcode>
  </addr>
</customerinfo>

As another example, Update 29 uses the COPY clause to only get the customer's address, and the RETURN clause to embed the modified address in a new "sendto" element.

Update 29
                            xquery 
for $i in db2-fn:xmlcolumn("XMLCUSTOMER.INFO")/customerinfo[name="John Smith"]
return
  copy $new := $i/addr
  modify do rename $new/zipcode as "postalcode"
  return <sendto>{$new}</sendto>;

The original XML document:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>

The query result:

<sendto>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <postalcode>M1T 2A9</postalcode>
        </addr>
</sendto>

When you use transform expressions in SQL update statements, you rarely have a filtering predicate in the COPY clause but rather in the SQL WHERE clause.

Combine multiple update operations

The emerging XQuery update standard specifies that all update operations in the MODIFY clause are applied independently from each other to the original document. They don't see each others' effects. This is called "snapshot semantics", meaning each update operation is logically applied to a separate snapshot of the original document. These semantics are illustrated in Update 30, which contains two update operations. The first one inserts an additional "phone" element. The second one renames all "phone" elements to "telephone." However, the rename only applies to phone elements in the original document, and not to the new phone element that is added in the same update statement. The order of the update operations in the MODIFY clause is irrelevant.

Update 30
update xmlcustomer
set info = xmlquery( 'copy $new := $INFO
                      modify (  do insert <phone type="cell">777-555-3333</phone> 
                                  after $new/customerinfo/addr ,
                                for $j in $new/customerinfo/phone
                                  return do rename $j as "telephone" )
                      return  $new' )
where cid = 1000;

The XML document before update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="work">963-289-4136</phone>
</customerinfo>

The XML document after update:

<customerinfo>
        <name>John Smith</name>
        <addr country="Canada">
                <street>Fourth</street>
                <city>Calgary</city>
                <state>Alberta</state>
                <zipcode>M1T 2A9</zipcode>
        </addr>
        <phone type="cell">777-555-3333</phone> 
        <telephone type="work">963-289-4136</telephone>
</customerinfo>

As another example, consider a single update statement that deletes the "addr" element and also inserts a new element "POBox" into /customerinfo/addr. The new element "POBox" does not appear in the updated document because its parent element "addr" is deleted.

Updates are rejected if the resulting XML structure violates any fundamental XML rules. For example, an XML element cannot have two attributes with the same name. If you try to add an attribute to an element where another attribute with the same name already exists, DB2 rejects that update.


Summary

DB2 9.5 introduces the XQuery Update Facility that allows you to rename, insert, delete, replace, or change the value of individual elements and attributes within an XML document. This makes updating XML data easy and efficient. You can use this update functionality to change documents in an XML column without XML parsing or reading the document into your application. The examples in this article provide an overview of the update capabilities, and you can use them as samples to write your own XML updates.


Acknowledgements

We thank Don Chamberlin, Cindy Saracco, Henrik Loeser, Susanne Englert, and Mel Kiyama for their reviews and helpful comments that helped us improve this article.


Download

DescriptionNameSize
Script for queries in this articlesample.sql15KB

Resources

Learn

Get products and technologies

  • Now you can use DB2 for free. Download DB2 Express-C, a no-charge version of DB2 Express Edition for the community that offers the same core data features as DB2 Express Edtion and provides a solid base to build and deploy applications.
  • Build your next development project with IBM trial software, available for download directly from developerWorks.
  • Download a free trial version of DB2 Enterprise 9.
  • Download IBM product evaluation versions and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management, XML
ArticleID=260831
ArticleTitle=Update XML in DB2 9.5
publish-date=10112007