Use XML in DB2 SQL stored procedures

Access XML data with procedural logic in DB2 9 for Linux, UNIX, and Windows

SELECT, INSERT, UPDATE and DELETE statements can all use the new IBM® DB2® 9 SQL/XML functions, which can be run in several different environments. The DB2 command line processor, third-party query tools, and SQL stored procedures are all compatible environments. Stored procedures add the ability to incorporate procedural logic constructs such as variables, IF/THEN/ELSE logic, looping, cursors and error trapping. Explore the interaction of XML data with procedural logic in this article. The short code examples and explanations provided will save you precious time on the road to becoming proficient with XML in SQL stored procedures.

Rex Oliva (roliva@us.ibm.com), DB2 Consultant , IBM Toronto Lab

Rex Oliva is an IBM DB2 Consultant with North American Lab Services in the Center of Excellence. He has worked with relational databases since the mid-90's, and with DB2 since 2000. He has extensive experience working with data warehouses, performing Extraction, Transformation, Loading (ETL), database design work, SQL tuning and rewrites, and assisting application developers on DB2.



18 January 2007

Also available in Chinese

Introduction

IBM® DB2® 9 for Linux, UNIX®, and Windows® adds major new XML functionality over and above prior DB2 releases. Before DB2 9, you could either store XML documents as LOBs or shred incoming XML data into relational tables, a process which required an add-on tool called XML Extender. However, DB2 9 enables users to store and query XML data in its native hierarchical format -- a technique referred to as pureXML™ support. Now, you can can build indexes over XML data, and perform shredding using a call to a stored procedure.

This article discusses the use of XML in SQL stored procedures. I provide numerous code examples to help demonstrate specific technical points. The examples provided are intended to aid your understanding, and therefore are as simplistic as possible. To achieve this goal, note that the examples are somewhat contrived.

Note: This article assumes that you're an application developer or DBA, who is already familiar with writing SQL stored procedures.

A brief introduction to XML

XML stands for "extensible markup language", which is a hierarchical method of representing information in a syntax very similar to HTML. An XML document is a tree structure consisting of "elements", or branches. Each element has a name, can contain a textual value, can have a list of attributes (in the format "name=value"), and can have children. Duplicate element names are allowed, although attribute names must be unique within an element.

XQuery is a query language for XML that is based on W3C standards. It is often used outside of any relational databases to query XML data, which may be stored in text files. XQuery allows you to access XML data (including joins), loops, declared variables, IF/THEN/ELSE statements, and other constructs.

There are four distinct ways to work with XML data in DB2 9:

  • Plain SQL
    • Plain SQL consists of regular SQL statements and functions that have no references to XPath or XQuery. Functionality is limited to working with XML documents as complete entities. They can be inserted into and retrieved from the database, and converted in or out of text. But there's no ability to parse out a portion of an XML document, nor apply any predicates.
  • SQL/XML with embedded XQuery
    • SQL/XML includes three new SQL functions which work with XML data and take embedded XQuery commands as arguments - XMLQUERY, XMLEXISTS and XMLTABLE.
      • XMLQUERY is used to extract from XML data. Combined with function XMLCAST, relational data can be derived from XML.
      • XMLEXISTS applies predicates to XML data and would often be used in a WHERE clause.
      • XMLTABLE extracts XML data in the form of a relational table.
  • XQuery
    • XQuery is now understood natively within the DB2 database engine. Two functions are provided to access DB2 data. The first, db2-fn:xmlcolumn, returns values of an XML column to the XQuery. All rows are included; there is no predicate.
  • XQuery with embedded SQL
    • The db2-fn:sqlquery function is also provided to access DB2 data and takes a SELECT statement as an argument. The SELECT can do anything you'd like, but must return a single column of type XML in the SELECT list.

Supported XML functionality in stored procedures

For DB2 9, SQL language stored procedures pick up new functionality in order to work with XML data. A new data type, "XML", is available. You can use it not only for columns in a table, but also for parameters and as declared variables. DB2 9 also adds functionality with enhancements to SELECT and UID statements, and several new XML functions.

This enhances SQL stored procedures in these ways:

  • You can use new XML functions (for example, XMLQUERY, XMLEXISTS, XMLTABLE):
    • In cursors
    • Anywhere SELECT syntax is allowed, such as SELECT INTO and VALUES
    • Variable assignments, such as SET statements
  • XML columns can be:
    • Returned in cursor result sets
    • Bound (? notation) into SELECT and UID statements
  • IF/THEN/ELSE statements can test XMLEXISTS predicates
  • Variables of type XML can be loaded from:
    • SELECT INTO statements
    • VALUES INTO statements
    • FETCH INTO statements
    • EXECUTE INTO statements
    • SET statements

Limitations and workarounds

There are some limitations for XML functionality within SQL stored procedures:

  • XQuery queries can only be run as part of dynamic cursors.
  • XML variables are not available after COMMIT or ROLLBACK.

XQuery in dynamic cursors

XQuery commands can appear in two different contexts:

  • Within single quotes as arguments to SQL/XML functions XMLQUERY, XMLEXISTS and XMLTABLE

    Embedded XQuery in SQL/XML functions is allowed in static SQL statements. In the following example, variable adrs is either an XML column or a variable of type XML.

    Listing 1. Embedded XQuery in SQL/XML functions
    XMLQUERY('$d/cust/name' PASSING adrs AS "d")

    Note: This example only contains a very simple XPath expression, but SQL/XML functions (XMLQUERY, XMLEXISTS, XMLTABLE) can contain any valid XQuery, including complex FLWOR expressions.

  • As a query

    XQuery queries can only be defined within dynamic cursors. The following is NOT allowed within a SQL stored procedure:

    Listing 2. Static XQUERY not allowed in a SQL stored procedure
    XQUERY 
    for $dept in db2-fn:xmlcolumn("DEPT.DEPTDOC")/dept
    where $dept/@deptID="PR27"
    return $dept/employee/name

XML variables after COMMIT or ROLLBACK

XML variables, both parameters and declared, are not available after COMMIT or ROLLBACK. Attempting to use XML variables, without first assigning new values to them, generates an error such as this:

Listing 3. Error message received after COMMIT/ROLLBACK
SQL1354N  An SQL variable in routine "XXX" is not available for reference due
 to a recent commit or rollback operation. SQLSTATE=560CE

If you need to explicitly control commit points within the stored procedure, consider these options:

  • Pass the variable's XML documents into a permanent table. Use a "permanent" table because declared global temp tables do not currently allow XML columns.
  • Serialize the XML variable into a VARCHAR, and parse it after the COMMIT or ROLLBACK.

Code examples

Let's explore how to work with pureXML data in DB2 stored procedures. To do so, we'll consider several examples that incorporate common programming patterns.

Generate test data

While testing stored procedures during development from the command line, you may need to create dummy XML documents to send as input parameters. Listing 4 provides a simple technique to do that:

Listing 4. Generate test data (#1)
CALL ex_proc
(
  XMLPARSE(
    DOCUMENT
    '
      <customer>
        <custid>HX25</custid>
      </customer>
    '
  )
)
;

The last example left a lot of blank spaces for clarity, so it's easy to line up the XML tags. But you can also compress the code like this:

Listing 5. Generate test data (2)
CALL ex_proc(XMLPARSE(DOCUMENT '<customer><custid>HX25</custid></customer>'));

Declare parameters

The new XML data type can be passed into a stored procedure as a parameter, in any mode (input, output, or both). When you declare the data type XML, no length is specified. That's a nice improvement over working with XML data as CLOBs or VARCHARs, which require maximum lengths.

Listing 6. Declare parameters
CREATE PROCEDURE process_cust
(
  IN     inpdoc   XML,
  OUT    outdoc   XML,
  INOUT  doc3     XML
)
LANGUAGE SQL
NO EXTERNAL ACTION
BEGIN
  ...
END

Declare variables

Variables can be declared of type XML. Just like with parameters, there is no length specified.

Listing 7. Declare variables
CREATE PROCEDURE ex_proc
LANGUAGE SQL
NO EXTERNAL ACTION
BEGIN
  DECLARE v_xml    XML;
  ...
END

Variable assignments

In addition to fetching data out of cursors, you can load values from a single row query into variables using three different techniques:

  • SET statement
  • SELECT INTO
  • VALUES INTO

All three of these techniques can use the SQL/XML functions, and specifically XMLQUERY. Subsequent examples within this section illustrate each of the techniques mentioned.

All of the following examples use the same in/out parameter list, the same input data, the same SQL/XML functions, and produce the same results. Three parameters are passed to a stored procedure. Parameter 1 is an input XML document. Parameter 2 is an output VARCHAR. Parameter 3 is an output XML document.

These examples:

  • Demonstrate the use of XMLCAST, which is heavily used when you extract XML data into variables of standard relational data types (for example, CHAR, INT)
  • Show that the output from XMLQUERY can be assigned to a variable of type XML
  • In function XMLQUERY, the clause "PASSING inpdoc" refers to a variable. If XMLQUERY is used within a SELECT statement, the PASSING clause can refer to either a table column or a stored procedure variable.
  • In the output from calling the stored procedure, you can see the difference between extracting an XML element, and extracting the value of an XML element.
  • Even though this example only assigns values to output parameters, it could just as easily use declared variables.
Listing 8. Sample data for input parm "inpdoc" for three variable assignment Examples

<customer> <name>Jed Clampett</name> <city>Beverly Hills</city> </customer>
Listing 9. Output for three variable assignment examples

Value of output parameters -------------------------- Parameter Name : P2 Parameter Value : Jed Clampett Parameter Name : P3 Parameter Value : <name>Jed Clampett</name> Return Status = 0

Variable assignment / SET statement

Let's first consider how to return the output of the XMLQuery function to variables declared in a stored procedure. Here, p2 casts the output into a VARCHAR value, while p3 returns output in XML form.

Listing 10. Variable assignment using SET statement
CREATE PROCEDURE ex_proc
(
  IN  inpdoc  XML,
  OUT p2      VARCHAR(30),
  OUT p3      XML
)
LANGUAGE SQL
NO EXTERNAL ACTION
BEGIN

  SET p2 =
        XMLCAST
        (
          XMLQUERY('$d/customer/name' PASSING inpdoc AS "d")
          AS VARCHAR(30)
        );

  SET p3 = XMLQUERY('$d/customer/name'PASSING inpdoc AS "d");

END

Variable assignment / SELECT INTO

This example is similar to our previous example. However, this version uses full SQL/XML SELECT statements to populate the variables p2 and p3.

Listing 11. Variable assignment using SELECT INTO
CREATE PROCEDURE ex_proc
(
  IN  inpdoc  XML,
  OUT p2      VARCHAR(30),
  OUT p3      XML
)
LANGUAGE SQL
NO EXTERNAL ACTION
BEGIN

  SELECT XMLCAST
         (
           XMLQUERY('$d/customer/name' PASSING inpdoc AS "d")
           AS VARCHAR(30)
         )
  INTO
         p2
  FROM
         sysibm.sysdummy1
  ;

  SELECT XMLQUERY('$d/customer/name' PASSING inpdoc AS "d")
  INTO
         p3
  FROM
         sysibm.sysdummy1
  ;

END

Variable assignment / VALUES INTO

You can also populate the variables p2 and p3 using the VALUES clause, as shown in the following example.

Listing 12. Variable assignment using VALUES INTO
CREATE PROCEDURE ex_proc
(
  IN  inpdoc  XML,
  OUT p2      VARCHAR(30),
  OUT p3      XML
)
LANGUAGE SQL
NO EXTERNAL ACTION
BEGIN

  VALUES XMLCAST
         (
           XMLQUERY('$d/customer/name' PASSING inpdoc AS "d")
           AS VARCHAR(30)
         )
  INTO
         p2
  ;

  VALUES XMLQUERY
         (
           '$d/customer/name' PASSING inpdoc AS "d"
         )
  INTO
         p3
  ;

END

Casting errors

The XMLCAST function can raise errors if the target data type is incompatible with the actual data. For example, you can't convert the string "HX25" into an integer. A stored procedure can catch casting errors, as Lisitngs 13 through 15 illustrate:

Listing 13. Casting errors - Sample data for input parm "inpdoc"
  <customer>
    <custid>HX25</custid>
  </customer>
Listing 14. Casting errors

CREATE PROCEDURE ex_proc ( IN inpdoc XML ) LANGUAGE SQL NO EXTERNAL ACTION BEGIN DECLARE v_int INTEGER; DECLARE v_badnum CHAR(1) DEFAULT 'N'; DECLARE CONTINUE HANDLER FOR SQLSTATE '10608' SET v_badnum = 'Y'; SET v_int = XMLCAST ( XMLQUERY('$d/customer/custid' PASSING inpdoc AS "d") AS INT ); IF (v_badnum = 'Y') THEN SIGNAL SQLSTATE '75002' SET MESSAGE_TEXT = 'Bad number'; END IF; END
Listing 15. Casting errors - Output

SQL0438N Application raised error with diagnostic text: "Bad number". SQLSTATE=75002

IF/THEN/ELSE statements and XMLEXISTS

You can use XMLEXISTS to test XML-type variables within an IF/THEN/ELSE statement.

Listing 16. Code sample involving IF/THEN/ELSE
CREATE PROCEDURE ex_proc(IN inpdoc XML)
LANGUAGE SQL
NO EXTERNAL ACTION
BEGIN

  IF XMLEXISTS('$d/request[@action="add"]' PASSING inpdoc AS "d") THEN
    SIGNAL SQLSTATE '75002' SET MESSAGE_TEXT = 'F(x) add';
  ELSEIF XMLEXISTS('$d/request[@action="update"]' PASSING inpdoc AS "d") THEN
    SIGNAL SQLSTATE '75002' SET MESSAGE_TEXT = 'F(x) update';
  ELSE
    SIGNAL SQLSTATE '75002' SET MESSAGE_TEXT = 'F(x) invalid';
  END IF;

END
Listing 17. IF/THEN/ELSE, input values (1)
<request action="add">
  <name>Jean Luc Picard</name>
</request>

Output:
  SQL0438N  Application raised error with diagnostic text: "F(x) add".  
  SQLSTATE=75002
Listing 18. IF/THEN/ELSE, input values (2)

<request action="update"> <name>Jean Luc Picard</name> </request> Output: SQL0438N Application raised error with diagnostic text: "F(x) update". SQLSTATE=75002
Listing 19. IF/THEN/ELSE, input values (3)

<request action="delete"> <name>Jean Luc Picard</name> </request> Output: SQL0438N Application raised error with diagnostic text: "F(x) invalid". SQLSTATE=75002

PASSING clause

The PASSING clause of the XMLQUERY, XMLEXISTS and XMLTABLE functions can pass in multiple variables. The variables don't have to be limited to just XML data types. The next two examples show a couple of interesting things you can do.

Listing 20. Test attribute values
CREATE PROCEDURE ex_proc
(
  IN  inpdoc    XML,
  IN  search    VARCHAR(20),
  OUT id        VARCHAR(30)
)
LANGUAGE SQL
NO EXTERNAL ACTION
BEGIN

  SET id =
        XMLCAST(
          XMLQUERY('$d/customers/customer[@id=$a]/name' PASSING inpdoc AS "d",
                                                                search AS "a"
          )
          AS VARCHAR(30)
        )
  ;

END

Here is another example of how you can use the PASSING clause of XMLQUERY (and the other SQL/XML functions). If you need to write generic code in order to access XML data that has been formatted in a variety of ways, it can be useful to pull out nodes and attributes by position (first, second, third, and so on).

Sample data for input parm "inpdoc"

Listing 21. Extract positional nodes and attributes - Input data
<order id="333" status="open">
  <customer>
    <custid>11029></custid>
    <name>Johnny Depp</name>
    <city>Hollywood</city>
  </customer>
</order>
Listing 22. Extract positional nodes and attributes
CREATE PROCEDURE ex_proc
(
  IN  inpdoc    XML,
  IN  attrpos   INT,
  OUT attrname  VARCHAR(30),
  OUT attrvalue VARCHAR(30),
  IN  nodepos   INT,
  OUT nodename  VARCHAR(30),
  OUT nodevalue VARCHAR(30)
)
LANGUAGE SQL
NO EXTERNAL ACTION
BEGIN

  VALUES
  (
    XMLCAST(
      XMLQUERY('$d/order/@*[$a]/name()' PASSING inpdoc AS "d", attrpos AS "a")
      AS VARCHAR(30)
    ),
    XMLCAST(
      XMLQUERY('$d/order/@*[$a]' PASSING inpdoc AS "d", attrpos AS "a")
      AS VARCHAR(30)
    ),
    XMLCAST(
      XMLQUERY('$d/order/customer/*[$n]/name()' PASSING inpdoc AS "d", nodepos AS "n")
      AS VARCHAR(30)
    ),
    XMLCAST(
      XMLQUERY('$d/order/customer/*[$n]' PASSING inpdoc AS "d", nodepos AS "n")
      AS VARCHAR(30)
    )
  )
  INTO
         attrname,
         attrvalue,
         nodename,
         nodevalue
  ;

END
Listing 23. Extract positional nodes and attributes - Output

Output Value of output parameters -------------------------- Parameter Name : ATTRNAME Parameter Value : status Parameter Name : ATTRVALUE Parameter Value : open Parameter Name : NODENAME Parameter Value : city Parameter Name : NODEVALUE Parameter Value : Hollywood Return Status = 0

Restructure an XML document

This example demonstrates one way to restructure an XML document. The original document is passed in as an input parameter, and the new document is passed out as an output parameter.

Listing 24. Input and desired output
<customer>                       <customer @id="xxx">
  <id> ... </id>                   <name>
  <name>                             <first> ... </first>
    <first> ... </first>             <last> ... </last>
    <last> ... </last>             </name>
  </name>                          <city> ... </city>
  <city> ... </city>             </customer>
</customer>

In Listing 24, a node is being converted into an attribute. That's a very small change, but you can apply the same technique to much more complicated cases. The basic idea is to shred the input document into its individual components. If a component is not going to change, such as the <name> element in this example, leave it as type XML, and don't do any further parsing. If the restructure involves moving things between attributes and nodes, you'll always need to cast the data into relational data types.

Listing 25. Restructure an XML document

CREATE PROCEDURE ex_proc ( IN inpdoc XML, OUT outdoc XML ) LANGUAGE SQL NO EXTERNAL ACTION BEGIN DECLARE v_id INT; DECLARE v_name XML; DECLARE v_city XML; VALUES ( XMLCAST ( XMLQUERY('$d/customer/id' PASSING inpdoc AS "d") AS INT ), XMLQUERY('$d/customer/name' PASSING inpdoc AS "d"), XMLQUERY('$d/customer/city' PASSING inpdoc AS "d") ) INTO v_id, v_name, v_city ; SET outdoc = XMLQUERY ( '<customer id="{$id}"> {$name} {$city} </customer> ' PASSING v_id AS "id", v_name AS "name", v_city AS "city" ) ; END
Listing 26. Input and output values
Sample data for input parm "inpdoc"

<customer>
  <id>3005302</id>
  <name>
    <first>Miguel</first>
    <last>Rodriguez</last>
  </name>
  <city>Wichita</city>
</customer>

Output:

<customer id="3005302">
  <name>
    <first>Miguel</first>
    <last>Rodriguez</last>
  </name>
  <city>Wichita</city>
</customer>

Summary

In this article, we've explored how to capitalize DB2 9's new XML capabilities within SQL stored procedures. There's a new XML data type in this version, which you can access through SQL/XML functions and XQuery. The examples in this article illustrated how to use an XML variable for declaring, assigning, and testing values. You've learned how to generate test data and restructure an existing XML document, and hopefully this proves relevant to your real-world scenarios. My hope is that this article arms SQL developers with the information they need to incorporate XML into applications.

Glossary of terms

CURSOR: A programming technique that allows the application to retrieve a query's results, row by row. Normally, there is a loop that continues to fetch each row until the end of the data is reached.

DYNAMIC SQL: All queries are classified as either dynamic or static. "Dynamic" means that the query is sent into the database engine as a string; it must be parsed and have a query execution plan generated. Anything run from the command line processor (CLP) is always dynamic.

PARAMETER MARKER: An application may wish to call the same query multiple times, only changing specific values each time (ex: parameters sent to a stored procedure, or a value in a WHERE clause). A parameter marker is a way to accommodate this. The query uses a "?" wherever it does not want to hard code a value. The query is prepared, which locks down the execution plan. Every time the query is executed, variables are supplied which get mapped into the "?" parts of the query.

STATIC SQL: All queries are classified as either dynamic or static. "Static" only exists within applications, such as SQL stored procedures, and embedded SQL in applications (such as Java's SQLJ). Static queries have already been parsed, and had a query plan generated. The plan is stored within the database as part of a "package". When a static query is executed, the previously computed query plan is used.

UID: An acronym for a SQL statement of type UPDATE, INSERT or DELETE. Since all three statements modify data, it's useful to have an acronym for referring to all of them.

VALUES INTO: It's useful to be able to run a query for exactly one row. In DB2, you can select from table SYSIBM.SYSDUMMY1, which contains exactly 1 row. You can also use the syntax "VALUES". It's useful for queries that don't read from any tables or views. VALUES INTO fetches the values into variables.

XML ATTRIBUTE: In this XML example -- <cust id="725"></cust> -- there is an XML attribute called "id" which has a value of "725". Attributes belong to nodes -- in this case, a node called "cust". Attributes must have unique names, unlike nodes.

XML NODE: An XML node is everything between two tags of this form: <cust> blah blah blah </cust>. A node can contain text, attributes, other nodes (hence, the hierarchical nature of XML), and a few other specialized things (such as processing instructions and comments).

XMLCAST: A new SQL function that converts an XML node's value into a relational data type, such as VARCHAR, INTEGER, DATE, and so on.

XMLEXISTS: A new SQL function which tests an XML document for some condition. Conditions are expressed with the syntax of XQuery. They can include things like the existence of a particular node, a node having a particular value, presence of a particular attribute, attribute having a particular value, etc.

XMLQUERY: A new SQL function which extracts something from an XML document. That "something" can be a node, a sequence of nodes, or XQuery functions like count().

XMLTABLE: a new SQL function which is designed to view some portion of an XML document as a relational table.

XQUERY: The native language for querying XML documents. The simplest part of XQuery is called "XPath". It looks like a fully qualified UNIX filename (such as /abc/xxx/305xk), with a syntax for expressing predicates.

Acknowledgments

I want to thank all of the people who reviewed this paper and provided me with valuable feedback (in alphabetical order): Merla Black, Eugene Firyago, Bill Gates, Strother Hollingsworth, Matthias Nicola, Don Payne, Cindy Saracco, and Hardeep Singh.

Resources

Learn

Get products and technologies

  • Download a free trial version of DB2 Enterprise 9.
  • Now you can use DB2 for free. Download DB2 Express-C, a no-charge version of DB2 Express Edition for the community that offers the same core data features as DB2 Express Edtion and provides a solid base to build and deploy applications.
  • Download IBM product evaluation versions and get your hands on application development tools and middleware products from DB2, Lotus®, Rational ®, Tivoli®, and WebSphere®.
  • Build your next development project with IBM trial software, available for download directly from developerWorks.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management, XML
ArticleID=188837
ArticleTitle=Use XML in DB2 SQL stored procedures
publish-date=01182007