DB2 9 Application Developer Certification Exam 733 prep, Part 3: XML data manipulation

Store and retrieve XML through your application

Learn how IBM® DB2® parses XML, handles whitespace, and serializes XML, and see how document encoding and client data type affect parsing and serialization. Learn, also, how DB2 validates XML against an XML schema as well as how to use SQL/XML functions to "shred" XML to relational data, assemble relational into XML, and publish XML as relational data.

Donald E. Payne (payned@us.ibm.com), Advisory IT Specialist, IBM

Donald PayneDonald Payne is an Advisory IT Specialist with IBM. He has consulted for customers and taught computer classes on relational databases and extensibility. He first worked with XML in 2001. He has assisted early customers using the pureXML features of DB2 V9.



16 March 2007

Also available in Vietnamese

Before you start

About this series

The IBM Certified Application Developer certification confirms to others that you are an intermediate- or advanced-level IBM DB2 for Linux®, UNIX®, and Windows® application developer and shows that you have strong skills in all common programming tasks as well as embedded SQL programming, ODBC/CLI programming, .NET programming, or Java™ programming.

This series of nine free tutorials is designed to help you prepare for the DB2 9 Application Development for Linux, UNIX, and Windows certification exam (Exam 733). Each tutorial includes a link to a free DB2 9 for Linux, UNIX, and Windows trial download. These tutorials provide a solid base for each section of the exam. However, you should not rely on these tutorials as your only preparation for the exam.

What is this tutorial about?

This tutorial teaches you how DB2 and a DB2 client application manipulate XML data. You'll learn:

  • How DB2 handles whitespace on input
  • How DB2 determines the encoding of an XML document on input and output
  • How DB2 can validate XML on input
  • How to execute an XQuery or XPath and identify its results
  • How to decompose, or shred, XML input to relational rows
  • How to publish relational data as XML
  • How to publish XML data as relational

This is the third in a series of nine tutorials that is designed to help you prepare for the DB2 9 Application Developer Certification exam (Exam 733). The material in this tutorial covers the objectives in Section 3 of the exam, titled "XML data manipulation."

Objectives

After completing this tutorial, you should be able to write applications that store and retrieve XML.

Prerequisites

This tutorial is written for DB2 developers who are familiar with the following concepts:

  • XML, including:
    • Elements
    • Attributes
    • Documents
    • Well-formed documents
    • XML declarations
    • Namespaces
  • XPath, including the XML Data Model (XDM):
    • Nodes
    • Atomic values
    • Items
    • Sequences
    • Axis: Parent, child, descendant, and so on
    • Steps
    • Wild cards
    • Predicates
  • XQuery:
    • The db2-fn functions xmlcolumn(), sqlquery()
    • FLWOR expression: for, let, where, order by, return
    • prolog
  • XML Schema, validation, and namespaces
  • The concepts of character encoding and Unicode

For the code examples, you should be familiar with:

  • The programming language
  • Any interfaces or libraries used, such as CLI and JDBC
  • The data types that the language supports and how they map to DB2 SQL types

See the Resources section for links to this information.

System requirements

To complete this tutorial, you need the following:

  • A computer running Linux, UNIX, or Windows, with DB2 9 FixPack 1 installed
  • An editor, such as Rational Application Developer, Windows Notepad, or vi (see the Resources section of this tutorial for more information)
  • A compiler to run code examples. The DB2 Information Center page on supported programming languages and compilers for database application development links to pages listing supported compilers in all supported languages:
    • Supported C compilers: See the Resources section, subsection "Get products and technologies."
    • For the Java language, DB2 on Windows comes with a Java 2 SDK, Version 5, including the compiler javac and runtime environment (JRE), under SQLLIB\java\jdk. (On most platforms, DB2 9 supports Java 2 SDK, Versions 1.4.2 through 5. See DB2 Information Online, "Supported Java application development software," for details.) The JDK includes the JRE under the SQLLIB\java\jdk\jre directory. Your PATH environment needs to include the bin directories under jdk; CLASSPATH needs to include some JAR and ZIP files under SQLLIB\java.
  • A Web browser is handy for viewing an XML file, checking that it is well-formed, and finding mistakes

Running the examples

If you run the SQL examples from this tutorial in the DB2 Command Line Processor (CLP) db2, see the section titled "DB2 Command Line Processor (CLP) behavior and options."


Processing whitespace in XML parsing and validation

XML changes upon parsing (input)

When DB2 or another XML processor parses incoming XML — that is, converts it from characters to an internal structure — it makes certain changes to the XML in the process, as specified in the XML standard. Some of these changes matter if you are concerned about the storage or memory requirements of XML in DB2 or in the client application. It might also matter if you are concerned with text nodes that contain only whitespace, the appearance of the XML upon retrieval and serialization back to characters, or digital signatures.

The parser processes XML input as follows:

  1. The parser determines the encoding. If the encoding of the source is not equivalent to DB2's internal encoding of UTF-8, the source characters are transcoded, or converted from the source encoding to DB2's. See the section "XML Encoding," subsection "Which encoding?" for more information.
  2. End-of-line characters are the carriage return (CR, hex code 0xD) and line feed (LF, hex code 0xA). Different operating systems (DOS/Windows, UNIX/Linux, Mac OS, or OS X) follow different conventions for ending lines. These characters are normalized, or converted to a single form, LF or 0xA.
  3. The parser checks for a well-formed document
  4. XML entities include predefined entities such as & and " and entities defined in a DTD. Entity references in the document are expanded or converted to the value of the entity. (The expanded character codes of the predefined entities are smaller than the original entity or reference.) Any internal DTD is discarded.
  5. Whitespace is a catch-all term for certain non-printing characters, commonly:
    • space (hex code 0x20)
    • tab (hex 0x9)
    • End of line characters (above)

    Boundary whitespace may be stripped based on the rules you'll read about in the "Parsing and boundary whitespace" section below.

  6. XML character references such as € (€, the euro symbol) are expanded or converted to the actual character code in UTF-8. As with predefined entity references, the expanded character codes are usually smaller than the original reference.
  7. If validating, ignorable whitespace is stripped. See the section titled "Validation and ignorable whitespace" below for more information.

XML comments (<!-- comment -->) are not stripped.

See the Downloads section for sample code that demonstrates how parsing and validation change whitespace.

Tip: The DB2 CLP options -n and -q can affect whitespace in string literals before they get to DB2. See the section titled "DB2 Command Line Processor (CLP) behavior and options" for more information.

Parsing and boundary whitespace

When you create XML, it often includes boundary whitespace to make it "pretty" and more readable. Boundary whitespace in an XML document is text containing only whitespace at the boundary between two XML tags, with no non-whitespace characters.

Listing 1. Sample XML document for whitespace processing
<customerinfo xmlns="http://posample.org" Cid='  1018  ' >
    <name>     </name>
    <addr country="  Canada
    ca  " xml:space="preserve">
        <street>     </street>
        <city>     </city>
        <prov-state>     </prov-state>
        <pcode-zip>     </pcode-zip>
    </addr>
    <phone type="   ">  416-555-1358  </phone>
</customerinfo>

In Listing 1, there is boundary whitespace between each tag and the following tag. For example, there's whitespace:

  • Between the starting tag <customerinfo Cid=' 1018 '> and the starting tag <name>
  • Between the starting tag </name> and the ending tag </name>
  • Between the ending tag </phone> and the ending tag </customerinfo>

If parsing does not strip this whitespace, a parsed "pretty" document will have many more elements with mixed content — a mixture of child elements and text nodes — and many text nodes with nothing but whitespace.

An attribute value is not boundary whitespace, even if it's all whitespace, such as type. In Listing 1, the whitespace around the number 1018, the words "Canada ca" and the phone number "516-555-1358" may be bad data, but it is not boundary whitespace.

Here are the rules for stripping or preserving boundary whitespace, in descending order of priority:

  1. If the XML contains an element with the attribute xml:space="preserve", the parser preserves boundary whitespace in that element and any child elements
  2. If you insert XML with an explicit XMLPARSE() and specify either the PRESERVE WHITESPACE or STRIP WHITESPACE option, the parser preserves or strips boundary whitespace as directed
  3. If you insert XML with explicit XMLPARSE() and do not specify either the PRESERVE WHITESPACE or STRIP WHITESPACE option, then boundary whitespace is stripped because that is the default for XMLPARSE()
  4. An implicit parse occurs when you insert XML as a string literal, host variable, or parameter marker without XMLPARSE() or XMLVALIDATE(). (XMLVALIDATE() performs a validating parse, which is effectively explicit and ignores this rule.) In an implicit parse in a CLI application:
    • You may issue the command SET CURRENT IMPLICIT XMLPARSE OPTION = 'value' during your session, where value is either PRESERVE WHITESPACE or STRIP WHITESPACE. If you do, then the parser preserves or strips boundary whitespace accordingly.
    • Otherwise, if you modify your db2cli.ini file to include the line

      CurrentImplicitXMLParseOption="parse-option"

      in a [section] named after your database ([database-name]), then the parser preserves or strips boundary whitespace for that database accordingly. parse-option is either PRESERVE WHITESPACE or STRIP WHITESPACE. For example:

      [sample]
      CurrentImplicitXMLParseOption="PRESERVE WHITESPACE"

      The db2cli.ini file is located in %DBPATH%. The section name, parameter name, and value are not case-sensitive. You can add this entry by either editing the file directly, or by the DB2 statement:

      UPDATE CLI CONFIGURATION FOR SECTION section-name
      USING CurrentImplicitXMLParseOption '"parse-option"';
  5. The parser strips boundary whitespace
Listing 2. Example of Rule 5: Insert XML with empty element, implicit parse, strip whitespace
connect to sample
insert into catalog values ('emptyelement', '<a> </a>')
DB20000I  The SQL command completed successfully.

select catlog from catalog where name = 'emptyelement'
<a/>
Listing 3. Example of Rule 2: Insert XML with empty element, explicit parse, preserve boundary whitespace
insert into catalog values ('preservespace', xmlparse(document '<a>  </a>' 
preserve whitespace))
DB20000I  The SQL command completed successfully.

xquery db2-fn:sqlquery('select catlog from catalog where 
name = ''preservespace'' ')/a
<a>  </a>

Validation and ignorable whitespace

When you input XML with XMLVALIDATE(), DB2 parses the document, then validates the parsed document against an XML Schema Definition (XSD). (In some cases, DB2 can parse and validate in one step with a validating parser instead of the default nonvalidating parser. But the processing described in "Parsing and boundary whitespace" still occurs first.) Validation always removes ignorable whitespace, which is whitespace that meets one of the following criteria:

  • An element-only complex type is an element defined in the XSD to have only child elements and no text nodes. Whitespace between child elements in an element-only complex is ignorable. This kind of ignorable whitespace is also boundary whitespace, which can make the apparent behavior of parsing and validation confusing.
  • If a simple element (containing only a text node) or an attribute is defined in the XSD as a non-string type (as xs:integer, for example), then whitespace in that element or attribute value is ignorable

If validation did not ignore this whitespace, a "pretty" document would fail validation because it contains text nodes where none are allowed and contains non-numeric characters in numeric values. Whitespace that is part of a defined text node under an element or part of the value of an attribute of type xs:string is not ignorable and is not stripped.

For example, take another look at Listing 1 above. According to the XSD, the only string types are:

  • Text nodes under the elements name, street, city, prov-state, pcode-zip, and phone
  • The attributes country and type

There is ignorable whitespace in the following places:

  • Between one starting tag and an immediately following starting tag — for example, between <customerinfo Cid=' 1000 '> and <name>
  • Between one ending tag and an immediately following starting tag — for example, between </name> and <addr ...>
  • Between one ending tag and an immediately following ending tag — for example, between </phone> and </customerinfo>)
  • The XSD defines attribute Cid as an integer:

    <xs:attribute name="Cid" type="xs:integer" />

    So the whitespace around "1000" is ignorable and will be stripped.

Whitespace in the elements name, street, city, prov-state, pcode-zip, and phone is not ignorable.

Tips:

  • Some of the sample scripts you can find at SQLLIB\samples\xml\db2sample_xml.db2 in your DB2 install insert XML like this:

    XMLVALIDATE( XMLPARSE ( DOCUMENT '...' PRESERVE WHITESPACE ) ...)

    For example:

    INSERT INTO CUSTOMER ( Cid, Info ) VALUES ( 
        1000,XMLVALIDATE( XMLPARSE ( DOCUMENT
        '<customerinfo xmlns="http://posample.org" Cid=''1000''>...'
        PRESERVE WHITESPACE ) ...)

    XMLPARSE() preserves boundary whitespace; but XMLVALIDATE() strips ignorable whitespace.

  • The DB2 CLP can strip whitespace in string literals, including XML, before sending to DB2. See the section of this tutorial titled "DB2 Command Line Processor (CLP) behavior and options" for more information.
  • The DB2 CLP is not a CLI application, so the CLI options do not apply to it.

XML serialization

XML changes upon output

When DB2 serializes XML, or converts it from its internal structure back to characters, as with parsing, the XML standard specifies several changes to the data, so it may look different from what was input. The purpose and effect of these changes is to produce a valid XML document-string that can in turn be parsed again.

  • If the target encoding does not equal DB2's internal encoding of UTF-8, the UTF-8 characters are transcoded to the target. Characters that were character references on input remain as the actual character on output; they are not re-encoded as a reference. Characters not in the target character set are lost, replaced with a character such as "?". See the section of this tutorial titled "XML encoding" for more information.
  • An XML declaration is generated if any of the following is true:
    • You serialize explicitly with XMLSERIALIZE(XML-expression INCLUDING XMLDECLARATION)
    • The client is a CLI or embedded SQL (ESQL) application
    • The client is a Java or .NET application, using the DB2 Driver for JDBC and SQLJ, that retrieves XML into a DB2Xml object using the com.ibm.db2.jcc.DB2Xml class.
  • A Byte Order Mark (BOM) is generated if the target encoding is UTF-16.

  • The ampersand (&), less than (<), and greater than (>) characters are replaced with their XML entities (&amp;, &lt;, and &gt;, respectively).

  • Within attributes, double quotes (") are replaced with their XML entity (&quot;).

  • Within attribute values, certain whitespace characters — such as CR, LF, and tab — that were not normalized out of the XML upon input are replaced with their numeric reference, &#n;, where n is the hex code.

  • Because internal DTDs are discarded upon input, characters that were expanded from an entity defined in a DTD are not re-encoded as an entity reference, but remain in their expanded form.

  • An empty element is one with no content— that is, with no text or other nodes between its starting and ending tags. When serialized upon output, it will appear as a single empty element tag,<tag-name/>.

Tip: the DB2 CLP option -i "pretty-prints" the output of an XQuery, adding line breaks and indentation. The option -d generates an XML declaration at the head of the document. See the section of this tutorial titled "DB2 Command Line Processor (CLP) behavior and options" for more information.


XML encoding

Character encoding

Historically, the terms character set, character encoding, and code page have all had a similar meaning: a set of characters and a set of binary codes, where each code represents a character. (Code page is a term from IBM for a for a character set on a mainframe or IBM PC.) The official character set names are maintained by the Internet Assigned Numbers Authority (IANA); see the Resources section for more information.

Legacy character encodings

Some common legacy character sets in the Western world are US-ASCII, EBCDIC, code page 437, code page 1252, 8859-1, 1208, and Latin-1. These all encode one character as one byte.

  • Code page 437: On Microsoft Windows in the USA, the default code page is 437; the IANA official encoding name is ibm-437. It has a few non-English characters (for example, the "ae" ligature, lower-case vowels with accents, some Greek letters, French quotes ("<<" and ">>"), and 48 graphic characters for shading and boxes (a legacy of DOS and character-based programs).

  • Code page 1252: DB2 has two items in its Start menu group under Command Line Tools: the Command Line Processor and Command Window. Both start off by running a program db2clpcp.exe, which changes the code page from the default 437 to (again, in the USA) 1252. Code page 1252, which Microsoft also calls "Latin-I", has the "ae" ligature, upper- and lower-case accented characters, and the euro symbol (hex 0x80), but no Greek letters.

  • Code page 1208: When you CREATE DATABASE USING CODESET UTF-8, the corresponding database code page is 1208:

    db2 get db cfg for sample | grep "code"
     Database code page                                      = 1208
     Database code set                                       = UTF-8
     Database country/region code                            = 1
  • Character set iso-8859-1: This character set, also called Latin-1, is common in English or Western European Web pages. It has the "ae" ligature, upper- and lower-case accented Latin letters, no Greek letters, and no euro sign.

See Resources for links to definitions of these and other character sets.

Unicode

Any single legacy encoding is limiting because it can represent text in only a small set of languages. Managing multiple encodings is a headache for many reasons, not least because most applications and databases were designed to handle only one. Unicode was invented to solve this problem. It is a single character set that represents all characters in virtually all languages in use, with room to grow.

With Unicode, the simple idea of a character set was refined into four concepts:

  • A character repertoire is merely the set of all characters that can be encoded; it does not specify the encoding.
  • A coded character set is a set of unique integers to represent some or all of the characters in a character repertoire, one number (or code point) to one character. Also known as character set, charset, or code set. It is synonymous with code page. Unicode code points are in the range of 0 to 0x10FFFF (about 1.1 million).
  • Originally, Unicode defined a repertoire of fewer than 64,000 characters, so it could specify a coded character set with 16-bit (2-byte) code points. With the addition of more alphabets, the character repertoire exceeded 64,000. Another issue was that many computer systems (notably programs written in C) that process US-ASCII or others use a null or zero byte (\0 in C) to mark the end of a character string. But encoding Unicode code points directly as 16-bit or larger numbers would include some null bytes, which would break these programs.

    So Unicode introduced a new concept, the character encoding form (CEF), often shortened to character encoding or simply encoding. A CEF is a mapping from the code points in the coded character set to a different set of integers (or code units) in which the characters will actually be encoded. The code unit is the smallest unit used for encoding characters. A single character may be represented by one or more code units, allowing more characters to be encoded.

    So the code points are merely for reference, so you can say "that character." The actual bytes in your Java String, C char[], or file may be different, depending on your character encoding form. Some Unicode character encoding forms are UTF-8, UTF-16, and UTF-32. The numbers 8, 16, and 32 denote the number of bits in the code unit.

    • UTF-8 (which DB2 uses) uses 1 to 4 code units (bytes, in this case) per character. This is important when sizing [VAR]CHAR columns in a UTF-8 database: one character may take up 1 to 4 bytes. The UTF-8 character encoding form does not contain any null bytes, which is handy for C.
    • In the UTF-16 character encoding form (which the Java language uses), the most commonly-used code points are below 64K, and the UTF-16 character code simply equals the code point. Higher code points require two code units.
    • UTF-32 is the only fixed-size UTF, which simplifies some processing at the cost of increased space. The code unit always equals the code point.

    Since the Unicode standard restricts legal code points such that all the UTF encodings are able to represent any code point, transcoding among UTFs is guaranteed without loss.

    Unicode code points are often written in documentation as U+hhhh, where hhhh represents 4 hexadecimal digits for the 2-byte code point; for code points above 64K, use this syntax twice for 4 bytes. Note that this syntax is used in documentation, but XML and programming languages have their own syntax for specifying Unicode literals; see the section titled "Characters in code" below for more information.

  • A character encoding scheme tells how to order the bytes in a code unit in the byte stream representing characters. UTF-16 and UTF-32 provide ways to specify the byte ordering, big-endian (default) or little-endian. For UTF-8, a code unit is 1 byte, so there is no ordering — in other words, the ordering is always the same regardless of the endian order of the computer.

Which encoding?

Anything that processes character data needs to know the character set, character encoding form, and character encoding scheme of that data.

External encoding

Outside of XML, text has only external encoding. External encoding is defined by the code page or locale in the environment; the application; the variable type; or an encoding specified in a function or method (such as the Java method String.getBytes(String encoding)).

To determine the database code page under which a database has been created, run:

db2 get db cfg for database-name

and check the value for the "Database code page" parameter.

On Windows, some aspects of encoding depend on the installed version of operating system. The Control Panel applet "Regional and Language Options" lets you change some aspects of your locale and keyboard. You can see what your code page is by opening a command window and typing the command CHCP.

On Linux and UNIX, the active environment is determined from the locale setting, which includes information about language, territory, and code set. To determine the active code page, run:

locale

Internal encoding

XML data can also have internal encoding. Internal encoding is internal to the XML document. There are two kinds:

  • Encoding attribute
  • Byte Order Mark (BOM)

The encoding attribute to the XML declaration, at the top of the document, specifies an official IANA encoding name:

<?xml version="1.0" encoding="UTF-8" ?>

A BOM is a distinctive series of bytes at the very start of the file indicating a Unicode encoding. To read the XML declaration, an XML parser needs to know or guess the encoding. But it can read the BOM unambiguously.

Table 1. Byte Order Mark for Unicode encodings
BOM typeBOM valueEncoding
UTF-8X'EFBBBF'UTF-8
UTF-16 Big EndianX'FEFF'UTF-16
UTF-16 Little EndianX'FFFE'UTF-16
UTF-32 Big EndianX'0000FEFF'UTF-32
UTF-32 Little EndianX'FFFE0000'UTF-32

Inconsistent encoding

If the actual encoding, external encoding, or internal encodings (BOM or XML declaration) of an XML document do not agree, then the document is unreadable. One exception is when the external encoding is Unicode (for example, a Java String in UTF-16): any internal encoding is ignored. A common problem occurs when a process that is not XML aware transcodes (that is, changes actual encoding) or otherwise changes a document without being aware of the internal encoding. Some processing of strings in the Java language, CLI, and embedded SQL applications can transcode without changing internal encoding. See the section below titled "Recommendations" on how to avoid this.

Characters in code

In Windows, to enter text in a different language at your keyboard, you can change your input language and keyboard layout in the Windows Control Panel, under "Regional and Language Options"; click the Details button on the Languages tab. You can add multiple languages and keyboards, and switch among them with a hot key. (Sorry, there's no Unicode "language" or keyboard — it would have to be pretty big!)

The Windows Notepad utility can save files with an Encoding of "ANSI" (which is windows-1252 or similar), "Unicode" (UTF-16 little endian, the endian order of Intel and AMD CPUs), "Unicode big endian" (the default UTF-16), and UTF-8. For the UTFs, Notepad prepends the file with a BOM (but no XML declaration).

In a Java String literal, which is in Unicode, use an escape sequence: \udddd, where dddd represents the 4 hex digits for the Unicode code point.

In XML and HTML, you may use an XML numeric reference (or character reference) for a code point: &#n, where n is a decimal number for the Unicode code point. For example, the decimal reference for the euro currency symbol is &#8364;. You may instead specify the code point as a hexadecimal number by preceding the number with x: &#xn. For example, the euro would be &#x20AC;. A hex numeric reference is often more convenient, as code points are often specified in hex. For numeric references, leading zeros may be omitted. Each character must be escaped or referenced separately: for example, "&#x21;&#x22;" is "!"" (exclamation mark, double quote), while "&#x2122;" is "™" (trademark).

Table 2 illustrates a few sample characters, with their Unicode code point and character code in the UTFs and two legacy Western code pages.

Encoding in the Java language

Java Strings (and identifiers such as class and method names) are encoded in UTF-16; but your source file is assumed to be in your current code page, not UTF-16. When compiling a Java program with javac, the compiler converts any String literals from the encoding of the source file into UTF-16. To specify that the source file is in another encoding, use the javac option -encoding code-set-name. This allows you to use a different encoding in your identifiers and literal strings.

DB2 9 provides a helper class, com.ibm.db2.jcc.DB2Xml, that helps transfer and convert data between the database and the Java application. It is more XML aware; for example, some methods can convert the internal encoding of a serialized XML document along with the actual encoding, so they remain consistent.

Table 3. JDBC getter methods: Return type, encoding, declaration
MethodReturn typeEncodingAdds XML declaration with encoding attribute?
getDB2String()StringUTF-16No
getDB2XmlString()StringISO-10646-UCS-2Yes
getDB2Bytes()byte[ ]UTF-8No
getDB2XmlBytes(String targetEncoding)byte[ ]As specifiedYes
getDB2AsciiStream()InputStreamASCIINo
getDB2XmlAsciiStream()InputStreamASCIIYes
getDB2CharacterStream()java.io.ReaderUTF-16No
getDB2XmlCharacterStream()java.io.ReaderISO-10646-UCS-2Yes
getDB2BinaryStream()InputStreamUTF-8No
getDB2XmlBinaryStream(String targetEncoding)InputStreamAs specifiedYes

DB2 9 comes with 2 JDBC drivers that can connect to DB2 in three different ways (DB2 9 does not support the JDBC Type 3 driver):

  • DB2 Driver for JDBC and SQLJ, a universal JDBC driver, which can connect as:
    • A JDBC Type 4 driver
    • A JDBC Type 2 driver
  • DB2 JDBC Type 2 Driver (a legacy driver)

It is recommended that you use the universal DB2 driver for JDBC and SQLJ. The legacy Type 2 driver has been deprecated since DB2, Version 8.2. The legacy Type 2 JDBC driver does not work with the DB2 class com.ibm.db2.jcc.DB2Xml, and it processes XML differently in some JDBC methods. For example, get methods (for example, ResultSet.getBinaryStream(column), ResultSet.getCharacterStream(column), ResultSet.getString(column)) on a SELECT of an XML column (without explicit XMLSERIALIZE()) add a BOM and XML declaration to serialized XML. Either this internal encoding (UTF-16) does not match the actual encoding, or the BOM is corrupted, making the document unusable.

See Table 4 for more information on JDBC driver versions.

For any type of JDBC connectivity, if you wish to connect as a user other than the current one, you must specify the desired username and password, either in the connection URL or in the Properties argument to getConnection().

DB2 provides sample JDBC applications in IBM\SQLLIB\samples\xml\java\jdbc, such as XmlInsert.java. If you use them to experiment, be aware that by default, they connect to DB2 with the legacy Type 2 driver. (See the helper class Db in the file Util.java.) To connect with the Type 4 driver, you must specify command-line arguments for server, port, user, and password. To see a syntax message, run the class with the argument -help:

prog_name -u2 [dbAlias] [userId passwd] (use universal JDBC type 2 driver)
prog_name [dbAlias] server portNum userId passwd (use universal JDBC type 4 driver)

For example:

java -cp ".;%CLASSPATH%" XmlInsert sample myhost 50000 myuser mypasswd

See the Downloads section for sample code that demonstrates how DB2, JDBC, and the DB2Xml class handle encoding.

Encoding in CLI

When you insert XML into DB2 and bind a variable to an XML column with SQLBindParameter(), CLI determines the encoding of the XML according to the input rules in Table 5 below. If the encoding is determined by application code page, but the document also has an internal encoding (a BOM or XML declaration), the external and internal encodings must match. When you select XML out of DB2 into an ESQL host variable, the data is encoded according to Table 5, with an internal encoding (an XML declaration with the encoding attribute).

Table 5. CLI types
Data typeEncoding assumed on inputEncoding produced on output
SQL_C_CHARApplication's code pageSame as input
SQL_C_DBCHARApplication's code pageSame as input
SQL_C_WCHARUCS-2Same as input
SQL_C_BINARYReads internal encodingUTF-8

Encoding in embedded SQL

When you insert XML into DB2 from an embedded SQL (ESQL) host variable, the encoding of the XML is determined according to the input rules in Table 6. If the encoding is determined by application code page, but the document also has an internal encoding (BOM or XML declaration), the external and internal encodings must match. When you select XML out of DB2 into an ESQL host variable, the data is encoded according to Table 6, with an internal encoding (an XML declaration with the encoding attribute).

Table 6. ESQL types
SQL TYPE IS Encoding assumed on inputEncoding produced on output
XML AS CLOB, XML AS CLOB_FILEApplication's mixed code pageSame as input
XML AS DBCLOB, XML AS DBCLOB_FILEApplication's graphic code pageSame as input
XML AS BLOB, XML AS BLOB_FILEReads internal encoding (Note 1)UTF-8
A simple type like char[]Application's mixed code page (Note 2)Same as input

Note 1: If a BLOB-type variable does not have any internal encoding, DB2 assumes UTF-8.

Note 2: To insert XML from a simple type like char, cast it to XML with the SQL function XMLCAST(? AS XML).

Recommendations

  • Move string literals such as prompts and error messages outside your source code and into some kind of properties or initialization file or into a database table. This will make it easier to translate your application into other languages. You may have to hard code one or two error messages into your applications, in case the application cannot open that message file or select from that message table.
  • Avoid transcoding between non-Unicode character sets, lest you lose characters that are not in the target character set.
  • Test your applications not written in the Java language with different code pages in the application environment. You may not be able to control your users' code page. (The Java application code page is always UTF-16.)
  • Test your application with some non-ASCII data. Code pages 437, 1252, 8859-1, and 1208 (the encoding of UTF-8) all match US-ASCII in the first 128 bytes, so US-ASCII data is less likely to find any problems encoding or transcoding. Check for loss of characters on transcoding, which typically does not produce an error message.
  • Use the universal DB2 Driver for JDBC and SQLJ rather than legacy Type 2 driver. If you cannot control the driver that your application uses, test it with the type or types that will run in production, and avoid those methods that don't work in all types or work differently.
  • For C applications, to process XML in the char[] data type, use the UTF-8 encoding, because it does not contain null (\0) as a valid byte.
  • For CLI and ESQL applications, be aware that CLI returns XML data with internal encoding. If you transcode and change the actual encoding, the document becomes unusable. To avoid this, bind your application variable to the SQL_C_BINARY type; this also keeps the data in UTF, avoiding a lossy transcoding into your local code page.
  • For any data type that will hold UTF characters, allow enough room, because one character may consume more than one byte.

Validation of XML documents

Concept

An XML document is much more flexible in structure than a relational table. A table has a fixed structure, which SELECT statements can rely on. There is a precise number of columns with specific data types. An XML document is almost the complete opposite. It can vary in every detail — node names, depth of the hierarchy, data types, attributes, etc. Query writers need to know about the structure of XML data. To provide orderly data in the XML world, there is the concept of validation provided by XML schemas.

An XML Schema is like a rule book for other XML documents. It provides details about what is allowed in the XML documents — node names, order of nodes, attributes, data types, optional vs. required, etc. Its counterpart in the DB2 relational world is SYSCAT.COLUMNS. XML schemas are themselves XML documents, with very specific rules about their own structure. An XML document that conforms to the rules of an XML Schema is said to be validated or valid.

How to validate

In a database column of type XML, each row contains exactly one complete XML document (or a null value). That document is either validated or not. Validation is performed with the function XMLVALIDATE(). To perform validations, the appropriate XML Schema Documents (XSDs) must first be registered with the database. Then the XMLVALIDATE() function can be used. This function accepts an XML document as input, along with the identity of the appropriate XML Schema, and returns a validated XML document.

Ways to validate XML documents in an XML column (note that LOAD is not included):

  • INSERT statement: Use of XMLVALIDATE()
  • UPDATE statement: Use of XMLVALIDATE()
  • IMPORT statement: Has syntax for validation

A complete example of registering an XML Schema and using XMLVALIDATE() can be found in the article "Get off to a fast start with DB2 Viper" (developerWorks, March 2006).

Testing for validation

An XML column can be tested to determine whether or not it contains a validated XML document with the predicate VALIDATED.

Example 1: Assume that the column XMLCOL is defined in the table T1. Retrieve only the XML values that have been validated by any XML Schema.

   SELECT xmlcol
   FROM
          t1
   WHERE
          xmlcol IS VALIDATED

Example 2: Assume that the column XMLCOL is defined in the table T1. Enforce the rule that values cannot be inserted or updated unless they have been validated.

   ALTER TABLE t1
     ADD CONSTRAINT ck_validated
       CHECK (xmlcol IS VALIDATED)

Note that the constraint checks whether the XML column is validated with any XML Schema. You cannot force validation of a specific schema.

To find out which specific XML Schema was used to validate an XML document, use the XMLXSROBJECTID() function. It returns the key to a row in the table SYSCAT.XSROBJECTS.


XPath and XQuery

XPath and XQuery are already covered in several tutorials. See the following links in the Resources section:

Read the example queries, predict the results, then check your answers by running the query against DB2.


SQL/XML functions

Scalar functions XMLQUERY() and XMLEXISTS()

XMLQUERY() returns an XML value from the evaluation of an XQuery expression possibly using specified input arguments as XQuery variables.

That's straight out of the manual. So what does it mean practically? XMLQUERY() has two parts:

  • An XQuery expression
  • A PASSING clause, where data can be passed into the XQuery expression

A common use of the function is to extract a portion of an XML document. An XML column in a table (if it's not null) contains a complete XML document. In a SELECT list, if you select the XML column name, you get the whole XML document. If you want a subset of the whole document, then you use XMLQUERY(). For the XQuery expression, you may use a simple XPath expression like '$d/path1/path2/path3', or a full FLWOR expression. The $d refers to a variable in the PASSING clause. This clause allows data (columns, expressions, literals) to be passed into the XQuery expression. To extract a portion of an XML document, the XML column name is included in the PASSING clause and given a name, like "d". The XQuery expression then refers to that as $d.

XMLEXISTS() is a predicate used in the WHERE clause of a SELECT statement. Its syntax is similar to XMLQUERY(): it has an XQuery expression and a PASSING clause. As with XMLQUERY(), the XQuery argument may be a simple XPath or full FLWOR expression. If the XQuery expression returns at least one item, then XMLEXISTS() returns TRUE.

XMLEXISTS() is normally used to test whether an XML document (in an XML column) contains something of interest. That something could be the existence of a particular XML node, the existence of an attribute, a specific value for an XML node, or the specific value for an attribute. Multiple conditions can be tested. It is usually more efficient to combine multiple XML conditions into one XMLEXISTS() with XPath and and or, rather than make multiple calls to XMLEXISTS() combined with SQL AND and OR.

Many XML queries can be handled by either XQuery or XMLQUERY(). There are some differences between the two:

  • XQuery returns every item in the result sequence as a separate row. XMLQUERY() is a scalar function and part of a SELECT statement; it returns only one value per call (that is, per XML document passed to it). If there are multiple items in one XML document, XMLQUERY() must be able to return them in one row. It therefore returns a sequence containing all the items from that document.

  • You cannot bind parameters or use host variables in XQuery. You can in XMLQUERY() and XMLEXISTS() with the PASSING clause.

  • In XQuery, an XPath expression (for example, /path1/path2) performs the role of applying predicates ([] notation) and projecting (constructing the output list of elements) all together in one step. In a SELECT statement, the role of applying predicates to rows is handled by XMLEXISTS(); the role of applying predicates to nodes within a row, and of projecting, is handled by XMLQUERY(). Often in an SQL/XML query, the same XPath predicate is specified twice, in XMLEXISTS() and XMLQUERY().

  • To apply predicates to relational columns, XQuery uses the XPath function db2-fn:sqlquery(). A regular SELECT with XMLQUERY() uses its normal WHERE clause, which can mix relational predicates with XMLEXISTS().

  • Only XMLQUERY() can mix relational columns and XML data in the same SELECT list.

See the article "Get off to a fast start with DB2 9 pureXML, Part 3: Query DB2 XML data with SQL" (developerWorks, March 2006), which includes detailed examples of XMLQUERY() and XMLEXISTS() in action.

Scalar functions XMLELEMENT(), XMLATTRIBUTES(), and XMLNAMESPACES()

XMLELEMENT() creates an XML element, by wrapping an XML tag (for example, <xxx> and </xxx>) around something. The something can be any SQL expression (as long as it's not a structured type), including other XML elements. Namespaces and attributes can optionally be included. SQL expressions are automatically converted into their character form, if needed, just as if they were enclosed in the CHAR() function.

XMLATTRIBUTES() creates one or more attributes within an XML element (for example, <xxx a1="zzz">). This function can only be used within the XMLELEMENT() function.

XMLNAMESPACES() creates namespace declarations for an XML element. This can only be used within other functions like XMLELEMENT(), XMLFOREST(), and XMLTABLE().

Example 1: XMLELEMENT(), simple

Listing 4. Example 1 SQL code
SELECT XMLELEMENT( NAME "ColName", colname),
       XMLELEMENT( NAME "Type", typename)
FROM
       syscat.columns
WHERE
       tabschema = 'SYSCAT'
AND    tabname   = 'TABLES'
ORDER BY
       colno
;
Listing 5. Example 1 sample output
<ColName>TABSCHEMA</ColName>        <Type>VARCHAR</Type>
<ColName>TABNAME</ColName>          <Type>VARCHAR</Type>
<ColName>OWNER</ColName>            <Type>VARCHAR</Type>
<ColName>TYPE</ColName>             <Type>CHARACTER</Type>
<ColName>STATUS</ColName>           <Type>CHARACTER</Type>
<ColName>BASE_TABSCHEMA</ColName>   <Type>VARCHAR</Type>
<ColName>BASE_TABNAME</ColName>     <Type>VARCHAR</Type>
<ColName>ROWTYPESCHEMA</ColName>    <Type>VARCHAR</Type>
<ColName>ROWTYPENAME</ColName>      <Type>VARCHAR</Type>
<ColName>CREATE_TIME</ColName>      <Type>TIMESTAMP</Type>
<ColName>INVALIDATE_TIME</ColName>  <Type>TIMESTAMP</Type>
<ColName>STATS_TIME</ColName>       <Type>TIMESTAMP</Type>
<ColName>COLCOUNT</ColName>         <Type>SMALLINT</Type>
<ColName>TABLEID</ColName>          <Type>SMALLINT</Type>
<ColName>TBSPACEID</ColName>        <Type>SMALLINT</Type>

Example 2: XMLELEMENT(), two attributes

Listing 6. Example 2 SQL code
SELECT XMLELEMENT
       (
         NAME "TableName",
         XMLATTRIBUTES
         (
           tableid  AS "TableId",
           colcount AS "NumCols"
         ),
         RTRIM(tabschema) || '.' || tabname
       )
FROM
       syscat.tables
WHERE
       type = 'T'
ORDER BY
       tableid
;
Listing 7. Example 2 sample output
<TableName TableId="2" NumCols="67">SYSIBM.SYSTABLES</TableName>
<TableName TableId="2" NumCols="44">DB2QP.TRACK_QUERY_INFO</TableName>
<TableName TableId="2" NumCols="3">TOOLS.REL_CAT_MEMBER</TableName>
<TableName TableId="3" NumCols="41">SYSIBM.SYSCOLUMNS</TableName>
<TableName TableId="3" NumCols="13">DB2QP.MANAGE_QUERY_INFO</TableName>
<TableName TableId="3" NumCols="9">TOOLS.REL_TYPE</TableName>
<TableName TableId="4" NumCols="57">SYSIBM.SYSINDEXES</TableName>
<TableName TableId="4" NumCols="6">DB2QP.RESULT_INFO</TableName>
<TableName TableId="4" NumCols="5">SYSTOOLS.POLICY</TableName>
<TableName TableId="4" NumCols="8">TOOLS.REL_CAT</TableName>
<TableName TableId="5" NumCols="9">SYSIBM.SYSCOLPROPERTIES</TableName>
<TableName TableId="5" NumCols="23">DB2QP.QP_SYSTEM</TableName>
<TableName TableId="5" NumCols="27">SYSTOOLS.HMON_ATM_INFO</TableName>
<TableName TableId="5" NumCols="5">TOOLS.REL_LINK</TableName>
<TableName TableId="6" NumCols="5">SYSIBM.SYSINDEXCOLUSE</TableName>

Example 3: XMLELEMENT(), with default namespace

Listing 8. Example 3 SQL code
SELECT  XMLELEMENT
        (
          NAME "ColName",
          XMLNAMESPACES
          (
            DEFAULT 'www.ibm.com'
          ),
          colname
        )
FROM
        syscat.columns
WHERE
        tabschema = 'SYSCAT'
AND     tabname   = 'TABLES'
ORDER BY
        colno
;
Listing 9. Example 3 sample output
<ColName xmlns="www.ibm.com">TABSCHEMA</ColName>
<ColName xmlns="www.ibm.com">TABNAME</ColName>
<ColName xmlns="www.ibm.com">OWNER</ColName>
<ColName xmlns="www.ibm.com">TYPE</ColName>
<ColName xmlns="www.ibm.com">STATUS</ColName>
<ColName xmlns="www.ibm.com">BASE_TABSCHEMA</ColName>
<ColName xmlns="www.ibm.com">BASE_TABNAME</ColName>
<ColName xmlns="www.ibm.com">ROWTYPESCHEMA</ColName>
<ColName xmlns="www.ibm.com">ROWTYPENAME</ColName>
<ColName xmlns="www.ibm.com">CREATE_TIME</ColName>
<ColName xmlns="www.ibm.com">INVALIDATE_TIME</ColName>
<ColName xmlns="www.ibm.com">STATS_TIME</ColName>
<ColName xmlns="www.ibm.com">COLCOUNT</ColName>
<ColName xmlns="www.ibm.com">TABLEID</ColName>
<ColName xmlns="www.ibm.com">TBSPACEID</ColName>

Example 4: XMLELEMENT(), with namespace

Listing 10. Example 4 SQL code
SELECT  XMLELEMENT
        (
          NAME "db2:ColName",
          XMLNAMESPACES
          (
            'www.ibm.com/db2' AS "db2"
          ),
          colname
        )
FROM
        syscat.columns
WHERE
        tabschema = 'SYSCAT'
AND     tabname   = 'TABLES'
ORDER BY
        colno
FETCH FIRST 15 ROWS ONLY
;
Listing 11. Example 4 sample output
<db2:ColName xmlns:db2="www.ibm.com/db2">TABSCHEMA</db2:ColName>
<db2:ColName xmlns:db2="www.ibm.com/db2">TABNAME</db2:ColName>
<db2:ColName xmlns:db2="www.ibm.com/db2">OWNER</db2:ColName>
<db2:ColName xmlns:db2="www.ibm.com/db2">TYPE</db2:ColName>
<db2:ColName xmlns:db2="www.ibm.com/db2">STATUS</db2:ColName>
<db2:ColName xmlns:db2="www.ibm.com/db2">BASE_TABSCHEMA</db2:ColName>
<db2:ColName xmlns:db2="www.ibm.com/db2">BASE_TABNAME</db2:ColName>
<db2:ColName xmlns:db2="www.ibm.com/db2">ROWTYPESCHEMA</db2:ColName>
<db2:ColName xmlns:db2="www.ibm.com/db2">ROWTYPENAME</db2:ColName>
<db2:ColName xmlns:db2="www.ibm.com/db2">CREATE_TIME</db2:ColName>
<db2:ColName xmlns:db2="www.ibm.com/db2">INVALIDATE_TIME</db2:ColName>
<db2:ColName xmlns:db2="www.ibm.com/db2">STATS_TIME</db2:ColName>
<db2:ColName xmlns:db2="www.ibm.com/db2">COLCOUNT</db2:ColName>
<db2:ColName xmlns:db2="www.ibm.com/db2">TABLEID</db2:ColName>
<db2:ColName xmlns:db2="www.ibm.com/db2">TBSPACEID</db2:ColName>

Example 5: XMLELEMENT() options for NULLs

Listing 12. Example 5 SQL code
SELECT  XMLELEMENT(NAME "Table", tabname),
        XMLELEMENT(NAME "TbSpc1",
          tbspace
          OPTION EMPTY ON NULL
        ),
        XMLELEMENT(NAME "TbSpc2",
          tbspace
          OPTION NULL ON NULL
        )
FROM
        syscat.tables
WHERE
        tbspace IS NULL
FETCH FIRST 15 ROWS ONLY
;
Listing 13. Example 5 sample output
<Table>CHECK_CONSTRAINTS</Table>        <TbSpc1/>    -
<Table>COLUMNS</Table>                  <TbSpc1/>    -
<Table>COLUMNS_S</Table>                <TbSpc1/>    -
<Table>REFERENTIAL_CONSTRAINTS</Table>  <TbSpc1/>    -
<Table>REF_CONSTRAINTS</Table>          <TbSpc1/>    -
<Table>TABLE_CONSTRAINTS</Table>        <TbSpc1/>    -
<Table>TABLES</Table>                   <TbSpc1/>    -
<Table>TABLES_S</Table>                 <TbSpc1/>    -
<Table>USER_DEFINED_TYPES</Table>       <TbSpc1/>    -
<Table>UDT_S</Table>                    <TbSpc1/>    -
<Table>VIEWS</Table>                    <TbSpc1/>    -
<Table>PARAMETERS</Table>               <TbSpc1/>    -
<Table>PARAMETERS_S</Table>             <TbSpc1/>    -
<Table>ROUTINES</Table>                 <TbSpc1/>    -
<Table>ROUTINES_S</Table>               <TbSpc1/>    -

Scalar function XMLFOREST()

The XMLFOREST() function creates a series of one or more XML elements, concatenated together. The differences between XMLELEMENT() and XMLFOREST() are:

  • XMLELEMENT() creates exactly one element. XMLFOREST() can create any number of elements, concatenated together.
  • The name of the XML element is required syntax for XMLELEMENT(), but can be defaulted for XMLFOREST() for simple SQL expressions. This is demonstrated in the example you'll see in a moment.
  • XMLELEMENT() can create attributes; XMLFOREST() cannot.

XMLFOREST() is like a combination of XMLCONCAT() and XMLELEMENT() (except that XMLFOREST() cannot specify attributes). It's a very convenient way to specify a series of XML elements.

Example 6: XMLFOREST() (no namespaces)

In this example, notice that colno and typename both specify a name for an XML element, while colname does not. Omitting them makes the name the same as the column, but in all uppercase.

Listing 14. Example 6 SQL code
SELECT  XMLFOREST(
          colno    AS "ColNum",
          colname,
          typename AS "DataType"
        )
FROM
        syscat.columns
WHERE
        tabschema = 'SYSCAT'
AND     tabname   = 'TABLES'
ORDER BY
        colno
FETCH FIRST 15 ROWS ONLY
;
Listing 15. Example 6 sample output
<ColNum>0</ColNum><COLNAME>TABSCHEMA</COLNAME><DataType>VARCHAR</DataType>
<ColNum>1</ColNum><COLNAME>TABNAME</COLNAME><DataType>VARCHAR</DataType>
<ColNum>2</ColNum><COLNAME>OWNER</COLNAME><DataType>VARCHAR</DataType>
<ColNum>3</ColNum><COLNAME>TYPE</COLNAME><DataType>CHARACTER</DataType>
<ColNum>4</ColNum><COLNAME>STATUS</COLNAME><DataType>CHARACTER</DataType>
<ColNum>5</ColNum><COLNAME>BASE_TABSCHEMA</COLNAME><DataType>VARCHAR</DataType>
<ColNum>6</ColNum><COLNAME>BASE_TABNAME</COLNAME><DataType>VARCHAR</DataType>
<ColNum>7</ColNum><COLNAME>ROWTYPESCHEMA</COLNAME><DataType>VARCHAR</DataType>
<ColNum>8</ColNum><COLNAME>ROWTYPENAME</COLNAME><DataType>VARCHAR</DataType>
<ColNum>9</ColNum><COLNAME>CREATE_TIME</COLNAME><DataType>TIMESTAMP</DataType>
<ColNum>10</ColNum><COLNAME>INVALIDATE_TIME</COLNAME><DataType>TIMESTAMP</DataType>
<ColNum>11</ColNum><COLNAME>STATS_TIME</COLNAME><DataType>TIMESTAMP</DataType>
<ColNum>12</ColNum><COLNAME>COLCOUNT</COLNAME><DataType>SMALLINT</DataType>
<ColNum>13</ColNum><COLNAME>TABLEID</COLNAME><DataType>SMALLINT</DataType>
<ColNum>14</ColNum><COLNAME>TBSPACEID</COLNAME><DataType>SMALLINT</DataType>

Example 7: XMLFOREST() (with default namespace)

Listing 16. Example 7 SQL code
SELECT  XMLFOREST(
          XMLNAMESPACES(
            DEFAULT 'www.ibm.com'
          ),
          colno    AS "ColNum",
          colname
        )
FROM
        syscat.columns
WHERE
        tabschema = 'SYSCAT'
AND     tabname   = 'TABLES'
ORDER BY
        colno
FETCH FIRST 15 ROWS ONLY
;

Listing 17. Example 7 sample output

Example 8: XMLFOREST() with option EMPTY ON NULL

Listing 18. Example 8 SQL code
SELECT  XMLFOREST(
          tabname AS "Table",
          tbspace AS "TbSpc"
          OPTION EMPTY ON NULL
        )
FROM
        syscat.tables
WHERE
        tbspace IS NULL
FETCH FIRST 15 ROWS ONLY
;
Listing 19. Example 8 sample output
<Table>CHECK_CONSTRAINTS</Table><TbSpc/>
<Table>COLUMNS</Table><TbSpc/>
<Table>COLUMNS_S</Table><TbSpc/>
<Table>REFERENTIAL_CONSTRAINTS</Table><TbSpc/>
<Table>REF_CONSTRAINTS</Table><TbSpc/>
<Table>TABLE_CONSTRAINTS</Table><TbSpc/>
<Table>TABLES</Table><TbSpc/>
<Table>TABLES_S</Table><TbSpc/>
<Table>USER_DEFINED_TYPES</Table><TbSpc/>
<Table>UDT_S</Table><TbSpc/>
<Table>VIEWS</Table><TbSpc/>
<Table>PARAMETERS</Table><TbSpc/>
<Table>PARAMETERS_S</Table><TbSpc/>
<Table>ROUTINES</Table><TbSpc/>
<Table>ROUTINES_S</Table><TbSpc/>

Example 9: XMLFOREST() with option NULL ON NULL

Listing 20. Example 9 SQL code
SELECT  XMLFOREST(
          tabname AS "Table",
          tbspace AS "TbSpc"
          OPTION NULL ON NULL
        )
FROM
        syscat.tables
WHERE
        tbspace IS NULL
FETCH FIRST 15 ROWS ONLY
;
Listing 21. Example 9 sample output
<Table>CHECK_CONSTRAINTS</Table>
<Table>COLUMNS</Table>
<Table>COLUMNS_S</Table>
<Table>REFERENTIAL_CONSTRAINTS</Table>
<Table>REF_CONSTRAINTS</Table>
<Table>TABLE_CONSTRAINTS</Table>
<Table>TABLES</Table>
<Table>TABLES_S</Table>
<Table>USER_DEFINED_TYPES</Table>
<Table>UDT_S</Table>
<Table>VIEWS</Table>
<Table>PARAMETERS</Table>
<Table>PARAMETERS_S</Table>
<Table>ROUTINES</Table>
<Table>ROUTINES_S</Table>

Scalar function XMLCONCAT()

XMLCONCAT() concatenates two or more XML elements together. This even includes the output from functions like XMLCOMMENT(), XMLTEXT(), and XMLPI(). There are at least a couple of situations where XMLCONCAT() is very useful:

  • As input to the function XMLAGG(), which only accepts one argument
  • Construct mixed nodes (for example, John <mi>Q</mi> Smith)

Example 10: XMLCONCAT() with multiple XMLELEMENT()

Listing 22. Example 10 SQL code
SELECT  XMLCONCAT(
          XMLELEMENT(NAME "Name", indname),
          XMLELEMENT(NAME "NumCols", colcount),
          XMLELEMENT(NAME "Uniq", uniquerule)
        )
FROM
        syscat.indexes
FETCH FIRST 15 ROWS ONLY
;
Listing 23. Example 10 sample output
<Name>INDTABLES01</Name><NumCols>2</NumCols><Uniq>U</Uniq>
<Name>INDTABLES02</Name><NumCols>1</NumCols><Uniq>D</Uniq>
<Name>INDTABLES03</Name><NumCols>1</NumCols><Uniq>D</Uniq>
<Name>INDTABLES04</Name><NumCols>1</NumCols><Uniq>D</Uniq>
<Name>INDTABLES05</Name><NumCols>4</NumCols><Uniq>U</Uniq>
<Name>INDTABLES06</Name><NumCols>2</NumCols><Uniq>D</Uniq>
<Name>INDTABLES07</Name><NumCols>2</NumCols><Uniq>D</Uniq>
<Name>INDTABLES08</Name><NumCols>1</NumCols><Uniq>D</Uniq>
<Name>INDCOLUMNS01</Name><NumCols>3</NumCols><Uniq>U</Uniq>
<Name>INDCOLUMNS02</Name><NumCols>2</NumCols><Uniq>D</Uniq>
<Name>INDCOLUMNS03</Name><NumCols>1</NumCols><Uniq>D</Uniq>
<Name>INDINDEXES01</Name><NumCols>2</NumCols><Uniq>U</Uniq>
<Name>INDINDEXES02</Name><NumCols>3</NumCols><Uniq>U</Uniq>
<Name>INDINDEXES03</Name><NumCols>2</NumCols><Uniq>D</Uniq>
<Name>INDINDEXES04</Name><NumCols>3</NumCols><Uniq>D</Uniq>

Example 11: Using XMLCONCAT() to create a mixed node

Listing 24. Example 11 SQL code
SELECT  XMLCONCAT(
          XMLELEMENT(NAME "Index", indname),
          XMLTEXT(' has '),
          XMLELEMENT(NAME "NumCols", colcount)
        )
FROM
        syscat.indexes
FETCH FIRST 15 ROWS ONLY
;
Listing 25. Example 11 sample output
<Index>INDTABLES01</Index> has <NumCols>2</NumCols>
<Index>INDTABLES02</Index> has <NumCols>1</NumCols>
<Index>INDTABLES03</Index> has <NumCols>1</NumCols>
<Index>INDTABLES04</Index> has <NumCols>1</NumCols>
<Index>INDTABLES05</Index> has <NumCols>4</NumCols>
<Index>INDTABLES06</Index> has <NumCols>2</NumCols>
<Index>INDTABLES07</Index> has <NumCols>2</NumCols>
<Index>INDTABLES08</Index> has <NumCols>1</NumCols>
<Index>INDCOLUMNS01</Index> has <NumCols>3</NumCols>
<Index>INDCOLUMNS02</Index> has <NumCols>2</NumCols>
<Index>INDCOLUMNS03</Index> has <NumCols>1</NumCols>
<Index>INDINDEXES01</Index> has <NumCols>2</NumCols>
<Index>INDINDEXES02</Index> has <NumCols>3</NumCols>
<Index>INDINDEXES03</Index> has <NumCols>2</NumCols>
<Index>INDINDEXES04</Index> has <NumCols>3</NumCols>

Scalar function XMLAGG()

XMLAGG() is an aggregate function, just like SUM(), MIN(), and MAX(). It takes one XML expression as an argument, and, for each group of records, concatenates the values all together. There is an optional ORDER BY clause. XMLAGG() is very useful when building up complicated XML documents that involve one-to-many relationships. In the example that follows, each row of SYSCAT.INDEXES is combined with many rows from SYSCAT.INDEXCOLUSE. Numerous, nested one-to-many relationships can be handled by XMLAGG() when converting relational data into XML.

Example 12: XMLAGG()

Listing 26. Example 12 SQL code
WITH
t1 AS
(
  SELECT indschema,
         indname,
         colseq,
         XMLELEMENT(
           NAME "ColName",
           XMLATTRIBUTES(colseq AS "Seq"),
           colname
         ) AS xml_col
  FROM
         syscat.indexcoluse
),
t2 (indschema, indname, list_cols) AS
(
  SELECT indschema,
         indname,
         XMLAGG(xml_col ORDER BY colseq)
  FROM
         t1
  GROUP BY
         indschema,
         indname
)
SELECT  XMLELEMENT(
          NAME "Index",
          XMLATTRIBUTES(i.indname AS "Name"),
          t2.list_cols
        )
FROM
        syscat.indexes  I,
        t2              T2
WHERE
        i.colcount BETWEEN 2 AND 4
AND     t2.indschema = i.indschema
AND     t2.indname   = i.indname
FETCH FIRST 15 ROWS ONLY
;
Listing 27. Example 12 sample output
Row #1:

<Index Name="ATM_UNIQ">
  <ColName Seq="1">SCHEMA</ColName>
  <ColName Seq="2">NAME</ColName><
</Index>

Row #2:

<Index Name="HI_OBJ_UNIQ">
  <ColName Seq="1">HI_ID</ColName>
  <ColName Seq="2">OBJ_NAME1</ColName>
  <ColName Seq="3">OBJ_NAME2</ColName>
  <ColName Seq="4">OBJ_NAME3</ColName>
</Index>

Row #3:

<Index Name="IMDTA00">
  <ColName Seq="1">TOWNER00</ColName>
  <ColName Seq="2">TASKNAME00</ColName>
</Index>

Row #4:

<Index Name="IMDTA03">
  <ColName Seq="1">TASKID00</ColName>
  <ColName Seq="2">SUFFIX00</ColName>
  <ColName Seq="3">MODIFIED00</ColName>
</Index>

Row #5:

<Index Name="IMDTA04">
  <ColName Seq="1">TASKID00</ColName>
  <ColName Seq="2">SUFFIX00</ColName>
</Index>

Notes about Example 12

  • This example can be in coded several different ways. Common table expressions were used to break up the code and hopefully make it clearer. The alternative coding techniques involve nesting many levels of functions within each other and/or the use of subqueries, either of which would have been harder to read.

  • The sample output was reformatted (with indentations and line breaks added) for clarity.

  • This example illustrated just one one-to-many relationship, but numerous one-to-many relationships can be handled.

  • The first CTE (common table expression), T1, obtains the detailed data for the child table (INDEXCOLUSE).

  • The second CTE, T2, aggregates the data from INDEXCOLUSE into a single row per index. It is possible to do the work of T1 and T2 together in one SELECT statement, by nesting XMLELEMENT() within XMLAGG(). That would be manageable, but nesting more than a couple layers of functions quickly becomes hard to read and makes it difficult to line up matching parentheses.

  • The final SELECT joins INDEXES with the aggregated data from INDEXCOLUSE and wraps an XML element around it.

Scalar function XMLDOCUMENT()

XMLDOCUMENT() converts an XML node into an XML document. The XML node must be well formed, which mainly means that it must have a single root element. Tables with columns of type XML always contain well-formed XML documents. The XMLQUERY() function returns XML nodes (not documents). The main situation in which you would need the XMLDOCUMENT() function is when you are inserting the results of XMLQUERY() into a table with an XML column. Example 13 illustrates this.

Example 13: XMLDOCUMENT()

Listing 28. Example 13 SQL code
CREATE TABLE demo.table1
(
  key          INT,
  xml_col      XML
)
;
Listing 29. Example 13 sample output
DB20000I  The SQL command completed successfully.
Listing 30. Example 13 SQL code
INSERT INTO demo.table1 VALUES(1, 
'<person><first>Joe</first><last>Smith</last></person>');
Listing 31. Example 13 sample output
DB20000I  The SQL command completed successfully.
Listing 32. Example 13 SQL code
SELECT  XMLQUERY(
         '$d/person/first'
          PASSING xml_col AS "d"
        )
FROM
        demo.table1
WHERE
        key = 1
;
Listing 33. Example 13 sample output
<first>Joe</first>
Listing 34. Example 13 SQL code
INSERT INTO demo.table1
SELECT  2,
        XMLQUERY(
          '$d/person/first'
          PASSING xml_col AS "d"
        )
FROM
        demo.table1
WHERE
        key = 1
;
Listing 35. Example 13 sample output
DB21034E  The command was processed as an SQL statement because it was not a
valid Command Line Processor command.  During SQL processing it returned:
SQL20345N The XML value is not a well-formed document with a single root
element.  SQLSTATE=2200L
Listing 36. Example 13 SQL code
INSERT INTO demo.table1
SELECT  2,
        XMLDOCUMENT(
          XMLQUERY(
            '$d/person/first'
            PASSING xml_col AS "d"
          )
        )
FROM
        demo.table1
WHERE
        key = 1
;
Listing 37. Example 13 sample output
DB20000I  The SQL command completed successfully.
Listing 38. Example 13 SQL code
SELECT * FROM demo.table1;
Listing 39. Example 13 sample output
KEY  XML_COL
---- -------------------------------
  1  <person><first>Joe</first><last>Smith</last></person>
  2  <first>Joe</first>

Scalar function XMLTEXT()

XMLTEXT() creates a text node. Example 11 above demonstrates how to use it. Its main purpose is to help create mixed type nodes.

Scalar function XMLCOMMENT()

XMLCOMMENT() creates a comment node. Comments are strings with two restrictions:

  • They cannot contain two consecutive dashes (--)
  • They cannot end with a dash (-)

Example 14: XMLCOMMENT() (valid)

Listing 40. Example 14 SQL code
SELECT  XMLCONCAT(
          XMLELEMENT(NAME "Dummy1", 1200),
          XMLCOMMENT('Some comment'),
          XMLELEMENT(NAME "Dummy2", 'hello')
        )
FROM
        sysibm.sysdummy1
;
Listing 41. Example 14 sample output
<Dummy1>1200</Dummy1>
<!--Some comment-->
<Dummy2>hello</Dummy2>

Example 15: XMLCOMMENT() with disallowed consecutive dashes (not valid)

Listing 42. Example 15 SQL code
SELECT  XMLCONCAT(
          XMLELEMENT(NAME "Dummy1", 1200),
          XMLCOMMENT('Bad -- comments'),
          XMLELEMENT(NAME "Dummy2", 'hello')
        )
FROM
        sysibm.sysdummy1
;
Listing 43. Example 15 sample output
SQL20331N The XML comment value "Bad -- comments" is not valid.
SQLSTATE=2200S

Example 16: XMLCOMMENT() with disallowed final dash (not valid)

Listing 44. Example 16 SQL code
SELECT  XMLCONCAT(
          XMLELEMENT(NAME "Dummy1", 1200),
          XMLCOMMENT('Bad comments -'),
          XMLELEMENT(NAME "Dummy2", 'hello')
        )
FROM
        sysibm.sysdummy1
;
Listing 45. Example 16 sample output
SQL20331N The XML comment value "Bad comments -" is not valid.  SQLSTATE=2200S

Scalar function XMLPI()

XMLPI() creates an XML processing instruction. There are a couple of restrictions:

  • The PI name cannot be xml anywhere, in any combination of upper and lower case
  • The string cannot contain ?> anywhere

Example 17: XMLPI() (valid)

Listing 46. Example 17 SQL code
SELECT  XMLCONCAT(
          XMLELEMENT(NAME "Dummy1", 1200),
          XMLPI(NAME "ValidName", 'some string'),
          XMLELEMENT(NAME "Dummy2", 'hello')
        )
FROM
        sysibm.sysdummy1
;
Listing 47. Example 17 sample output
<Dummy1>1200</Dummy1>
<?ValidName some string?>
<Dummy2>hello</Dummy2>

Example 18: XMLPI() with invalid name (not valid)

Listing 48. Example 18 SQL code
SELECT  XMLCONCAT(
          XMLELEMENT(NAME "Dummy1", 1200),
          XMLPI(NAME "xmL", 'some string'),
          XMLELEMENT(NAME "Dummy2", 'hello')
        )
FROM
        sysibm.sysdummy1
;
Listing 49. Example 18 sample output
SQL20275N  The XML name "xmL" is not valid. Reason code = "5".  SQLSTATE=42634

Example 19: XMLPI() with invalid string (not valid)

Listing 50. Example 19 SQL code
SELECT  XMLCONCAT(
          XMLELEMENT(NAME "Dummy1", 1200),
          XMLPI(NAME "Valid", 'bad ?> string'),
          XMLELEMENT(NAME "Dummy2", 'hello')
        )
FROM
        sysibm.sysdummy1
;
Listing 51. Example 19 sample output
SQL20332N The XML processing instruction value "bad ?> string" is not valid.
SQLSTATE=2200T

DBA-related tasks related to XML: Indexes, constraints, and views

The DBA certification tutorial doesn't cover XML indexes, so DBAs may come to you for help indexing XML columns. Here are some tips:

  • See the "Common XML indexing issues" in the DB2 Information Center.

  • Indexes must be at least as broad or inclusive as the XPath in the query that you want to run faster. An XPath that uses the wild cards * or // is more broad than one that specifies the exact path to the desired node.

  • XML indexes may be UNIQUE, which is the only way to enforce a UNIQUE constraint on a field in XML. As with a relational UNIQUE index, uniqueness is enforced across the whole table, not just within one document.

  • DB2 does not support composite XML indexes, but you may be able to work around this by putting the desired elements under a common parent element, and creating an XML index on the parent, like so:

    <name>
        <family>Payne</family>
        <given>Donald</given>
        <middle>Edwin</middle>
    </name>

    An index on the XPath name will index the concatenation of the text in the child elements. Another alternative is the DB2 Net Search Extender; this has been enhanced to fully support XML columns, including proximity and wild card searches.

  • There is no direct way to create a referential integrity constraint on XML data; you may work around this by shredding selected fields to relational columns that have the desired constraint.

  • You may create an XML view on relational data with SQL/XML functions like XMLDOCUMENT(), XMLELEMENT(), etc. You may also create a relational view on XML data, with XMLTABLE(). But it is usually more efficient to query XML data through XQuery or SQL XMLQUERY(), and relational data through SQL SELECT. If you query an XML view on relational data, the query compiler and optimizer cannot take advantage of relational indexes, or vice versa.


DB2 9 enhancements for developers

DB2 9 includes enhancements for XML data:

  • Database Add-Ins for Microsoft Visual Studio® 2005
  • DB2 Developer Workbench (DWB)
  • SQL Assist

The Database Add-Ins for Microsoft Visual Studio 2005 include features for XML:

  • Use an XML data type for columns and procedures
  • Provide an XML index for an XML column
  • Have the capability to visualize XML data
  • Update, import, and export XML data
  • Validate an XML database against a registered XML Schema
  • Register and unregister XML schemas
  • Generate sample data based on an XML Schema
  • Create and register annotated XML schemas
  • Execute and visualize XQuery and SQL/XML scripts
  • Apply XSLT to XML data for customized visualization

See the Resources section for a link to a developerWorks tutorial on DWB and for the Information Online article "What's new for V9.1: IBM Database Add-Ins for Microsoft Visual Studio 2005 enhancements" for details.


DB2 Command Line Processor (CLP) behavior and options

If you run the DB2 Command Line Processor (CLP) to learn about pureXML, be aware of the following CLP behavior and CLP options that change that behavior. This is not likely to be on the exam, but is worth mentioning in this tutorial, since this behavior might mislead you about how DB2 processes XML.

  • The CLP returns XML from a SELECT statement as a 4 KB character column. Short XML output is padded with trailing spaces. The heading and result set can scroll off the screen.

  • The CLP db2 cannot always serialize XML data that it successfully inserts:

    C:\>chcp
    Active code page: 1252

    &#x3a3; is the XML numeric reference for the Greek letter capital sigma. The CLP returns an error when you select this XML document, whether your code page is 1252 or 437, even though sigma is a character in code page 437.

    C:\>db2 -v "insert into catalog values ('Ref3', '<sigma>&#x3a3;</sigma>')"
    insert into catalog values ('Ref3', '<sigma>&#x3a3;</sigma>')
    DB20000I  The SQL command completed successfully.
    
    
    C:\>db2 -x "select catlog from catalog where name = 'Ref3'"
    SQL0969N  There is no message text corresponding to SQL error "-20412" in the
    message file on this workstation.  The error was returned from module
    "SQLABOLB" with original tokens "".  SQLSTATE=2200W
    
    C:\>db2 ? 2200W
    SQLSTATE 2200W: An XML value contained data that could not be serialized.
    
    C:\>chcp 437
    Active code page: 437
    
    C:\>db2 -x "select catlog from catalog where name = 'Ref3'"
    SQL0969N  There is no message text corresponding to SQL error "-20412" in the
    message file on this workstation.  The error was returned from module
    "SQLABOLB" with original tokens "".  SQLSTATE=2200W

    A JDBC application can read this XML. Serializing the XML AS VARCHAR avoids the CLP error, but the sigma is lost, replaced by the DOS end-of-file character, Control-Z, which appears as a box.

    C:\>db2 -x "select xmlserialize(catlog as varchar(90))
    from catalog where name = 'Ref3'"
    </sigma>[box]</sigma>
  • The CLP normally strips any whitespace that contains a new line in incoming string literals, replacing it with a single space. This behavior may mask the processing of whitespace in XML parsing and validation (see the earlier section of this tutorial on that topic for more information). It is not XML specific, so the CLP does this replacement not only to XML boundary whitespace, but to any string of whitespace that contains at least one new line. The CLP does not change whitespace that does not include a new line. Two CLP options, -n and -q, change this behavior.

    To demonstrate, here is an SQL script that inserts an XML document. The text in the XML documents the whitespace following it; for example, following 1 space + NL + 1 space:, there is one space, a new line and one space. In the following listings of code and output, a dollar sign is shown at the end of each line to show where the line ends and where there is whitespace (as in the vi command :%l). A line that must be broken across two lines for readability ends with a backslash (\) and continues on the next line at the left margin. The long string of spaces that pads the XML result to 4 KB is omitted.

    delete from catalog where name = 'foo';$
    insert into catalog (name, catlog) values ($
    'foo',$
    xmlparse(document '<a>1 space + NL + 1 space: $
     3 spaces:   NL:$
    </a>' preserve whitespace)$
    );$
    select catlog from catalog where name = 'foo';$

    If you run this without options -q or -n, the behavior described occurs:

    C:\>db2 -xtvf testWhitespace.sql > testws-no_opts.txt
    delete from catalog where name = 'foo'$
    DB20000I  The SQL command completed successfully.$
    $
    insert into catalog (name, catlog) values ( 'foo', xmlparse(document \
    '<a>1 space + NL + 1 space: 3 spaces:   NL: </a>' preserve whitespace) )$
    DB20000I  The SQL command completed successfully.$
    $
    select catlog from catalog where name = 'foo'$
    <a>1 space + NL + 1 space: 3 spaces:   NL: </a>$
    • One space plus new line plus one space was changed to one space.
    • Three spaces without any new lines was unchanged.
    • One new line was changed to one space.

    If you run the same script with -n, CLP removes new lines but leaves unchanged any spaces adjacent to the new lines:

    C:\>db2 -xntvf testWhitespace.sql > testws-n.txt
    
    delete from catalog where name = 'foo'$
    DB20000I  The SQL command completed successfully.$
    $
    insert into catalog (name, catlog) values ( 'foo', xmlparse(document \
    '<a>1 space + NL + 1 space:  3 spaces:   NL:</a>' preserve whitespace) )$
    DB20000I  The SQL command completed successfully.$
    $
    select catlog from catalog where name = 'foo'$
    <a>1 space + NL + 1 space:  3 spaces:   NL:</a>$
    $
    • One space plus new line plus one space was changed to two spaces.
    • Three spaces without any new line was unchanged, as before.
    • One new line was removed.

    With the -q option (which overrides -n), CLP preserves all whitespace exactly as is:

    C:\>db2 -xqtvf testWhitespace.sql > testws-q.txt
    delete from catalog where name = 'foo'$
    DB20000I  The SQL command completed successfully.$
    $
    insert into catalog (name, catlog) values ( 'foo', xmlparse(document \
    '<a>1 space + NL + 1 space: $
     3 spaces:   NL:$
    </a>' preserve whitespace) )$
    DB20000I  The SQL command completed successfully.$
    $
    select catlog from catalog where name = 'foo'$
    <a>1 space + NL + 1 space: $
     3 spaces:   NL:$
    </a>$
    $

    This is important if you insert "pretty" XML containing new lines, or insert XML with explicit XMLPARSE() with the argument PRESERVE WHITESPACE to preserve boundary whitespace (see the earlier section of this tutorial on that topic for more information).

    This behavior affects statements in the sample script that you can find at SQLLIB\samples\xml\db2sample_xml.db2 in your DB2 install. This script inserts into the CUSTOMER and PRODUCT tables with the clause XMLVALIDATE( XMLPARSE( DOCUMENT '...' PRESERVE WHITESPACE ) ). If you run db2sample_XML.db2 without -q, then the CLP strips most whitespace before it gets to the XML parser.

  • The CLP option -v echoes the command before execution. This is useful to confirm the exact SQL that DB2 executes — for instance, to determine if you escape characters like quote marks on the command line, or use the options -n or -q.

  • The CLP option -i "pretty-prints" the result of an XQuery, with indentation as follows:

    • Each tag and each text node stands alone on its own line
    • Each starting tag lines up vertically with its corresponding ending tag
    • Each element or text node is indented from its parent element

    To accomplish this, CLP adds not only boundary whitespace but whitespace to non-empty elements. This option has no effect on an SQL SELECT: in an ordinary SELECT, CLP returns data in columns with a definite length and must therefore return XML as a regular character string.

  • -x omits column headings, the line of dashes, and the message number record(s) selected, reducing the volume of output. CLP still pads XML with spaces to 4 KB on SELECT.

  • -d retrieves and displays XML declarations in the result. On Windows, the encoding attribute is windows-1252, regardless of your current code page (set with the DOS command CHCP). If you select the XML with XMLSERIALIZE(column INCLUDING XMLDECLARATION), this option has no effect: DB2 generates the declaration (with encoding="UTF-8"), not CLP. The option -d, like the clause INCLUDING XMLDECLARATION, works regardless of whether the original inserted XML had a declaration: the declaration is generated, not retrieved from storage.

  • -td<terminator> specifies an SQL statement terminator character other than the semicolon. This is necessary when running an XQuery containing a semicolon. Depending on the content of your queries, possible terminators include pound sign (#), grave accent (`), and caret (^).

  • You may reduce the size of data that CLP returns from a SELECT by adding an explicit XMLSERIALIZE() or CAST() of the column AS a CHAR or VARCHAR type. Beware that this can change the way in which the document is encoded.


Conclusion

With this tutorial, you've learned how to manipulate XML data in DB2. You should now know:

  • How DB2 handles boundary whitespace on input
  • How DB2 determines the encoding of an XML document on input and output
  • How DB2 can validate XML on input
  • How to execute an XQuery or XPath, and identify its results
  • How to decompose, or shred, XML input to relational rows
  • How to publish relational data as XML
  • How to publish XML data as relational

Good luck with the exam!


Download

DescriptionNameSize
Sample JDBC, SQLdb2CertExamV9AppDevXmlSample.zip72KB

Resources

Learn

Get products and technologies

  • DB2 Express-C: Download a no-charge version of DB2 Express Edition for the community that offers the same core data features as DB2 Express Edition and provides a solid base to build and deploy applications.
  • Application Development: Microsoft Visual C++ Express: Download a free C compiler. To write win32 (as opposed to .NET) applications, you need to download and install Visual C++ 2005 Express Edition with the Microsoft Platform SDK.
  • Application Development: Cygwin: Download a free Linux-like environment for Windows including a shell.
  • Application Development: Vim ("VI Improved"): Download a charity-ware vi-like editor. VIM colors and auto-indents code, including XML, which can make it easier read, write and spot mistakes.
  • Download IBM trial software and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management, XML
ArticleID=202451
ArticleTitle=DB2 9 Application Developer Certification Exam 733 prep, Part 3: XML data manipulation
publish-date=03162007