XML is a unique data type in SQL. Understanding its rules is important in getting it right in applications. Luckily, many of the rules are straight-forward. I will cover the following three aspects here:
- XML comparison rules;
- XML assignment rules;
- XML data encoding.
XML comparison rules
The SQL XML type cannot be compared with any other type, including the XML type itself, using SQL comparison operators. XML can only be tested using one of the following two predicates:
- xml-expr IS NULL, or xml-expr IS NOT NULL
- theXMLEXISTS predicate, you can use XPath operators within XMLEXISTS.
If you need to compare an XML atomic value extracted from an XML document using XPath with an SQL value, you need to use XMLCAST to cast to an SQL type. XMLCAST can be used to cast a value from XML to SQL, or from SQL to XML. Casting from XML to SQL, such as XMLCAST(XMLQUERY('/purchaseOrder/@PONum' PASSING X) as INTEGER), is its main usage, as casting from SQL to XML is done by DB2 implicitly in most cases, such as in parameter passing of XMLQUERY, XMLEXISTS, and XMLTABLE.
XML assignment rules
In DB2 9, XML is not compatible with any other type for assignment. For convenience, an XML column can be assigned a string by implicit XMLPARSE. In addition, the assignment rules for bind-in and bind-out of XML data are quite flexible. XML assignment rules should be considered with two directions: storage assignment and retrieval assignment, similar to string assignment. Here is a summary of the XML assignment rules:
- XML host variables are supported in many host languages (seeapplication programming language support for XML for details). The XML host variables can only be assigned to XML columns and only accept XML values from SQL. That is, XML host variable usage is very restrictive.
- Any host variable of string types (character or binary: CHAR, VARCHAR, GRAPHIC, VARBINARY, BLOB, etc.) can be assigned to XML columns, by implicit XMLPARSE. For example, INSERT INTO MYTABLE VALUES (1, :VARCHARhv), where the second column is of XML type.
- Any host variable of string types (character or binary, except LOB locators) can accept XML values, by implicit XMLSERIALIZE. For example, SELECT XMLELEMENT(NAME "test", 'a') INTO :VARBINhv FROM SYSIBM.SYSDUMMYU;
In other words, you can hold XML values using non-XML string host variables, including file reference variables. Therefore, you can get XML data in and out without using XML host variables in languages like COBOL, PL/I, C/C++, etc. For example, you can simply fetch a SELECT result containing an XML value into a VARCHAR host variable. You can use regular Java interface for XML data also.
One thing that XML is different from other string types is that truncation is serious so it will result in an error (SQLCODE -433), instead of a warning. To avoid truncation, use a large enough buffer, or use FETCH WITH CONTINUE.
You would need the explicit conversion, i.e. XMLPARSE(), from string to XML in cases where the XML data held in a non-XML host var from an application is not directly into an XML column, or the XML data are in a non-XML column. Here are some simple examples for illustration:
INSERT INTO MYTABLE (C1, C2)SELECT XT.*FROM XMLTABLE('row-expr' PASSING XMLPARSE(DOCUMENT CAST(? AS BLOB) ) COLUMNS C1 VARCHAR(10) PATH 'column-expr1', C2 XML PATH 'column-expr2') XT;
UPDATE MYTABLE SET XMLC = XMLPARSE(DOCUMENT VC1);
-- this one shows the value of XML host varINSERT INTO MYTABLE (C1, C2)SELECT XT.*FROM XMLTABLE('row-expr' PASSING :xmlhv COLUMNS C1 VARCHAR(10) PATH 'column-expr1', C2 XML PATH 'column-expr2') XT;
On the other hand, you would need conversion, explicit XMLSERIALIZE(), from XML to string (a LOB) in cases where an XML value is not to bind out to a host var. For example, to insert XML data into a BLOB column:
INSERT INTO MYTABLE2SELECT C1, XMLSERIALIZE(XMLC as BLOB(512K))FROM MYTABLE;
XML data encoding
There are two ways for DB2 to recognize XML data encoding:
- External encoding: if you hold XML data in a character string host var or column, the CCSID associated with the host var or column determines encoding of XML data. DB2 ignores encoding information contained in the XML data if it exists.
- Internal encoding: if you hold XML data in a binary string host var or column, there is no CCSID, and DB2 determines encoding based on the information (BOM - Byte Order Marker, or encoding declaration) contained within the XML data. By default, the encoding is UTF-8.
DB2 uses UTF-8 encoding to process XML data internally, and stores character data in UTF-8. It will convert other encodings into UTF-8 for processing. To avoid any encoding conversion overhead, try to use UTF-8 for XML data whenever possible. For more XML encoding details, see XML data encoding.
Again, if you have any questions, please post to the DB2 pureXML Forum, or contact us at db2zxml at us.ibm.com.
-Guogen (Gene) Zhang (GGZ)