Encoding considerations for XML data in JDBC, SQLJ, and .NET applications
Typically, there are fewer XML encoding considerations for Java™ applications than for CLI or embedded SQL applications. Although the encoding considerations for internally encoded XML data are the same for all applications, the situation is simplified for externally encoded data in Java applications because the application code page is always Unicode.
General recommendations for input of XML data in Java applications
- If the input data is in a file, read the data in as a binary stream (setBinaryStream) so that the database manager processes it as internally encoded data.
- If the input data is in a Java application variable, your choice of application variable type determines whether the Db2® database manager uses any internal encoding. If you input the data as a character type (for example, setString), the database manager converts the data from UTF-16 (the application code page) to UTF-8 before storing it.
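The two input recommendations above can be sketched with standard JDBC calls. This is a minimal illustration, not a definitive implementation: the table MYDOCS, its XML column DOC, and the class and method names are hypothetical, and the caller is assumed to supply an open connection.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class XmlInputSketch {
    // Hypothetical table and XML column; replace with your own schema.
    static final String INSERT_SQL = "INSERT INTO MYDOCS (DOC) VALUES (?)";

    /**
     * File input: bind the file as a binary stream so that the bytes reach
     * the database manager unchanged and are treated as internally encoded.
     */
    static int insertFromFile(Connection conn, Path xmlFile)
            throws SQLException, IOException {
        try (InputStream in = Files.newInputStream(xmlFile);
             PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setBinaryStream(1, in);
            return ps.executeUpdate();
        }
    }

    /**
     * Variable input as a character type: the value is sent as UTF-16 text,
     * and the database manager converts it to UTF-8 before storing it.
     */
    static int insertFromString(Connection conn, String xml)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setString(1, xml);
            return ps.executeUpdate();
        }
    }
}
```

Binding the file as a stream avoids decoding it into a String first, so any encoding declaration inside the document stays consistent with the bytes that are stored.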
General recommendations for output of XML data in Java applications
- If you output XML data to a file as non-binary data, you should add XML internal encoding to the output data.
The encoding for the file system might not be Unicode, so string data can undergo conversion when it is stored in the file. If you write data to a file as binary data, conversion does not occur.
For Java applications, the database server does not add an explicit declaration for an implicit XML serialize operation. If you cast the output data as the com.ibm.db2.jcc.DB2Xml type, and invoke one of the getDB2Xmlxxx methods, the JDBC driver adds an encoding declaration, as shown in the following table.

| getDB2Xmlxxx | Encoding in declaration |
| --- | --- |
| getDB2XmlString | ISO-10646-UCS-2 |
| getDB2XmlBytes(String targetEncoding) | Encoding specified by targetEncoding |
| getDB2XmlAsciiStream | US-ASCII |
| getDB2XmlCharacterStream | ISO-10646-UCS-2 |
| getDB2XmlBinaryStream(String targetEncoding) | Encoding specified by targetEncoding |

For an explicit XMLSERIALIZE function with INCLUDING XMLDECLARATION, the database server adds encoding, and the JDBC driver does not modify it. The explicit encoding that the database server adds is UTF-8 encoding. Depending on how the value is retrieved by the application, the actual encoding of the data might not match the explicit internal encoding.
- If the application sends the output data to an XML parser, you should retrieve the data in a binary application variable, with UTF-8, UCS-2, or UTF-16 encoding.
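The output recommendations above can be illustrated in plain Java, without a database connection. The following is a sketch under stated assumptions: the class and method names are hypothetical, the XML text is assumed not to carry a declaration already, and in a JDBC program the byte array passed to the parser would come from a binary retrieval such as ResultSet.getBytes or ResultSet.getBinaryStream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

final class XmlOutputSketch {
    /** Prefixes an XML declaration that names the charset the text will be written in. */
    static String withDeclaration(String xmlBody, Charset fileCharset) {
        return "<?xml version=\"1.0\" encoding=\"" + fileCharset.name() + "\"?>" + xmlBody;
    }

    /**
     * Non-binary output: the characters are converted to fileCharset on the
     * way to disk, so the declaration is added to record that encoding.
     */
    static void writeAsText(Path target, String xmlBody, Charset fileCharset)
            throws IOException {
        try (Writer w = Files.newBufferedWriter(target, fileCharset)) {
            w.write(withDeclaration(xmlBody, fileCharset));
        }
    }

    /** Binary output: the bytes are stored exactly as given; no conversion occurs. */
    static void writeAsBinary(Path target, byte[] xmlBytes) throws IOException {
        Files.write(target, xmlBytes);
    }

    /**
     * Parser input: pass the raw bytes so that the parser determines the
     * encoding itself. Returns the text content of the root element.
     */
    static String rootText(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML bytes", e);
        }
    }
}
```

A file written with writeAsText can be read back as bytes and parsed with rootText, because the declaration names the encoding that was actually used for the file.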