DB2 10.5 for Linux, UNIX, and Windows

Encoding considerations for XML data in JDBC, SQLJ, and .NET applications

Typically, there are fewer XML encoding considerations for Java™ applications than for CLI or embedded SQL applications. Although the encoding considerations for internally encoded XML data are the same for all applications, the situation is simplified for externally encoded data in Java applications because the application code page is always Unicode.

General recommendations for input of XML data in Java applications

  • If the input data is in a file, read the data in as a binary stream (setBinaryStream) so that the database manager processes it as internally encoded data.
  • If the input data is in a Java application variable, your choice of application variable type determines whether the DB2® database manager uses any internal encoding. If you input the data as a character type (for example, setString), the database manager converts the data from UTF-16 (the application code page) to UTF-8 before storing it.
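The two input paths above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name XmlInputSketch and its method names are hypothetical, and the caller is assumed to supply a PreparedStatement whose parameter targets an XML column. Only the standard JDBC methods setBinaryStream and setString are used.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper class; the names are illustrative only.
class XmlInputSketch {

    // File input: bind the raw bytes as a binary stream so that the data is
    // treated as internally encoded (the encoding comes from the document itself).
    static void bindXmlFile(PreparedStatement ps, int index, Path xmlFile)
            throws SQLException, IOException {
        // Reading the whole file keeps the sketch simple and avoids closing
        // a file stream before the statement is executed.
        byte[] xmlBytes = Files.readAllBytes(xmlFile);
        ps.setBinaryStream(index, new ByteArrayInputStream(xmlBytes), xmlBytes.length);
    }

    // Application-variable input as a character type: the data is externally
    // encoded, and is converted from UTF-16 to UTF-8 before it is stored.
    static void bindXmlString(PreparedStatement ps, int index, String xml)
            throws SQLException {
        ps.setString(index, xml);
    }
}
```

A caller might prepare a statement such as INSERT INTO MYTABLE (XMLCOL) VALUES (?) (the table and column names are placeholders), call one of the helpers with index 1, and then execute the statement.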

General recommendations for output of XML data in Java applications

  • If you write XML data to a file as non-binary (character) data, add an internal XML encoding declaration to the output data.

    The encoding for the file system might not be Unicode, so string data can undergo conversion when it is stored in the file. If you write data to a file as binary data, conversion does not occur.

    For Java applications, the database server does not add an explicit declaration for an implicit XML serialize operation. If you cast the output data as the com.ibm.db2.jcc.DB2Xml type, and invoke one of the getDB2Xmlxxx methods, the JDBC driver adds an encoding declaration, as shown in the following table.

    getDB2Xmlxxx method                             Encoding in declaration
    getDB2XmlString                                 ISO-10646-UCS-2
    getDB2XmlBytes(String targetEncoding)           Encoding specified by targetEncoding
    getDB2XmlAsciiStream                            US-ASCII
    getDB2XmlCharacterStream                        ISO-10646-UCS-2
    getDB2XmlBinaryStream(String targetEncoding)    Encoding specified by targetEncoding

    For an explicit XMLSERIALIZE function with INCLUDING XMLDECLARATION, the database server adds encoding, and the JDBC driver does not modify it. The explicit encoding that the database server adds is UTF-8 encoding. Depending on how the value is retrieved by the application, the actual encoding of the data might not match the explicit internal encoding.

  • If the application sends the output data to an XML parser, you should retrieve the data in a binary application variable, with UTF-8, UCS-2, or UTF-16 encoding.
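The output recommendations above can be sketched as follows, assuming the XML value has already been retrieved from the database as a byte array (for example, through getDB2XmlBytes with a target encoding of UTF-8). The class name XmlOutputSketch and its method names are hypothetical. The sketch uses only standard Java APIs: it writes the bytes to a file unchanged, so that no code page conversion occurs, and passes the same bytes to a JAXP parser, which determines the encoding from the document itself.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper class; the names are illustrative only.
class XmlOutputSketch {

    // Write the retrieved bytes as binary data. The bytes on disk are
    // identical to the bytes retrieved, so the internal encoding
    // declaration (if any) still matches the actual encoding.
    static void writeAsBinary(Path target, byte[] xmlBytes) throws IOException {
        Files.write(target, xmlBytes);
    }

    // Hand the binary data to an XML parser. Parsing from a byte stream
    // lets the parser detect the encoding from the declaration or the
    // byte order mark instead of relying on a character conversion.
    static Document parseBinary(byte[] xmlBytes)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder().parse(new ByteArrayInputStream(xmlBytes));
    }
}
```

Keeping the value as bytes from retrieval through to the file or parser means that the platform default character set is never applied to the data.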