Encoding scenarios for input of internally encoded XML data to a database

The following examples show how the internal encoding of XML data affects character conversion, data loss, and truncation when the data is input to an XML column.

In general, use of a binary application data type minimizes code page conversion problems during input to a database.
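The reason a binary data type is safer can be illustrated outside the database. The following Python sketch is not Db2 client code and calls no Db2 API; it only imitates what can happen when character data passes through an application code page that cannot represent every character. The function names `send_as_binary` and `send_as_character`, and the choice of ISO-8859-1 as the application code page, are assumptions made for illustration.

```python
# Hypothetical illustration only; no Db2 API is used here.
DOC = '<?xml version="1.0" encoding="UTF-8"?><a>caf\u00e9 \u65e5\u672c</a>'


def send_as_binary(xml_bytes: bytes) -> bytes:
    """Binary data is passed through byte for byte; the XML parser
    later reads the internal encoding from the document itself."""
    return xml_bytes


def send_as_character(xml_bytes: bytes, app_code_page: str) -> bytes:
    """Imitate character data converted to an application code page.
    Characters the code page cannot represent are substituted with '?'."""
    text = xml_bytes.decode("utf-8")
    return text.encode(app_code_page, errors="replace")


original = DOC.encode("utf-8")
binary_result = send_as_binary(original)
char_result = send_as_character(original, "iso-8859-1")

# The binary path preserves every byte.
print("binary unchanged:", binary_result == original)
# The character path no longer decodes to the original document.
print("character path altered:", char_result.decode("iso-8859-1") != DOC)
```

In this sketch the two Japanese characters are replaced by substitution characters on the character path, while the binary path leaves the document untouched.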

Scenario 1

Encoding source        Value
Data encoding          UTF-8 Unicode input data, with or without a UTF-8 BOM or XML encoding declaration
Application data type  Binary
Application code page  Not applicable
Example input statements:
INSERT INTO T1 (XMLCOL) VALUES (?)
INSERT INTO T1 (XMLCOL) VALUES 
  (XMLPARSE(DOCUMENT CAST(? AS BLOB) PRESERVE WHITESPACE))

Character conversion: None.

Data loss: None.

Truncation: None.

Scenario 2

Encoding source        Value
Data encoding          UTF-16 Unicode input data containing a UTF-16 BOM or XML encoding declaration
Application data type  Binary
Application code page  Not applicable
Example input statements:
INSERT INTO T1 (XMLCOL) VALUES (?)
INSERT INTO T1 (XMLCOL) VALUES 
  (XMLPARSE(DOCUMENT CAST(? AS BLOB) PRESERVE WHITESPACE))

Character conversion: The Db2® database server converts the data from UTF-16 to UTF-8 when it performs the XML parse for storage in the XML column.

Data loss: None.

Truncation: None.
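The UTF-16 to UTF-8 conversion in Scenario 2 is lossless because both encodings cover the full Unicode range. The following Python sketch imitates that conversion on the client side; it does not show how the database server implements it. The helper name `utf16_to_utf8` and the sample document are assumptions, and the sketch handles only input that starts with a UTF-16 BOM.

```python
import codecs


def utf16_to_utf8(data: bytes) -> bytes:
    """Convert BOM-prefixed UTF-16 bytes to UTF-8 without a BOM."""
    if not data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        raise ValueError("expected a UTF-16 BOM")
    # The "utf-16" codec uses the BOM to pick the byte order and drops it.
    return data.decode("utf-16").encode("utf-8")


# Sample with a Latin-1 letter, a CJK character, and a supplementary
# character that needs a surrogate pair in UTF-16.
doc = "<a>\u00e9\u65e5\U0001F600</a>"
utf16 = codecs.BOM_UTF16_LE + doc.encode("utf-16-le")
utf8 = utf16_to_utf8(utf16)

# Every character survives the conversion.
print("lossless:", utf8.decode("utf-8") == doc)
```

The byte length of the document changes during conversion even though no characters are lost.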

Scenario 3

Encoding source        Value
Data encoding          ISO-8859-1 input data containing an XML encoding declaration
Application data type  Binary
Application code page  Not applicable
Example input statements:
INSERT INTO T1 (XMLCOL) VALUES (?)
INSERT INTO T1 (XMLCOL) VALUES 
  (XMLPARSE(DOCUMENT CAST(? AS BLOB) PRESERVE WHITESPACE))

Character conversion: The Db2 database system converts the data from CCSID 819 to UTF-8 when it performs the XML parse for storage in the XML column.

Data loss: None.

Truncation: None.

Scenario 4

Encoding source        Value
Data encoding          Shift_JIS input data containing an XML encoding declaration
Application data type  Binary
Application code page  Not applicable
Example input statements:
INSERT INTO T1 (XMLCOL) VALUES (?)
INSERT INTO T1 (XMLCOL) VALUES 
  (XMLPARSE(DOCUMENT CAST(? AS BLOB) PRESERVE WHITESPACE))

Character conversion: The Db2 database system converts the data from CCSID 943 to UTF-8 when it performs the XML parse for storage in the XML column.

Data loss: None.

Truncation: None.
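Scenarios 3 and 4 report no data loss because every character in the source code page has a Unicode equivalent. The following Python sketch imitates both conversions and checks that the result converts back to the original bytes. It is an approximation under stated assumptions: Python's `iso-8859-1` and `shift_jis` codecs stand in for CCSID 819 and CCSID 943, and the mapping tables are not guaranteed to be identical to the ones Db2 uses. The helper name `to_utf8` and the sample strings are hypothetical.

```python
def to_utf8(data: bytes, source_codec: str) -> bytes:
    """Decode bytes from the source code page and re-encode them as UTF-8."""
    return data.decode(source_codec).encode("utf-8")


# Assumption: these Python codecs approximate CCSID 819 and CCSID 943.
samples = {
    "iso-8859-1": "<a>caf\u00e9</a>",
    "shift_jis": "<a>\u65e5\u672c\u8a9e</a>",
}

results = {}
for codec, text in samples.items():
    src = text.encode(codec)
    utf8 = to_utf8(src, codec)
    # Converting back reproduces the source bytes, so nothing was lost.
    round_trip_ok = utf8.decode("utf-8").encode(codec) == src
    results[codec] = (len(src), len(utf8), round_trip_ok)
    print(codec, "source bytes:", len(src), "UTF-8 bytes:", len(utf8),
          "round trip ok:", round_trip_ok)
```

In both cases the UTF-8 form is longer than the source, which is why length-limited character targets can be affected by truncation while the XML column in these scenarios is not.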