Encoding scenarios for input of externally encoded XML data to a database

In general, when you use a character application data type, problems with code page conversion do not occur during input to a database.

The following examples demonstrate how external encoding affects data conversion and truncation during input of XML data to an XML column.

Only scenario 1 and scenario 2 apply to Java™ and .NET applications, because the application code page for Java and .NET applications is always Unicode.

Scenario 1

Encoding source          Value
Data encoding            UTF-8 Unicode input data, with or without an appropriate encoding declaration or BOM
Application data type    Character
Application code page    1208 (UTF-8)
Example input statements:
INSERT INTO T1 (XMLCOL) VALUES (?)
INSERT INTO T1 (XMLCOL) VALUES 
  (XMLPARSE(DOCUMENT CAST(? AS CLOB) PRESERVE WHITESPACE))
INSERT INTO T1 (XMLCOL) VALUES (XMLPARSE(DOCUMENT :HV))

Character conversion: None.

Data loss: None.

Truncation: None.

Scenario 2

Encoding source          Value
Data encoding            UTF-16 Unicode input data, with or without an appropriate encoding declaration or BOM
Application data type    Graphic
Application code page    Any SBCS code page or CCSID 1208
Example input statements:
INSERT INTO T1 (XMLCOL) VALUES (?)
INSERT INTO T1 (XMLCOL) VALUES 
  (XMLPARSE(DOCUMENT CAST(? AS DBCLOB) PRESERVE WHITESPACE))
INSERT INTO T1 (XMLCOL) VALUES (XMLPARSE(DOCUMENT :HV))

Character conversion: The Db2 database system converts the data from UTF-16 to UTF-8 when it performs the XML parse for storage in the XML column.

Data loss: None.

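Truncation: Truncation can occur during conversion from UTF-16 to UTF-8 because the converted data can be longer than the input. For example, a character that occupies 2 bytes in UTF-16 can occupy 3 bytes in UTF-8.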
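
The expansion behind the truncation risk in scenario 2 can be illustrated outside the database. The following is a minimal sketch in plain Python, not Db2 code. The helper name encoded_sizes and the sample strings are hypothetical, and the sketch only compares byte counts of the same text in the two encodings.

```python
# Illustration only: plain Python string encoding, not Db2 code.
def encoded_sizes(text: str) -> tuple:
    """Return (UTF-16 byte count, UTF-8 byte count) for text, without a BOM."""
    utf16_bytes = len(text.encode("utf-16-le"))
    utf8_bytes = len(text.encode("utf-8"))
    return utf16_bytes, utf8_bytes

# Hypothetical sample values.
print(encoded_sizes("abc"))     # (6, 3): ASCII shrinks from 2 bytes to 1 byte per character
print(encoded_sizes("日本語"))  # (6, 9): these characters grow from 2 bytes to 3 bytes each
```

Text made up mostly of characters in the range U+0800 to U+FFFF is therefore the case in which the UTF-8 form is longer than the UTF-16 input.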

Scenario 3

Encoding source          Value
Data encoding            ISO-8859-1 input data, with or without an appropriate encoding declaration
Application data type    Character
Application code page    819
Example input statements:
INSERT INTO T1 (XMLCOL) VALUES (?)
INSERT INTO T1 (XMLCOL) VALUES 
  (XMLPARSE(DOCUMENT CAST(? AS CLOB) PRESERVE WHITESPACE))
INSERT INTO T1 (XMLCOL) VALUES (XMLPARSE(DOCUMENT :HV))

Character conversion: The Db2 database system converts the data from CCSID 819 to UTF-8 when it performs the XML parse for storage in the XML column.

Data loss: None.

Truncation: None.
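
The absence of data loss in scenario 3 follows from the fact that every ISO-8859-1 byte value corresponds to a Unicode code point. The sketch below checks that mapping in plain Python. It assumes that Python's "iso-8859-1" codec is an acceptable stand-in for CCSID 819, it does not involve Db2 or XML parsing, and the function names are hypothetical.

```python
# Illustration only: Python's "iso-8859-1" codec stands in for CCSID 819.
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")

def utf8_to_latin1(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as ISO-8859-1 (the reverse direction)."""
    return data.decode("utf-8").encode("iso-8859-1")

# Convert every possible single-byte value to UTF-8 and back again.
every_byte = bytes(range(256))
round_trip = utf8_to_latin1(latin1_to_utf8(every_byte))
print(round_trip == every_byte)  # True: no byte value is lost or substituted
```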

Scenario 4

Encoding source          Value
Data encoding            Shift_JIS input data, with or without an appropriate encoding declaration
Application data type    Graphic
Application code page    943
Example input statements:
INSERT INTO T1 VALUES (?)
INSERT INTO T1 VALUES 
  (XMLPARSE(DOCUMENT CAST(? AS DBCLOB)))
INSERT INTO T1 VALUES (XMLPARSE(DOCUMENT :HV))

Character conversion: The Db2 database system converts the data from CCSID 943 to UTF-8 when it performs the XML parse for storage in the XML column.

Data loss: None.

Truncation: None.
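
The conversion in scenario 4 can be approximated in the same way for double-byte text. This sketch assumes that Python's "shift_jis" codec is close enough to CCSID 943 for the sample characters used here. The two are not identical mappings, so treat the result as an illustration rather than a statement about Db2 behavior. The function name and sample text are hypothetical.

```python
# Illustration only: Python's "shift_jis" codec approximates CCSID 943 here.
def shift_jis_to_utf8(data: bytes) -> bytes:
    """Re-encode Shift_JIS bytes as UTF-8."""
    return data.decode("shift_jis").encode("utf-8")

sample = "日本語"                      # hypothetical sample text
sjis_bytes = sample.encode("shift_jis")
utf8_bytes = shift_jis_to_utf8(sjis_bytes)

# Decoding the UTF-8 result yields the original characters.
print(utf8_bytes.decode("utf-8") == sample)  # True
```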