Encoding scenarios for input of internally encoded XML data to a database

The following examples demonstrate how internal encoding affects data conversion and truncation during input of XML data to an XML column.

In general, use of a binary application data type minimizes code page conversion problems during input to a database.
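The following Python sketch illustrates one kind of problem that a binary data type avoids. It is not DB2 code: Python's standard xml.etree.ElementTree parser stands in for the XML parse, and the document content and variable names are hypothetical. The sketch assumes that a character-typed path transcodes the bytes to another code page without updating the internal encoding declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical ISO-8859-1 document whose internal encoding declaration
# matches its bytes.
original = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><name>José</name>'
).encode("iso-8859-1")

# Binary path: the bytes reach the parser untouched, so the declaration
# still describes them correctly.
binary_text = ET.fromstring(original).text

# Simulated character path (an assumption for illustration): the bytes are
# transcoded to UTF-8, but the declaration still claims ISO-8859-1.
transcoded = original.decode("iso-8859-1").encode("utf-8")
mismatched_text = ET.fromstring(transcoded).text

print(binary_text)      # José
print(mismatched_text)  # JosÃ©
```

In the binary case the parser recovers the intended text. In the simulated character case the declaration no longer matches the bytes, and the accented character is garbled.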

Scenario 1

Encoding source               Value
Data encoding                 UTF-8 Unicode input data, with or without a UTF-8 BOM or XML encoding declaration
Host variable data type       Binary
Host variable declared CCSID  Not applicable

Example input statements:

INSERT INTO T1 (XMLCOL) VALUES (?)
INSERT INTO T1 (XMLCOL) VALUES 
  (XMLPARSE(DOCUMENT CAST(? AS BLOB) PRESERVE WHITESPACE))

Character conversion: None.

Data loss: None.

Truncation: None.
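The two forms of UTF-8 input named in this scenario differ only in a three-byte prefix. The following Python sketch, which uses hypothetical document content and a hypothetical helper name (has_utf8_bom), shows how the BOM can be detected on the client side and that both forms carry the same text. It illustrates the byte layout only and does not reproduce DB2 processing.

```python
import codecs

# Hypothetical document text.
doc = '<?xml version="1.0" encoding="UTF-8"?><name>José</name>'

without_bom = doc.encode("utf-8")
with_bom = codecs.BOM_UTF8 + without_bom  # BOM is the bytes EF BB BF


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM."""
    return data.startswith(codecs.BOM_UTF8)


# The "utf-8-sig" codec drops a leading BOM if one is present, so both
# inputs decode to the same text.
text_a = without_bom.decode("utf-8-sig")
text_b = with_bom.decode("utf-8-sig")

print(has_utf8_bom(without_bom), has_utf8_bom(with_bom))
print(text_a == text_b)
```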

Scenario 2

Encoding source               Value
Data encoding                 UTF-16 Unicode input data containing a UTF-16 BOM or XML encoding declaration
Host variable data type       Binary
Host variable declared CCSID  Not applicable

Example input statements:

INSERT INTO T1 (XMLCOL) VALUES (?)
INSERT INTO T1 (XMLCOL) VALUES 
  (XMLPARSE(DOCUMENT CAST(? AS BLOB) PRESERVE WHITESPACE))

Character conversion: The DB2® database server converts the data from UTF-16 to UTF-8 when it performs the XML parse for storage in a UTF-8 XML column.

Data loss: None.

Truncation: None.
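Because UTF-16 and UTF-8 both encode the full Unicode character set, a conversion between them loses no characters. The following Python sketch illustrates that property with Python's own codecs, not with the DB2 conversion routines. The sample text is hypothetical and relies on a BOM rather than an encoding declaration.

```python
import codecs

# Hypothetical content. U+1F600 lies outside the Basic Multilingual Plane,
# so UTF-16 represents it as a surrogate pair.
text = "<note>caf\u00e9 \U0001F600</note>"

# Python's "utf-16" codec prepends a BOM in the platform's byte order.
utf16_input = text.encode("utf-16")
starts_with_bom = utf16_input[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

# Decode using the BOM to select the byte order, then re-encode as UTF-8.
utf8_form = utf16_input.decode("utf-16").encode("utf-8")

# Decoding the UTF-8 form returns the original text unchanged.
round_trip = utf8_form.decode("utf-8")

print(starts_with_bom)
print(round_trip == text)
```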

Scenario 3

Encoding source               Value
Data encoding                 ISO-8859-1 input data containing an XML encoding declaration
Host variable data type       Binary
Host variable declared CCSID  Not applicable

Example input statements:

INSERT INTO T1 (XMLCOL) VALUES (?)
INSERT INTO T1 (XMLCOL) VALUES 
  (XMLPARSE(DOCUMENT CAST(? AS BLOB) PRESERVE WHITESPACE))

Character conversion: The DB2 database system converts the data from CCSID 819 to UTF-8 when it performs the XML parse for storage in a UTF-8 XML column.

Data loss: None.

Truncation: None.
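Every ISO-8859-1 byte value corresponds to the Unicode code point with the same number, so each byte has a UTF-8 representation. The following Python sketch checks this for all 256 byte values. It uses Python's iso-8859-1 codec as a stand-in for the CCSID 819 conversion table, which is an assumption made for illustration.

```python
# All 256 possible single-byte values.
all_bytes = bytes(range(256))

# ISO-8859-1 maps byte N to Unicode code point N.
as_text = all_bytes.decode("iso-8859-1")
code_points = [ord(ch) for ch in as_text]

# Re-encode as UTF-8. Bytes 0x00-0x7F stay one byte each;
# bytes 0x80-0xFF become two bytes each.
as_utf8 = as_text.encode("utf-8")

# Converting back yields the original bytes, so nothing was lost.
restored = as_utf8.decode("utf-8").encode("iso-8859-1")

print(len(all_bytes), len(as_utf8))  # 256 384
print(restored == all_bytes)         # True
```

The UTF-8 form can be longer than the ISO-8859-1 input even though no character is lost.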

Scenario 4

Encoding source               Value
Data encoding                 Shift_JIS input data containing an XML encoding declaration
Host variable data type       Binary
Host variable declared CCSID  Not applicable

Example input statements:

INSERT INTO T1 (XMLCOL) VALUES (?)
INSERT INTO T1 (XMLCOL) VALUES 
  (XMLPARSE(DOCUMENT CAST(? AS BLOB) PRESERVE WHITESPACE))

Character conversion: The DB2 database system converts the data from CCSID 943 to UTF-8 when it performs the XML parse for storage in a UTF-8 XML column.

Data loss: None.

Truncation: None.
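The following Python sketch illustrates a Shift_JIS to UTF-8 conversion for a small hypothetical fragment. Python's shift_jis codec is used as an approximation only: it is not the CCSID 943 conversion table that DB2 uses, and the two can differ for some code points. The encoding declaration is omitted from the sample for brevity.

```python
# Hypothetical fragment containing three kanji.
text = "<name>日本語</name>"

# Shift_JIS stores each of these kanji in two bytes.
sjis_input = text.encode("shift_jis")

# Decode with the same codec, then re-encode as UTF-8, where each of
# these kanji takes three bytes.
decoded = sjis_input.decode("shift_jis")
utf8_form = decoded.encode("utf-8")

# The ASCII markup has the same length in both encodings, so the size
# difference comes from the kanji alone.
growth = len(utf8_form) - len(sjis_input)

print(decoded == text)
print(growth)
```

For these characters the round trip is exact, and the UTF-8 form is one byte longer per kanji than the Shift_JIS input.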
