Encoding scenarios for input of externally encoded XML data to a database
In general, when you use a character application data type, problems with code page conversion do not occur during input to a database.
The following examples demonstrate how external encoding affects data conversion and truncation during input of XML data to an XML column.
Only scenario 1 and scenario 2 apply to Java™ and .NET applications, because the application code page for Java and .NET applications is always Unicode.
Scenario 1
Encoding source | Value |
---|---|
Data encoding | UTF-8 Unicode input data, with or without an appropriate encoding declaration or BOM |
Application data type | Character |
Application code page | 1208 (UTF-8) |

Example input statements:
INSERT INTO T1 (XMLCOL) VALUES (?)
INSERT INTO T1 (XMLCOL) VALUES
  (XMLPARSE(DOCUMENT CAST(? AS CLOB) PRESERVE WHITESPACE))
INSERT INTO T1 (XMLCOL) VALUES (XMLPARSE(DOCUMENT :HV))
Character conversion: None.
Data loss: None.
Truncation: None.
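
As a byte-level illustration of "with or without a BOM", the following minimal Python sketch builds the same UTF-8 document in both forms. It shows only that the two inputs differ by the three-byte UTF-8 BOM prefix. It does not model how the Db2 database system processes the data, and the sample document and the helper name `strip_utf8_bom` are hypothetical.

```python
import codecs

# Hypothetical sample document; any well-formed XML text would do.
xml_text = '<?xml version="1.0" encoding="UTF-8"?><doc>café</doc>'

plain = xml_text.encode("utf-8")        # UTF-8 bytes without a BOM
with_bom = codecs.BOM_UTF8 + plain      # same bytes prefixed with EF BB BF


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

Apart from the optional prefix, the bytes are identical, so both forms carry the same character content.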
Scenario 2
Encoding source | Value |
---|---|
Data encoding | UTF-16 Unicode input data, with or without an appropriate encoding declaration or BOM |
Application data type | Graphic |
Application code page | Any SBCS code page or CCSID 1208 |

Example input statements:
INSERT INTO T1 (XMLCOL) VALUES (?)
INSERT INTO T1 (XMLCOL) VALUES
  (XMLPARSE(DOCUMENT CAST(? AS DBCLOB) PRESERVE WHITESPACE))
INSERT INTO T1 (XMLCOL) VALUES (XMLPARSE(DOCUMENT :HV))
Character conversion: The Db2 database system converts the data from UTF-16 to UTF-8 when it performs the XML parse for storage in the XML column.
Data loss: None.
Truncation: Truncation can occur during conversion from UTF-16 to UTF-8, due to expansion.
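
The expansion arises because characters in the range U+0800 to U+FFFF occupy two bytes in UTF-16 but three bytes in UTF-8. The following Python sketch compares the encoded sizes of two sample strings. It illustrates only the encoding arithmetic, not Db2 internals, and the function name `utf16_and_utf8_sizes` is hypothetical.

```python
def utf16_and_utf8_sizes(text):
    """Return (UTF-16 byte length, UTF-8 byte length) of text, without BOMs."""
    return len(text.encode("utf-16-le")), len(text.encode("utf-8"))


# ASCII characters shrink: 2 bytes each in UTF-16, 1 byte each in UTF-8.
ascii_sizes = utf16_and_utf8_sizes("abc")      # (6, 3)

# These CJK characters expand: 2 bytes each in UTF-16, 3 bytes each in UTF-8.
cjk_sizes = utf16_and_utf8_sizes("日本語")      # (6, 9)
```

Whether a whole document grows or shrinks therefore depends on its mix of characters. Content dominated by characters above U+07FF can need up to 1.5 times as many bytes after conversion.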
Scenario 3
Encoding source | Value |
---|---|
Data encoding | ISO-8859-1 input data, with or without an appropriate encoding declaration |
Application data type | Character |
Application code page | 819 |

Example input statements:
INSERT INTO T1 (XMLCOL) VALUES (?)
INSERT INTO T1 (XMLCOL) VALUES
  (XMLPARSE(DOCUMENT CAST(? AS CLOB) PRESERVE WHITESPACE))
INSERT INTO T1 (XMLCOL) VALUES (XMLPARSE(DOCUMENT :HV))
Character conversion: The Db2 database system converts the data from CCSID 819 to UTF-8 when it performs the XML parse for storage in the XML column.
Data loss: None.
Truncation: None.
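
The absence of data loss is consistent with the fact that every ISO-8859-1 byte value corresponds to a Unicode code point in the range U+0000 to U+00FF. The following Python sketch checks that property by round-tripping all 256 byte values. It assumes Python's `iso-8859-1` codec as a stand-in for CCSID 819 and does not reproduce the conversion that the Db2 database system performs.

```python
# Every byte value 0x00-0xFF is a valid ISO-8859-1 character.
all_bytes = bytes(range(256))

# Decode as ISO-8859-1, then re-encode the resulting text as UTF-8.
as_text = all_bytes.decode("iso-8859-1")
as_utf8 = as_text.encode("utf-8")

# Converting back yields the original bytes, so nothing was lost.
round_trip = as_utf8.decode("utf-8").encode("iso-8859-1")
```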
Scenario 4
Encoding source | Value |
---|---|
Data encoding | Shift_JIS input data, with or without an appropriate encoding declaration |
Application data type | Graphic |
Application code page | 943 |

Example input statements:
INSERT INTO T1 VALUES (?)
INSERT INTO T1 VALUES
  (XMLPARSE(DOCUMENT CAST(? AS DBCLOB)))
INSERT INTO T1 VALUES (XMLPARSE(DOCUMENT :HV))
Character conversion: The Db2 database system converts the data from CCSID 943 to UTF-8 when it performs the XML parse for storage in the XML column.
Data loss: None.
Truncation: None.
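
A Shift_JIS to UTF-8 conversion of this kind can be sketched in Python as follows. The sketch assumes Python's `shift_jis` codec as an approximation of CCSID 943. The two are not identical, so characters outside this hypothetical sample might map differently.

```python
# Hypothetical sample document containing kanji and katakana.
text = "<doc>日本語テキスト</doc>"

# Bytes as a Shift_JIS application would supply them.
sjis_bytes = text.encode("shift_jis")

# Convert to UTF-8 by decoding to Unicode text and re-encoding.
utf8_bytes = sjis_bytes.decode("shift_jis").encode("utf-8")

# Converting back reproduces the original Shift_JIS bytes for this sample.
back_to_sjis = utf8_bytes.decode("utf-8").encode("shift_jis")
```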