Generating escaped Unicode data
If you pass Unicode characters to an application or object that is not intended to handle Unicode data, data might be lost unless you escape certain characters. For example, you might need to pass Unicode data through an application that has EBCDIC host variables. Or you might want to store Unicode data in a non-Unicode table.
About this task
You might also want to select Unicode characters from an application that runs on a 3270 terminal emulator, such as SPUFI. If the CCSID setting of the emulator does not include those Unicode characters, those characters do not display properly in the output.
In these situations, Unicode characters that cannot be represented in the encoding scheme of the application or object are lost unless you escape them. Escaped data is a representation in which each character that cannot be represented in the target CCSID is replaced by its encoding value. This representation preserves the data. For example, the escaped version of the Unicode character д is \0434. Thus, the following ASCII string contains the escaped character д: 'The escaped character is \0434'
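The escaped form described above can be illustrated outside of Db2. The following Python sketch is only an emulation of the \xxxx format, not Db2's implementation: the function name escape_non_ascii is hypothetical, and the sketch assumes that every character outside 7-bit ASCII is written as one \XXXX group per UTF-16 code unit. It does not address how a literal backslash in the input is handled.

```python
def escape_non_ascii(text: str) -> str:
    """Emulate the escaped form: non-ASCII characters become \\XXXX groups."""
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            # Representable in 7-bit ASCII: keep the character as is.
            out.append(ch)
        else:
            # Assumption: one \XXXX group per 16-bit UTF-16 code unit.
            units = ch.encode("utf-16-be")
            for i in range(0, len(units), 2):
                code_unit = int.from_bytes(units[i:i + 2], "big")
                out.append("\\{:04X}".format(code_unit))
    return "".join(out)


print(escape_non_ascii("The escaped character is д"))
```

For the sample string, the sketch prints The escaped character is \0434, which matches the escaped string shown above.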
If you insert escaped data into a Unicode table, Db2 does not interpret the data or convert it back to its un-escaped form. Escaped data is stored as is in a Db2 table, regardless of whether the table is an ASCII, EBCDIC, or Unicode table.
Procedure
To generate escaped Unicode data:
Examples
- Example: Escaping data
- Assume that T1.C1 contains 'Hi, my name is Андрей'. Notice that the characters in Андрей are all Cyrillic characters, even though some of them do resemble Latin characters. Suppose that you issue the following query in SPUFI:
    SELECT C1 FROM T1;
The result of this query is displayed as follows on a 3270 terminal emulator with the CCSID set to 37:
    'Hi, my name is ......'
Because the characters in Андрей do not exist in CCSID 37, this name is instead displayed as six periods (......). To solve this problem, you can add the EBCDIC_STR function, as shown in the following example:
    SELECT EBCDIC_STR(C1) FROM T1;
Db2 returns the following output with escaped data:
    'Hi, my name is \0410\043D\0434\0440\0435\0439'
Notice that 0410 is the UTF-16 value for А, 043D is the UTF-16 value for н, and so on.
- Example: Un-escaping data
- Assume that T1.C1 contains 'Андрей'. Suppose that you issue the following query:
    SELECT HEX(UNISTR(ASCII_STR(C1))) FROM T1;
Db2 interprets this query as follows:

Table 1. How Db2 interprets query with UNISTR
| Part of SELECT statement | Result | Explanation |
| --- | --- | --- |
| ASCII_STR(C1) | \0410\043D\0434\0440\0435\0439 | Db2 returns the value in C1 (Андрей) as an ASCII string. Because these characters cannot be represented in ASCII, they are escaped. |
| UNISTR(ASCII_STR(C1)) | Андрей | Db2 then converts the escaped ASCII string to a Unicode UTF-8 string. UTF-8 includes all of the characters, so they no longer have to be escaped. |
| HEX(UNISTR(ASCII_STR(C1))) | D090D0BDD0B4D180D0B5D0B9 | Db2 then returns the hexadecimal value of the UTF-8 string. |

Thus, the final result of this query is:
    D090D0BDD0B4D180D0B5D0B9
Suppose that you issue the following similar query:
    SELECT HEX(UNISTR(ASCII_STR(C1),UTF16)) FROM T1;
Db2 interprets this query as follows:

Table 2. How Db2 interprets query with UNISTR and UTF16 parameter
| Part of SELECT statement | Result | Explanation |
| --- | --- | --- |
| ASCII_STR(C1) | \0410\043D\0434\0440\0435\0439 | Db2 returns the value in C1 (Андрей) as an ASCII string. Because these characters cannot be represented in ASCII, they are escaped. |
| UNISTR(ASCII_STR(C1),UTF16) | Андрей | Db2 then converts the escaped ASCII string to a Unicode UTF-16 string. UTF-16 includes all of the characters, so they no longer have to be escaped. |
| HEX(UNISTR(ASCII_STR(C1),UTF16)) | 0410043D0434044004350439 | Db2 then returns the hexadecimal value of the UTF-16 string. |

Thus, the final result of this query is:
    0410043D0434044004350439
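The difference between the two hexadecimal results can be checked outside of Db2. The following Python sketch only emulates the un-escaping step and is not how UNISTR is implemented: the function name unescape is hypothetical, and the sketch assumes that every \XXXX group names a single character in the Basic Multilingual Plane, which holds for the Cyrillic characters in this example. It then encodes the same un-escaped string as UTF-8 and as big-endian UTF-16 and prints each result in hexadecimal.

```python
import re

# Matches a backslash followed by exactly four hexadecimal digits.
_ESCAPE = re.compile(r"\\([0-9A-Fa-f]{4})")


def unescape(text: str) -> str:
    """Emulate un-escaping: replace each \\XXXX group with its character.

    Assumption: every group is a BMP code point (no surrogate pairs).
    """
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


escaped = r"\0410\043D\0434\0440\0435\0439"
name = unescape(escaped)

# Same characters, two Unicode encodings, two different hex strings.
utf8_hex = name.encode("utf-8").hex().upper()
utf16_hex = name.encode("utf-16-be").hex().upper()

print(name)
print(utf8_hex)
print(utf16_hex)
```

Each Cyrillic character in this example occupies two bytes in both encodings, so the two hexadecimal strings have the same length but different byte values.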