Specifying the file code page for Unicode support of table-structured file data sources

To ensure correct code page conversion for table-structured file data sources data sources, you must specify the file code page if the code page differs from the federated database code page.

About this task

Valid values are those that Db2® supports. The default value is the code page of the federated database.

Restrictions

You can use the CODEPAGE option only in a Unicode federated database. If the federated database code page is UTF-8 which is different from the table-structured file character set, specify the CODEPAGE option to the table file character set.

Procedure

To specify the code page of a table-structured file, issue the CREATE NICKNAME statement with the CODEPAGE option set to the code page number of the data in the table-structured file.

Example: The code page of the data in a file named DRUGDATA1.TXT is 943. To specify the code page of a table-structured file as 943, issue the following CREATE NICKNAME statement:
CREATE NICKNAME DRUGDATA1(Dcode Integer NOT NULL, Drug CHAR(20), 
									Manufacutuer CHAR(20))   
       FOR SERVER biochem_lab 
       OPTIONS(FILE_PATH '/usr/pat/DRUGDATA1.TXT',CODEPAGE '943',   
					 COLUMN_DELIMITER '.', 
       SORTED 'Y', KEY_COLUMN 'DCODE', VALIDATE_DATA_FILE 'Y');

Example:To specify the CODEPAGE option, first check the table structure file character set. An error is reported if the CODEPAGE option is not set correctly. Take the file test_data.txt as an example, the charset of test_data.txt is UTF-8.

File: test_data.txt

Content: ã,ABCDf

$ file -i test_data.txt
test_data.txt: text/plain; charset=utf-8

$ xxd test_data.txt
0000000: c3a3 2c41 4243 4466 0a ..,ABCDf.

CREATE NICKNAME Test (Non_Ascii CHAR(10), Ascii VARCHAR(10)) FOR SERVER Sample OPTIONS (FILE_PATH '/home/db2inst1/test_data.txt', COLUMN_DELIMITER ',', SORTED 'N', CODEPAGE '1208')
DB20000I The SQL command completed successfully.

select * from Test

NON_ASCII  ASCII
---------- ----------
ã          ABCDf

1 record(s) selected.


select hex(Non_Ascii), hex(Ascii) from Test

1          2
---------- ----------
C3A32020202020202020 4142434466


1 record(s) selected.

Generate the file test_data_88591.txt by converting charset to iso-8859-1, the hex value of ã becomes 0xe3. Query from nickname with CODEPAGE '1208' gets Invalid column separator. Set CODEPAGE option to the same as the file charset.

$ iconv test_data.txt -f UTF-8 -t iso-8859-1 -o test_data_88591.txt --verbose
test_data.txt:

$ file -i test_data_88591.txt
test_data_88591.txt: text/plain; charset=iso-8859-1

$ xxd test_data_88591.txt
0000000: e32c 4142 4344 660a .,ABCDf.

CREATE NICKNAME Test(Non_Ascii CHAR(50), Ascii VARCHAR(10)) FOR SERVER Sample OPTIONS (FILE_PATH '/home/db2inst1/test_data_88591.txt',COLUMN_DELIMITER ',', SORTED 'N',CODEPAGE '1208')

DB20000I The SQL command completed successfully.


select * from Test

NON_ASCII  ASCII
---------- ----------

SQL1822N Unexpected error code "Data Error" received from data source
"DEPARTMENT". Associated text and tokens are "Invalid Column Delimiter".
SQLSTATE=560BD

Nickname CODEPAGE option is the same as the file charset.

CREATE NICKNAME Test(Non_Ascii CHAR(50), Ascii VARCHAR(10)) FOR SERVER Sample OPTIONS (FILE_PATH '/home/db2inst1/test_data_88591.txt',COLUMN_DELIMITER ',', SORTED 'N',CODEPAGE '819')

DB20000I The SQL command completed successfully.


select * from Test

NON_ASCII  ASCII
---------- ----------
ã          ABCDf

1 record(s) selected.