DB2 Version 10.1 for Linux, UNIX, and Windows

Code page considerations for the ingest utility

When the ingest utility processes input data, there are three code pages involved: the application (client) code page, the input data code page, and the database code page.

Code page | How specified | Default
Application (client) code page, which is used in the CLP command file | Determined from the current locale | Determined from the current locale
Input data code page | INPUT CODEPAGE on the INGEST command | Application code page
Database code page | Specified on the CREATE DATABASE command | 1208 (UTF-8 encoding of Unicode)

If the input data code page differs from the application code page, the ingest utility temporarily overrides the application code page with the input data code page so that DB2® converts the data directly from the input data code page to the database code page. Under some conditions, the ingest utility cannot override the application code page. In this case, the ingest utility converts character data that is not defined as FOR BIT DATA to the application code page before passing it to DB2. In all cases, if the column is not defined as FOR BIT DATA, DB2 converts the data to the database code page.

CLP command file code page
Except for hex constants, the ingest utility assumes that the text of the INGEST command is in the application code page. Whenever the ingest utility needs to compare strings specified on the INGEST command (for example, when comparing the DEFAULTIF character to a character in the input data), the ingest utility performs any necessary code page conversion to ensure the compared strings are in the same code page. Neither the ingest utility nor DB2 does any conversion of hex constants.
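The comparison rule above can be sketched in Python. This is only an illustrative model, not the ingest utility's code: the function name same_character, its parameters, and the mapping of code pages 819 and 850 to the Python codecs "iso-8859-1" and "cp850" are assumptions made for this example. It also assumes that a hex constant is compared byte for byte with the input data.

```python
# Assumed mapping from IBM code page numbers to Python codec names.
CODECS = {819: "iso-8859-1", 850: "cp850"}

def same_character(command_bytes, app_cp, input_bytes, input_cp,
                   is_hex_constant=False):
    """Compare a string from the INGEST command with input data bytes."""
    if is_hex_constant:
        # Hex constants are never converted, so compare the raw bytes.
        return command_bytes == input_bytes
    # Ordinary command text is in the application code page; convert it
    # to the input data code page before comparing.
    converted = command_bytes.decode(CODECS[app_cp]).encode(CODECS[input_cp])
    return converted == input_bytes

# "é" is X'82' in the command file (code page 850) and X'E9' in the
# input data (code page 819). The raw bytes differ, but the characters match.
print(same_character(b"\x82", 850, b"\xe9", 819))
```

With is_hex_constant=True, the same call returns False, because X'82' and X'E9' are compared without any conversion.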
Input data code page
If both a field and the table column that it is assigned to are defined as FOR BIT DATA, then neither the ingest utility nor DB2 does any code page conversion. For example, suppose that the INGEST command assigns field $c1 to column C1 and both are defined as CHAR FOR BIT DATA. If the input field contains X'E9', then DB2 sets column C1 to X'E9', regardless of the input data code page or database code page.

It is strongly recommended that if a column definition omits FOR BIT DATA, then its corresponding field definition also omit FOR BIT DATA. Likewise, if a column definition specifies FOR BIT DATA, its corresponding field should also specify FOR BIT DATA. Otherwise, the value assigned to the column is unpredictable because it depends on whether the ingest utility can override the application code page.

The following example illustrates this situation:
  • The input data code page is 819.
  • The application code page is 850.
  • The database code page is 1208 (UTF-8).
  • The input data is "é" ("e" with an acute accent), which is X'E9' in code page 819, X'82' in code page 850, and X'C3A9' in UTF-8.
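The byte values in this example can be checked with a few lines of Python. The codec names "iso-8859-1", "cp850", and "utf-8" are assumed here to correspond to code pages 819, 850, and 1208.

```python
# Assumed mapping from IBM code page numbers to Python codec names.
CODECS = {819: "iso-8859-1", 850: "cp850", 1208: "utf-8"}

char = "\u00e9"  # "e" with an acute accent

# Encode the same character in each of the three code pages.
encoded = {cp: char.encode(codec) for cp, codec in CODECS.items()}

for cp, data in encoded.items():
    print(cp, data.hex().upper())
# prints:
# 819 E9
# 850 82
# 1208 C3A9
```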
The following table shows what data ends up on the server depending on whether the field and/or column are defined as FOR BIT DATA and whether the ingest utility can override the application code page:
Table 1. Possible outcomes if the field and column definitions are defined as FOR BIT DATA
Field definition | Column definition | Input data (code page 819) | Data after the ingest utility converts it to application code page 850 | Data on the server if the ingest utility can override the application code page | Data on the server if the ingest utility cannot override the application code page
CHAR | CHAR | X'E9' | X'82' | X'C3A9' | X'C3A9'
CHAR FOR BIT DATA | CHAR FOR BIT DATA | X'E9' | X'E9' | X'E9' | X'E9'
CHAR FOR BIT DATA | CHAR | X'E9' | X'E9' | X'C3A9' | X'C39A' ("Ú")
CHAR | CHAR FOR BIT DATA | X'E9' | X'82' | X'E9' | X'82'
The data in the third column is what the ingest utility sends to DB2 when it can override the application code page. The data in the fourth column is what the ingest utility sends when it cannot override the application code page. Note that when the FOR BIT DATA attributes of the field and column definitions differ, the data on the server depends on whether the ingest utility can override the application code page, as shown in the preceding table.
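The outcomes in Table 1 can be reproduced with a small Python model of the conversion steps described in this topic. It is a sketch for reasoning about the table, not a description of how DB2 is implemented. The function names convert and data_on_server, their parameters, and the codec mapping are assumptions made for this example.

```python
# Assumed mapping from IBM code page numbers to Python codec names.
CODECS = {819: "iso-8859-1", 850: "cp850", 1208: "utf-8"}

def convert(data, src_cp, dst_cp):
    """Convert bytes from one code page to another."""
    return data.decode(CODECS[src_cp]).encode(CODECS[dst_cp])

def data_on_server(data, field_bit, column_bit, can_override,
                   input_cp=819, app_cp=850, db_cp=1208):
    """Return the bytes stored in the column for one input field."""
    # Step 1: what the ingest utility sends to DB2, and the code page
    # that DB2 treats the data as being in.
    if can_override:
        # The application code page is overridden with the input data
        # code page, so the data is sent unchanged.
        sent, sent_cp = data, input_cp
    else:
        # Fields that are not FOR BIT DATA are converted to the
        # application code page; FOR BIT DATA fields are sent as is.
        sent = data if field_bit else convert(data, input_cp, app_cp)
        sent_cp = app_cp
    # Step 2: DB2 converts to the database code page unless the column
    # is defined as FOR BIT DATA.
    if column_bit:
        return sent
    return convert(sent, sent_cp, db_cp)

# Third row of Table 1: FOR BIT DATA field, CHAR column.
print(data_on_server(b"\xe9", True, False, can_override=True).hex().upper())
print(data_on_server(b"\xe9", True, False, can_override=False).hex().upper())
# prints:
# C3A9
# C39A
```

In this model the two mismatched rows differ because the bytes that reach DB2, or the code page that DB2 assumes for them, change with the override.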
Code page errors
In cases where the input code page, application code page, or database code page differ, either the ingest utility or DB2 or both will perform code page conversion. If DB2 does not support the code page conversion in any of the following cases, the ingest utility issues an error and the command ends.
Conversion is required when... | In this case, conversion from... | To... | Is done by...
The INGEST command contains strings or SQL identifiers that need to be converted to the input data code page. | Application code page | Input data code page | Ingest utility
The utility can override the application code page to be the input data code page. | Input code page | Database code page | DB2
The utility cannot override the application code page. | Input code page | Application code page | Ingest utility
The utility cannot override the application code page. | Application code page | Database code page | DB2
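As a rough illustration of the preceding table, the following Python sketch lists the conversions that a given combination of code pages would require and rejects any that are not available. The names required_conversions and check_supported, the CODECS mapping, and the use of ValueError are hypothetical. The sketch also assumes that the INGEST command always contains strings that need conversion when the application and input data code pages differ.

```python
# Assumed set of convertible code pages, mapped to Python codec names.
CODECS = {819: "iso-8859-1", 850: "cp850", 1208: "utf-8"}

def required_conversions(input_cp, app_cp, db_cp, can_override):
    """Return (from, to, done_by) tuples, following the table rows."""
    steps = []
    if app_cp != input_cp:
        # Strings on the INGEST command are converted for comparisons.
        steps.append((app_cp, input_cp, "ingest utility"))
    if can_override:
        if input_cp != db_cp:
            steps.append((input_cp, db_cp, "DB2"))
    else:
        if input_cp != app_cp:
            steps.append((input_cp, app_cp, "ingest utility"))
        if app_cp != db_cp:
            steps.append((app_cp, db_cp, "DB2"))
    return steps

def check_supported(steps):
    """Raise an error if any required conversion is unavailable."""
    for src, dst, done_by in steps:
        if src not in CODECS or dst not in CODECS:
            raise ValueError(
                f"conversion from {src} to {dst} ({done_by}) is not supported")

steps = required_conversions(819, 850, 1208, can_override=False)
check_supported(steps)  # all three code pages are known, so no error
for step in steps:
    print(step)
```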