Optim™ processes Double-Byte Character Set (DBCS) data and provides DBCS users with all Optim capabilities that are available to Extended Binary Coded Decimal Interchange Code (EBCDIC) users.
Optim DBCS support assumes EBCDIC DBCS data, in which every DBCS character requires exactly two bytes. It also assumes a single DBCS language, whether the data is pure DBCS or mixed Single-Byte Character Set (SBCS) and DBCS.
In mixed character strings (strings that contain both SBCS and DBCS characters), two special control characters indicate the start and end of a DBCS substring. Shift Out (X'0E') indicates the start and Shift In (X'0F') indicates the end of the DBCS substring.
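The Shift Out and Shift In convention can be sketched in a few lines of Python. This is only an illustration of the byte layout described above, not Optim code; the function name `split_mixed` and the sample bytes are hypothetical.

```python
SHIFT_OUT = 0x0E  # X'0E': start of a DBCS substring
SHIFT_IN = 0x0F   # X'0F': end of a DBCS substring


def split_mixed(data: bytes) -> list:
    """Split a mixed string into ("SBCS", bytes) and ("DBCS", bytes) runs.

    The shift characters themselves are not included in the output.
    """
    segments = []
    pos = 0
    while pos < len(data):
        if data[pos] == SHIFT_OUT:
            start = pos + 1
            end = start
            # Walk the DBCS run two bytes at a time until Shift In.
            while end < len(data) and data[end] != SHIFT_IN:
                end += 2
            if end >= len(data):
                raise ValueError("DBCS substring is unterminated or has an odd length")
            segments.append(("DBCS", data[start:end]))
            pos = end + 1
        else:
            # SBCS run: everything up to the next Shift Out (or end of data).
            nxt = data.find(bytes([SHIFT_OUT]), pos)
            if nxt == -1:
                nxt = len(data)
            segments.append(("SBCS", data[pos:nxt]))
            pos = nxt
    return segments


# Two SBCS bytes, one DBCS run of two characters, one trailing SBCS byte.
sample = b"\xC1\xC2" + b"\x0E\x42\xC1\x42\xC2\x0F" + b"\xC3"
segments = split_mixed(sample)
```

Stepping through the DBCS run in two-byte units reflects the rule that every EBCDIC DBCS character occupies exactly two bytes, so a run with an odd number of bytes is rejected rather than silently split.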
DB2® translates all DB2 character data between the internal DB2 table Coded Character Set Identifier (CCSID) and the external application (Optim) CCSID. All data that Optim processes remains in the external Optim CCSID.
Note: To eliminate problems with round-trip character translation, the Optim installer should ensure that the external Optim CCSIDs match the internal DB2 CCSIDs for DBCS data.
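The kind of round-trip problem the note refers to can be illustrated generically. The sketch below does not use DB2 or Optim; it assumes Python's built-in `utf-8` and `cp037` codecs as stand-ins for a mismatched internal and external encoding, and the helper name `round_trip` is hypothetical. It shows that a character with no equivalent in the external encoding is replaced by a substitution character and cannot be recovered on the way back.

```python
def round_trip(internal_bytes: bytes, internal_cp: str, external_cp: str) -> bytes:
    """Translate internal -> external -> internal, substituting unmappable characters."""
    text = internal_bytes.decode(internal_cp)
    # Characters that the external code page cannot represent become "?".
    external_bytes = text.encode(external_cp, errors="replace")
    return external_bytes.decode(external_cp).encode(internal_cp, errors="replace")


# Stand-in encodings (assumptions for illustration only).
INTERNAL = "utf-8"
EXTERNAL = "cp037"

plain = "AB".encode(INTERNAL)
kanji = "\u65e5\u672c".encode(INTERNAL)  # two characters absent from cp037

plain_survives = round_trip(plain, INTERNAL, EXTERNAL) == plain
kanji_survives = round_trip(kanji, INTERNAL, EXTERNAL) == kanji
same_encoding_survives = round_trip(kanji, INTERNAL, INTERNAL) == kanji
```

When both sides use the same encoding, the round trip is lossless, which is the situation the note recommends establishing for DBCS data.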
Optim Archive and Extract Files preserve the encoding scheme and CCSID. Optim warns users if incompatible CCSIDs could cause problems in storing data. Site and User options (“Allow Mismatched CCSIDs”) indicate the action Optim should take when the CCSID of a source column does not match that of a target column and when the CCSID of the terminal does not match that of the DB2 subsystem.
DBCS Data Types
Optim supports the following DBCS
data types. The conversion and mapping rules for the DBCS data types
are the same as the corresponding DB2 rules.
- GRAPHIC - A fixed-length data type used to store a graphic string, that is, a string consisting of double-byte EBCDIC characters that are not stored with Shift Out and Shift In characters. The maximum length of a GRAPHIC string is 254 bytes.
- VARGRAPHIC/LONGVARGRAPHIC - A varying-length data type used to store a variable-length graphic string. The maximum length of a VARGRAPHIC/LONGVARGRAPHIC string is 32704 bytes.
- CHAR or CLOB defined with “FOR MIXED DATA” - Used to store mixed data, that is, Multi-Byte Character Set (MBCS) data. The MIXED=YES DB2 install option is required. Shift Out and Shift In characters are required to identify the double-byte data in a mixed character string.
- DBCLOB - A double-byte character large object. The maximum length of a DBCLOB is 1,073,741,823 DBCS characters.
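Because each EBCDIC DBCS character occupies exactly two bytes, the byte limits above translate directly into character counts. The following sketch works through that arithmetic using only the limits stated in this section; the names `max_dbcs_chars` and `mixed_byte_length` are hypothetical helpers, not Optim or DB2 functions.

```python
BYTES_PER_DBCS_CHAR = 2

# Byte limits as stated in the list above.
MAX_BYTES = {"GRAPHIC": 254, "VARGRAPHIC": 32704}


def max_dbcs_chars(max_bytes: int) -> int:
    """Number of whole DBCS characters that fit in a byte limit."""
    return max_bytes // BYTES_PER_DBCS_CHAR


def mixed_byte_length(sbcs_chars: int, dbcs_chars: int, dbcs_runs: int) -> int:
    """Bytes needed for a mixed string.

    Each DBCS run costs two extra bytes: one Shift Out and one Shift In.
    """
    return sbcs_chars + dbcs_chars * BYTES_PER_DBCS_CHAR + dbcs_runs * 2


graphic_chars = max_dbcs_chars(MAX_BYTES["GRAPHIC"])        # 127
vargraphic_chars = max_dbcs_chars(MAX_BYTES["VARGRAPHIC"])  # 16352

# 3 SBCS characters plus one run of 2 DBCS characters: 3 + 4 + 2 = 9 bytes.
example_mixed = mixed_byte_length(sbcs_chars=3, dbcs_chars=2, dbcs_runs=1)
```

The second helper shows why the same text can need more bytes in a mixed column than in a graphic column: graphic strings carry no shift characters, while every DBCS run in mixed data adds two.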
Functions that Support DBCS Data
Optim supports DBCS data in the following functional areas:
Definitions
The following definitions explain terms used in this section:
- Coded Character Set Identifier (CCSID) - A 16-bit number that uniquely identifies a coded representation of graphic characters. It designates an encoding scheme identifier and one or more pairs that consist of a character set identifier and an associated code page identifier.
- Double-Byte Character Set (DBCS) - A set of characters, which are used by national languages such as Japanese and Chinese, that have more symbols than can be represented by a single byte. Each character is 2 bytes in length.
- Graphic String - A string consisting of double-byte EBCDIC characters that are not stored with Shift Out and Shift In characters.
- Multi-Byte Character Set (MBCS) - A character set that represents single characters with more than a single byte. UTF-8 is an example of an MBCS. Characters in UTF-8 can range from 1 to 4 bytes in DB2.
- Single-Byte Character Set (SBCS) - A set of characters in which each character is represented by a single byte.
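The variable width of UTF-8 mentioned in the MBCS definition can be checked with a short Python sketch. It uses only the standard `str.encode` method; the sample characters are arbitrary choices for illustration.

```python
# One sample character for each possible UTF-8 width.
samples = {
    "A": "Latin capital letter A",
    "\u00e9": "Latin small letter e with acute",
    "\u65e5": "CJK ideograph",
    "\U0001F600": "emoji outside the Basic Multilingual Plane",
}

# Map each character to the number of bytes its UTF-8 encoding needs.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, description in samples.items():
    print(f"{description}: {utf8_lengths[ch]} byte(s)")
```

This contrasts with EBCDIC DBCS, where every character is exactly two bytes wide and the byte length of a graphic string is always twice its character count.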