Character formats

The Optim™ solutions use the Unicode character set in dialogs and to process data.

DBMS character sets

Optim supports the following DBMS character sets:
Table 1. Oracle - character set support
AL16UTF16 JA16SJIS
AL32UTF8 NEE8ISO8859P4
AR8ISO8859P6 N8PC865
AR8MSWIN1256 TR8MSWIN1254
BLT8MSWIN1257 US7ASCII
CDN8PC863 US8PC437
CL8ISO8859P5 UTF8
CL8MSWIN1251 UTF16
EE8ISO8859P2 VN8MSWIN1258
EE8MSWIN1250 WE8DEC
EL8ISO8859P7 WE8ISO8859P1
EL8MSWIN1253 WE8ISO8859P9
IW8ISO8859P8 WE8ISO8859P15
IW8MSWIN1255 WE8MSWIN1252
Table 2. Sybase ASE - character set support
cp437 cp1257
cp850 iso_1
cp1250 iso_2
cp1251 iso_4
cp1252 iso_5
cp1253 iso_6
cp1254 iso_7
cp1255 iso_8
cp1256 iso_9
roman8 UTF16
UTF8  
Table 3. DB2 z/OS - character set support
437 865
850 1252
860 UTF8
863 UTF16
Table 4. DB2 Linux, UNIX, Windows - character set support
437 964
850 970
860 1252
863 1363
865 1370
936 1383
949 1386
950 UTF8
UTF16  
Table 5. SQL Server - character set support
1252
UTF8
UTF16
Table 6. Informix - character set support
1252
UTF8

Directories and files

The names of all directories and files referenced by, generated by, or used with the solutions must consist of ASCII characters. This requirement applies to the installation directories for the Optim solutions, as well as the directories used in processing (for example, temporary work directory, data directory, and other directories that are identified in personal and product options or when configuring the server).
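
A directory name can be checked against the ASCII requirement before it is entered in the product or personal options. The following is a minimal Python sketch, not part of the Optim solutions; the function name `non_ascii_parts` and the sample paths are hypothetical.

```python
from pathlib import PurePath


def non_ascii_parts(path: str) -> list[str]:
    """Return the components of a path that contain non-ASCII characters."""
    # str.isascii() is True only when every character is in the ASCII range.
    return [part for part in PurePath(path).parts if not part.isascii()]


# Hypothetical directories, used only for illustration.
for candidate in ("/opt/optim/data", "/opt/optim/données/tmp"):
    offending = non_ascii_parts(candidate)
    status = "ok" if not offending else f"non-ASCII components: {offending}"
    print(f"{candidate}: {status}")
```

An empty result means that every component of the path consists of ASCII characters only.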

All text files generated by the solutions are in Unicode format, and you can edit them with a Unicode-compatible text editor such as Microsoft Notepad. The Optim solutions recognize Byte Order Mark (BOM) headers in externally generated files and the following encodings:

  • UTF-8
  • UTF-16
  • UTF-32
  • ASCII
  • Multi-byte
Note: You cannot compare archive files created before version 6.0 with files created using a more recent version of the Optim Data Growth Solution. You can, however, convert early archive files to compare data in the resulting extract files.
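
To illustrate how a Byte Order Mark identifies the encoding of an externally generated file, the following Python sketch inspects the first bytes of the data. It illustrates the general technique only and does not describe how the Optim solutions implement it; the function name `detect_bom_encoding` and the ASCII fallback are assumptions.

```python
import codecs

# Longer BOMs are checked first because the UTF-32 little-endian BOM
# begins with the same two bytes as the UTF-16 little-endian BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_bom_encoding(data: bytes, default: str = "ascii") -> str:
    """Return a Python codec name based on the BOM, or the default if none."""
    for bom, codec_name in _BOMS:
        if data.startswith(bom):
            return codec_name
    return default


sample = "héllo".encode("utf-16")          # Python writes a BOM for "utf-16"
codec_name = detect_bom_encoding(sample)
print(codec_name, sample.decode(codec_name))
```

The codec names returned here consume the BOM during decoding, so the decoded text does not start with a stray U+FEFF character.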

Optim server

Every locale (or its translation) that the server is required to handle must reside on the server machine. In other words, the server must have access to the locale of the delegating workstation. A utility, pr0locl.exe, is provided to report the locales that are installed on a machine and the locales with which it is compatible. The following is an example of the output in a Windows environment:

Current operating system: Microsoft Windows XP
C runtime locales are:
   LC_CTYPE    = English_United States.1252
   LC_COLLATE  = English_United States.1252
   LC_NUMERIC  = English_United States.1252
   LC_MONETARY = English_United States.1252
   LC_TIME     = English_United States.1252
Language Environment Variables:
   LC_ALL =
   LANG   = 
Windows Locale is:
   LCID      = 1033 (409)
   Code Page = 1252 (4E4)
RT Server requests can run on or from a UNIX
system that has these locales or their derived locales installed
   C
   en_US.ISO8859-1
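
The C runtime locale categories shown in the sample output can also be queried from a script. The following Python sketch uses the standard `locale` module for a similar per-category listing. It is an independent illustration, not a substitute for pr0locl.exe, and it does not report compatible UNIX locales; the function name `current_locales` is hypothetical.

```python
import locale

CATEGORIES = ["LC_CTYPE", "LC_COLLATE", "LC_NUMERIC", "LC_MONETARY", "LC_TIME"]


def current_locales() -> dict[str, str]:
    """Return the current C runtime locale setting for each category."""
    # Passing None as the second argument queries the setting without changing it.
    return {
        name: locale.setlocale(getattr(locale, name), None)
        for name in CATEGORIES
    }


for name, value in current_locales().items():
    print(f"{name:<11} = {value}")
```

The values printed depend on the machine and on the LC_ALL and LANG environment variables.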

Optim directories and DB aliases

The Optim solutions support storing data in single-byte (ASCII), universal character encoding (Unicode), and multi-byte character sets. The default character set is single byte. When you create an Optim directory or DB alias using a database for which Optim supports Unicode or multi-byte characters, you are prompted to indicate the character format used for storing data. To use DB aliases with different character sets, the Optim directory must be in Unicode format. If you indicate that the DB alias for the Optim directory database should share connection information with the directory, the DB alias must use the same character set as the directory. The directory and DB aliases can be configured to support Unicode, if character data in your Unicode-enabled database is kept in Unicode format.

Unicode support

The Unicode character set is supported for Oracle, Sybase ASE, Microsoft SQL Server, DB2® for Linux®, UNIX, and Windows, Informix, and DB2 for z/OS® databases. To process data in a Unicode-enabled database, the Optim directory must also be in a Unicode-enabled database and the directory and DB aliases for Unicode-enabled databases must be flagged during the configuration process.

Oracle

Unicode-enabled Oracle database servers commonly use UTF-8 but may use UTF-16. The Oracle client typically uses a single-byte character set. Using char semantics from Oracle Unicode Servers for char type columns (longer than 500) and varchar2 type columns (longer than 1000) is not supported. To prevent any loss of data, the character set used by the database client must be compatible with the character set of the database server. This requirement is enforced as follows:

Version 8i Oracle clients

For release 8i, the character set for the Oracle client is set in the NLS_LANG environment variable, for example:

  • SET NLS_LANG=AMERICAN_AMERICA.UTF8

Restart the Optim solution, the configuration program, or both after making any changes to the character set.

  1. If the client uses a Unicode character set, the database server must also use a Unicode character set. The Optim directory must reside in a Unicode-enabled database and the directory and DB alias for the database must be configured for Unicode data.
  2. If the database server does not use a Unicode character set, the client cannot use one either. The DB alias for the database must not be configured for Unicode data.
  3. If the database server uses a Unicode character set and the client does not, an error results.
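
The three rules above can be sketched as a small check. The following Python example assumes the usual LANGUAGE_TERRITORY.CHARSET layout of an NLS_LANG value and treats the Unicode names listed in Table 1 as the set of Unicode character sets. The function names and the `UNICODE_CHARSETS` set are hypothetical, and the sketch does not represent the validation that the Optim solutions perform.

```python
# Assumption: the names from Table 1 that denote Unicode encodings.
UNICODE_CHARSETS = {"UTF8", "AL32UTF8", "AL16UTF16", "UTF16"}


def nls_lang_charset(nls_lang: str) -> str:
    """Return the character set part of a LANGUAGE_TERRITORY.CHARSET value."""
    _, dot, charset = nls_lang.rpartition(".")
    if not dot or not charset:
        raise ValueError(f"no character set in NLS_LANG value: {nls_lang!r}")
    return charset.upper()


def alias_unicode_flag_8i(client_nls_lang: str, server_charset: str) -> bool:
    """Return True if the DB alias must be configured for Unicode data.

    Raises ValueError when only one side uses a Unicode character set,
    which the rules above describe as not allowed or as an error.
    """
    client_unicode = nls_lang_charset(client_nls_lang) in UNICODE_CHARSETS
    server_unicode = server_charset.upper() in UNICODE_CHARSETS
    if client_unicode != server_unicode:
        raise ValueError("client and server must both be Unicode or both non-Unicode")
    return client_unicode


print(alias_unicode_flag_8i("AMERICAN_AMERICA.UTF8", "UTF8"))
```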

Version 9.0 and later Oracle clients

For releases 9.0 and later, the character set for the Oracle client is set in the NLS_LANG environment variable, for example:

  • SET NLS_LANG=AMERICAN_AMERICA.AL32UTF8

Restart the Optim solution, the configuration program, or both after making any changes to the character set.

Version 9.2 and later Oracle clients

  1. If the client uses a DB alias configured for Unicode data to connect to a Unicode database, the client character set is automatically set to match the server character set.
  2. If the client uses a DB alias that is not configured for Unicode data to connect to a Unicode database, an error results.
  3. If the client uses a DB alias that is not configured for Unicode data to connect to a non-Unicode database, the client character set is automatically set to match that of the server. (See Character formats for a list of supported character sets.)
  4. If the client uses a DB alias that is configured for Unicode data to connect to a non-Unicode database, an error results.
  5. If the workstation for the Oracle client uses a non-Unicode character set that is not supported, an error results.
  6. If the character set for the database server is not supported, an error results.
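
The six rules above reduce to a short decision function. The following Python sketch is one way to express them; the names `Outcome` and `oracle_92_outcome` and the single `charsets_supported` flag (which stands in for rules 5 and 6) are assumptions made for illustration.

```python
from enum import Enum


class Outcome(Enum):
    MATCH_SERVER = "client character set is set to match the server"
    ERROR = "an error results"


def oracle_92_outcome(alias_unicode: bool,
                      database_unicode: bool,
                      charsets_supported: bool = True) -> Outcome:
    """Apply the connection rules for version 9.2 and later Oracle clients."""
    if not charsets_supported:
        return Outcome.ERROR          # rules 5 and 6
    if alias_unicode != database_unicode:
        return Outcome.ERROR          # rules 2 and 4
    return Outcome.MATCH_SERVER       # rules 1 and 3


for alias_flag in (True, False):
    for db_flag in (True, False):
        result = oracle_92_outcome(alias_flag, db_flag)
        print(f"alias Unicode={alias_flag}, database Unicode={db_flag}: {result.value}")
```

The same pairing of alias flag and database type appears in the rules for Sybase ASE and DB2 later in this topic.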

Microsoft SQL Server

Because SQL Server does not differentiate on the basis of Unicode characteristics, you need not indicate whether an Optim directory or DB alias is kept in Unicode format. However, the following rules apply:

  1. An Optim directory in an SQL Server database is kept in Unicode format. You must indicate whether any DB aliases for Unicode-supported databases are to be kept in Unicode format.
  2. A DB alias for an SQL Server database must use the same character format as the Optim directory.

Sybase ASE

To prevent any loss of data, the character set used by the Sybase ASE database client must be compatible with the character set of the database server. This requirement is enforced as follows:

  1. If the client uses a DB alias configured for Unicode data to connect to a Unicode database, the client character set is automatically set to match the server character set.
  2. If the client uses a DB alias that is not configured for Unicode data to connect to a Unicode database, an error results.
  3. If the client uses a DB alias that is configured for Unicode data to connect to a non-Unicode database, an error results.

DB2 for Linux, UNIX, and Windows

To prevent any loss of data, the character set used by the database client must be compatible with the character set of the database server. This requirement is enforced as follows:

  1. All DB aliases for DB2 for Linux, UNIX, and Windows in an Optim directory on DB2 for Linux, UNIX, and Windows must have the same Unicode format as the directory.
  2. If using a DB alias configured for Unicode to connect to a Unicode database, the client character set is automatically set to match the server character set.
  3. If using a DB alias that is not configured for Unicode data to connect to a Unicode database, an error results.
  4. If using a DB alias that is configured for Unicode data to connect to a non-Unicode database, an error results.
  5. DB aliases for DB2 for Linux, UNIX, and Windows in an Optim directory on Oracle or MS SQL Server can have different Unicode formats; however, Optim cannot connect to both a Unicode-enabled and a non-Unicode-enabled DB2 for Linux, UNIX, and Windows database during the same session.
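
Rule 5 restricts a session to one kind of DB2 connection. The following Python sketch models that restriction with a small tracker object. The class name `Db2Session`, its method, and the alias names are hypothetical and do not correspond to any Optim interface.

```python
from typing import Optional


class Db2Session:
    """Tracks whether the DB2 connections in one session are Unicode-enabled."""

    def __init__(self) -> None:
        self._unicode_mode: Optional[bool] = None  # unknown until first connect

    def connect(self, alias_name: str, unicode_enabled: bool) -> None:
        """Record a connection, rejecting a mix of Unicode and non-Unicode."""
        if self._unicode_mode is None:
            self._unicode_mode = unicode_enabled
        elif self._unicode_mode != unicode_enabled:
            raise RuntimeError(
                f"cannot connect to {alias_name!r}: the session already uses a "
                f"{'Unicode' if self._unicode_mode else 'non-Unicode'} DB2 connection"
            )


session = Db2Session()
session.connect("SALES_UNI", unicode_enabled=True)      # first connection sets the mode
session.connect("HR_UNI", unicode_enabled=True)         # same mode, accepted
try:
    session.connect("LEGACY", unicode_enabled=False)    # different mode, rejected
except RuntimeError as exc:
    print(exc)
```

Starting a new session corresponds to creating a new tracker object.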

DB2 for z/OS

To prevent any loss of data, the character set used by the DB2 for z/OS database client must be compatible with the character set of the database server. This requirement is enforced as follows:

  1. All DB aliases for DB2 for z/OS in an Optim directory on DB2 for Linux, UNIX, and Windows must have the same Unicode format as the directory.
  2. If using a DB alias configured for Unicode to connect to a Unicode database, the client character set is automatically set to match the server character set.
  3. If using a DB alias that is not configured for Unicode to connect to a Unicode database, an error results.
  4. If using a DB alias that is configured for Unicode to connect to a non-Unicode database, an error results.
  5. DB aliases for DB2 for z/OS in an Optim directory on Oracle or MS SQL Server can have different Unicode formats; however, the Optim solutions cannot connect to both a Unicode-enabled and a non-Unicode-enabled DB2 for z/OS database during the same session. If a DB2 for z/OS tablespace includes both Unicode and non-Unicode tables, you must create separate Unicode and non-Unicode DB aliases.

During load processing, you can use only one connection, either Unicode or non-Unicode. You must exit Optim before switching between a Unicode and non-Unicode connection. If the Load Process includes UTF-8 characters in table or column names, the control file is in UTF-8 format. Before transferring a UTF-8 control file to a z/OS machine, the file must be converted to binary format. To browse a UTF-8 control file on a z/OS machine, you must apply IBM® SPE APAR OA07685 - ISPF Browse Support for Unicode to the machine.

Informix

Unicode support is available for Informix®.

Multi-byte support

The Optim directory and DB aliases can be configured to support multi-byte character encoding, if character data in your database is kept in a multi-byte character format. For information about supported multi-byte character sets, see the link for character set support in the Detailed System Requirements document.

To process data in a multi-byte-enabled database, the Optim directory must be in a multi-byte or Unicode-enabled database. The directory and DB aliases for multi-byte-enabled databases must be flagged during the configuration process. A directory in multi-byte format supports multi-byte DB aliases only.

The Optim solutions use the Unicode character set in dialogs and to process information. In some multi-byte character sets (such as Oracle JA16SJIS), multiple characters are mapped to a single Unicode character. When these characters are converted from Unicode back to multi-byte (a round trip), the original character might not be returned. A product option and a personal option determine how round-trip conversion issues are handled when data in a multi-byte database is processed.
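
The round-trip issue can be observed outside the product. The following Python sketch scans double-byte sequences in Python's `cp932` codec, used here as a stand-in for a Shift-JIS character set such as JA16SJIS, and collects the sequences that do not encode back to the original bytes. The codec choice, the byte ranges scanned, and the function name are assumptions; the result says nothing about how the Optim options handle these characters.

```python
def lossy_double_byte_sequences(codec: str = "cp932") -> list[bytes]:
    """Return two-byte sequences that decode but do not round-trip."""
    lossy = []
    # Lead-byte ranges used by Shift-JIS style double-byte characters.
    lead_bytes = list(range(0x81, 0xA0)) + list(range(0xE0, 0xFD))
    for lead in lead_bytes:
        for trail in range(0x40, 0xFD):
            raw = bytes([lead, trail])
            try:
                text = raw.decode(codec)
            except UnicodeDecodeError:
                continue  # not a valid sequence in this codec
            try:
                round_trip = text.encode(codec)
            except UnicodeEncodeError:
                round_trip = None
            if round_trip != raw:
                lossy.append(raw)
    return lossy


lossy = lossy_double_byte_sequences()
print(f"{len(lossy)} double-byte sequences do not survive a round trip")
for raw in lossy[:5]:
    text = raw.decode("cp932")
    print(raw.hex(), "->", f"U+{ord(text[0]):04X}", "->", text.encode("cp932").hex())
```

Each sequence reported shares its Unicode character with another byte sequence, so the conversion back to multi-byte returns the other sequence.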

Compatible character sets

To prevent any loss of data, the character set used by a database client must be compatible with the character set of the database server. This requirement is enforced as follows:

  1. If using a DB alias configured for multi-byte data to connect to a multi-byte database, the client character set is automatically set to match the server character set.
  2. If using a DB alias that is not configured for multi-byte data to connect to a multi-byte database, an error results.
  3. When connecting to an Optim directory, the client may establish a connection, check the database character set, drop the connection, and reestablish it with a new language setting.
Note: Because Oracle stores character LOBs in UCS2, which is a 16-bit Unicode format, multi-byte character LOBs cannot be stored correctly in a multi-byte database. For more information, refer to your Oracle documentation.