JR32541: FIX TO THE ORACLE STAGE CODE IS TO TAKE NOTE OF CLIENT CHARACTER SET IN DECIDING SCHEMA

APAR status

Closed as program error.

Error description

Ideally, users should make sure that the top-level OSH options
suit their client character set and that NLS_LANG is set
accordingly. If they set some top-level OSH options but do not
set NLS_LANG, or set it inappropriately, they are bound to run
into trouble. All Oracle knows about is NLS_LANG. Oracle uses
the client character set specified by NLS_LANG and assumes
certain defaults if NLS_LANG is not specified. If the default
character set assumed by Oracle does not match the one the user
set in OSH, problems follow.
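
As a rough illustration of how the client character set follows from
NLS_LANG, the sketch below parses the variable and classifies the
result. It is not the stage's code: the function names and the short
list of multibyte character sets are assumptions made for this example.
The fallback of AMERICAN_AMERICA.US7ASCII for an unset NLS_LANG is the
default Oracle documents for UNIX-like clients.

```python
import os

# Fallback Oracle documents for UNIX-like clients when NLS_LANG is unset.
DEFAULT_NLS_LANG = "AMERICAN_AMERICA.US7ASCII"

# Illustrative and deliberately incomplete list of multibyte Oracle
# character sets (an assumption for this sketch, not an official list).
MULTIBYTE_CHARSETS = {
    "AL32UTF8", "UTF8", "JA16SJIS", "JA16EUC", "ZHS16GBK", "ZHT16BIG5",
}


def client_charset(env=None):
    """Return the character set part of NLS_LANG (LANGUAGE_TERRITORY.CHARSET)."""
    env = os.environ if env is None else env
    value = env.get("NLS_LANG") or DEFAULT_NLS_LANG
    _, dot, charset = value.rpartition(".")
    if not dot or not charset:
        # No charset component given: this sketch assumes the default one.
        charset = DEFAULT_NLS_LANG.rpartition(".")[2]
    return charset.upper()


def is_multibyte_client(env=None):
    """True if the client character set is in the illustrative multibyte list."""
    return client_charset(env) in MULTIBYTE_CHARSETS
```

For example, `is_multibyte_client({"NLS_LANG": "AMERICAN_AMERICA.AL32UTF8"})`
is true, while an empty environment is treated as single byte.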



Currently we render metadata as ustring whenever the database
character set is multibyte. The fix to the stage code is to take
note of the client character set in deciding the schema (see below).
Thus, if the user has no NLS_LANG, or one with a single-byte
character set, we no longer need to load data in Unicode. Data
remains in the client character set and will be handled by Oracle
as appropriate.

Consider the test case where the input data is already in UTF-8
but the top-level OSH option does not say so. Unfixed, we try
to convert the data to Unicode for loading, and the string input
data may fail to convert to UTF-8, because the exporter doesn't
know the data is already in UTF-8 and renders bad output. The
result can be more bytes than the column length allows, and hence
an error.
If the data is in some string character set and the top-level
OSH options are properly set, the exporter should be able to
handle it correctly and we shouldn't see any error.
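
The byte growth described above can be reproduced in a few lines. This
is only a sketch of the effect, not of the exporter itself: the helper
names, the sample value, the 5-byte column width, and the choice of
Latin-1 as the wrongly assumed single-byte character set are all
assumptions for illustration.

```python
def reencode_as(raw: bytes, assumed_charset: str) -> bytes:
    """Decode raw bytes under an assumed charset, then encode them as UTF-8."""
    return raw.decode(assumed_charset).encode("utf-8")


def fits(data: bytes, column_bytes: int) -> bool:
    """True if the data fits a column limited to column_bytes bytes."""
    return len(data) <= column_bytes


# Input that is already UTF-8: 'c', 'a', 'f' take 1 byte each, the accented e takes 2.
utf8_input = "caf\u00e9".encode("utf-8")

# Wrong assumption: each input byte is treated as one Latin-1 character,
# so the 2-byte accented letter becomes two characters of 2 bytes each.
double_encoded = reencode_as(utf8_input, "latin-1")

# Right assumption: the data round-trips unchanged.
correct = reencode_as(utf8_input, "utf-8")

print(len(utf8_input), len(correct), len(double_encoded))
print(fits(correct, 5), fits(double_encoded, 5))
```

With the correct assumption the value still fits the hypothetical 5-byte
column; with the wrong one it grows and no longer fits.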

In the customer's case the data is corrupt, so the exporter output
is even more garbled, and we got an error.

Here is how the fix relates the client and database character
sets to the metadata:

Database       Client         Schema
character set  character set  to use   Explanation
-------------  -------------  -------  ------------------------------------------------------
Single byte    Single byte    string   No need to use Unicode.
Single byte    Multi byte     string   Oracle gives single bytes. We can send Unicode though.
Multi byte     Single byte    string   No need to use Unicode.
Multi byte     Multi byte     ustring  Reading/writing multibyte data needs ustring.

Of particular interest here is that we don't need to use
Unicode if the client is using a single-byte character set. This
is the central point of the fix.
This does not alter anything for those who handle NLS_LANG
correctly.
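
The four combinations of database and client character set can be
written as a small lookup. This is a minimal sketch that follows the
combinations exactly as listed in this description; the names
`SCHEMA_BY_CHARSETS` and `schema_type` are hypothetical and do not come
from the stage code.

```python
# Keys are (database_is_multibyte, client_is_multibyte); values are the
# schema type listed for that combination in this description.
SCHEMA_BY_CHARSETS = {
    (False, False): "string",   # no need to use Unicode
    (False, True): "string",    # Oracle gives single bytes
    (True, False): "string",    # no need to use Unicode
    (True, True): "ustring",    # multibyte on both sides needs ustring
}


def schema_type(db_multibyte: bool, client_multibyte: bool) -> str:
    """Look up the schema type for a database/client character set pair."""
    return SCHEMA_BY_CHARSETS[(bool(db_multibyte), bool(client_multibyte))]


for db in (False, True):
    for client in (False, True):
        print(db, client, schema_type(db, client))
```

Whenever the client side is single byte the lookup yields string,
regardless of the database character set.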

Local fix

This fix is included in 8.0.1 Fix Pack 3.

Problem summary

****************************************************************
USERS AFFECTED:
All users, particularly those not needing Unicode strings.
****************************************************************
PROBLEM DESCRIPTION:
There is no need to use Unicode for CHAR and VARCHAR2 columns if
the client is using a single-byte character set.
The metadata should be decided solely on whether the client's
character set is multibyte or not. The server's character set,
whether single-byte or multibyte, does not play a role here.
****************************************************************
RECOMMENDATION:
Apply the patch.
This change is included in 8.1 Fix Pack 1.
****************************************************************

Problem conclusion

Now the stage uses ustring for character schemas where
characters are potentially multibyte.
This means that NCHAR and NVARCHAR2 columns use the ustring schema
type. CHAR and VARCHAR2 columns use ustring only if the client
character set is multibyte.
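
The per-column rule stated in this conclusion can be sketched as
follows. The function name `schema_for_column` and the decision to
raise an error for other column types are assumptions for the example;
only the handling of the four named Oracle types comes from the text
above.

```python
NATIONAL_TYPES = {"NCHAR", "NVARCHAR2"}
CHARACTER_TYPES = {"CHAR", "VARCHAR2"}


def schema_for_column(oracle_type: str, client_multibyte: bool) -> str:
    """Pick string or ustring for one character column."""
    name = oracle_type.strip().upper()
    if name in NATIONAL_TYPES:
        # NCHAR and NVARCHAR2 columns use ustring.
        return "ustring"
    if name in CHARACTER_TYPES:
        # CHAR and VARCHAR2 depend only on the client character set.
        return "ustring" if client_multibyte else "string"
    # Other column types are outside the scope of this sketch.
    raise ValueError(f"not a character column type: {oracle_type!r}")


print(schema_for_column("VARCHAR2", client_multibyte=False))
print(schema_for_column("NVARCHAR2", client_multibyte=False))
```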

Temporary fix

Comments

APAR Information

APAR number
JR32541
Reported component name
WIS DATASTAGE
Reported component ID
5724Q36DS
Reported release
753
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2009-03-30
Closed date
2009-05-18
Last modified date
2010-12-09

APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:

Fix information

Fixed component name
WIS DATASTAGE
Fixed component ID
5724Q36DS

Applicable component levels

R753 PSN
UP


Document Information

Modified date:
09 December 2010
