Troubleshooting
Problem
When the CPACKAGE function is called as =CPACKAGE(input,"UTF-8") with an output type tree set to UTF-8, the output file contains incorrect UTF-8.
Cause
The second argument of the CPACKAGE function is the character set code of the input content, not the character set wanted for the output.
Resolving The Problem
In a situation where an input file and its type tree use the Latin5 character set, and the output tree is set to UTF-8 with the aim of converting the input from Latin5 to UTF-8, the following syntax has been incorrectly used:
=CPACKAGE(input,"UTF-8")
This syntax means that, whatever character set the input actually uses (Latin5, native, or other), the CPACKAGE function evaluates 'input' as if it were UTF-8.
The result is incorrect.
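The effect of reading Latin5 bytes as UTF-8 can be illustrated outside of TX. The following is a minimal Python sketch, not the TX implementation: it assumes that Python's iso8859_9 codec is an acceptable stand-in for the Latin5 character set, and the sample text and the helper decodes_as_utf8 are hypothetical names chosen for illustration.

```python
# Illustration only: Python's "iso8859_9" codec stands in for Latin5.
text = "İstanbul şehri"                      # sample text (assumption)
latin5_bytes = text.encode("iso8859_9")      # bytes as a Latin5 input file would hold them


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes form valid UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Declaring the wrong character set: the Latin5 bytes are not valid UTF-8.
valid = decodes_as_utf8(latin5_bytes)

# Forcing the decode anyway replaces the unreadable bytes with U+FFFD.
misread = latin5_bytes.decode("utf-8", errors="replace")

print(valid)
print(misread == text)
```

In this sketch the non-ASCII Latin5 bytes cannot be read as UTF-8, which mirrors why declaring "UTF-8" for Latin5 input gives an incorrect result.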
Solution
If you need to CPACKAGE input from Latin5, you must specify the correct character set:
CPACKAGE(input,"ibm-920_P100-1995")
Note: "ibm" character sets are only available since version 8.1.x.
Because your output tree is set to UTF-8, TX will CPACKAGE your input, treating it as Latin5 (ibm-920_P100-1995), and will convert it to UTF-8 when building the output.
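The two-step behavior described above (interpret the input with its real character set, then emit UTF-8) can be sketched in Python. This is an analogy under stated assumptions, not how TX is implemented: Python's iso8859_9 codec is assumed to be equivalent to the Latin5 code "ibm-920_P100-1995", and latin5_to_utf8 is a hypothetical helper name.

```python
def latin5_to_utf8(data: bytes) -> bytes:
    """Interpret bytes as Latin5, then re-encode them as UTF-8.

    Assumption: the "iso8859_9" codec matches the Latin5 character set.
    """
    characters = data.decode("iso8859_9")   # step 1: read input with its true character set
    return characters.encode("utf-8")       # step 2: write output in the target character set


original = "Türkçe ğ ş İ"                   # sample text (assumption)
source = original.encode("iso8859_9")       # simulated Latin5 input
converted = latin5_to_utf8(source)

print(converted.decode("utf-8") == original)
```

The point of the sketch is that the character set named for the input must describe the bytes as they actually are; the target encoding is decided separately, here by the encode step and in TX by the output type tree.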
Why "ibm-920_P100-1995"
Syntax:
CPACKAGE (single-object-expression , character-set-of-object-content )
In WebSphere Transformation Extender 8.1.x, if you search the help for "Character set" and "CPACKAGE", you will find the character set codes for CPACKAGE, CSERIESTOTEXT, and CTEXT.
This shows that the Latin5 data language has "ibm-920_P100-1995" as its character set code.
Product Synonym
Ascential DataStage TX Mercator
Document Information
Modified date:
16 June 2018
UID
swg21264435