Short URL: https://ibm.biz/rpgcafe_hspec_ccsid_char
RPG Cafe: CCSID(*CHAR) keyword
CCSID(*CHAR : *JOBRUN)
Do you have this keyword in your H-spec or CTL-OPT copy file that gets copied into all your source?
Or if you don't use a copy file, do you have this keyword in all your source?
You should almost certainly have this keyword. It was introduced officially in V5R3, and through PTFs in V5R1 and V5R2.
Why do you need this keyword?
RPG has a very strange default way of dealing with the CCSID of alphanumeric data. It should be assuming that the alphanumeric data is in the job CCSID, but instead it incorrectly assumes that it is in the mixed SBCS + DBCS (single-byte/double-byte) CCSID related to the job CCSID.
What effect does this odd default behavior have on your programs?
Usually, it has no effect at all. It makes a difference only when all of the following are true:
- Your job CCSID is one of the single-byte EBCDIC CCSIDs (the CCSIDs in Tables 2, 4, and 6 on this page https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_74/db2/rbafzsidvals.htm)
- Your alphanumeric data contains X'0E'
- That data is converted to Unicode, either explicitly by using the %UCS2 built-in function, or implicitly for a call to a Java™ method or for an assignment or comparison within the RPG program.
So it might not ever matter that RPG is making a false assumption about your alphanumeric data.
But you should use this keyword anyway, because coding it costs nothing in program performance, and someday it might matter.
Why would you ever want the default behavior?
In the very unlikely event that your RPG program might be running in a job with an SBCS CCSID and it might process data that contains mixed SBCS + DBCS data, then consider changing the job CCSID to reflect the nature of the data that the program is dealing with. But if that is not possible, then do not code this keyword.
What is special about X'0E'?
When EBCDIC data can be both SBCS and DBCS, shift characters are used to indicate a transition between the two types. The default type is SBCS, and the "shift-out" character X'0E' is used to indicate that the type is shifting out of SBCS, and that DBCS data follows. The "shift-in" character X'0F' is used to indicate that the type of data is shifting back into SBCS, and that SBCS data follows. For example, assume that o and i are the shift characters, and that DD and EE are double-byte characters. The string "abcoDDEEiefgh" has
- Three single-byte characters "abc"
- A shift-out character
- Two double-byte characters "DDEE"
- A shift-in character
- Four single-byte characters "efgh"
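The breakdown above can be sketched as a small scanner. This is an illustrative Python model only, not RPG and not an IBM i conversion API. The helper name `split_mixed` is hypothetical, and ASCII letters stand in for the real EBCDIC code points of the example string.

```python
SHIFT_OUT = 0x0E  # shifts out of SBCS; DBCS data follows
SHIFT_IN = 0x0F   # shifts back into SBCS; SBCS data follows

def split_mixed(data):
    """Split mixed SBCS + DBCS bytes into (kind, bytes) runs (hypothetical helper)."""
    runs = []
    kind = "SBCS"          # the default type is SBCS
    current = bytearray()
    for byte in data:
        if byte == SHIFT_OUT and kind == "SBCS":
            if current:
                runs.append((kind, bytes(current)))
            runs.append(("SO", bytes([byte])))
            kind, current = "DBCS", bytearray()
        elif byte == SHIFT_IN and kind == "DBCS":
            if current:
                runs.append((kind, bytes(current)))
            runs.append(("SI", bytes([byte])))
            kind, current = "SBCS", bytearray()
        else:
            current.append(byte)
    if current:
        runs.append((kind, bytes(current)))
    return runs

# "abcoDDEEiefgh", with o and i replaced by the real shift bytes.
sample = b"abc" + bytes([SHIFT_OUT]) + b"DDEE" + bytes([SHIFT_IN]) + b"efgh"
runs = split_mixed(sample)
for kind, chunk in runs:
    print(kind, chunk)
```

The DBCS run holds four bytes, which a mixed-CCSID reader interprets as two double-byte characters.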
What exactly happens when single-byte data has X'0E', and you don't have the CCSID(*CHAR:*JOBRUN) keyword?
When EBCDIC data can only be SBCS, the X'0E' and X'0F' characters are not supposed to have any special meaning.
The string above has 13 single-byte characters. When correctly converted to UCS-2, it should have 13 UCS-2 characters.
If the conversion wrongly assumes that the data is mixed, the result has only 9 characters (3 SBCS "abc" + 2 DBCS "DDEE" + 4 SBCS "efgh"): the two shift characters are consumed as control codes, and each pair of bytes between them is treated as a single double-byte character.
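The two character counts can be checked with a simplified model. This Python sketch only counts characters under each assumption; it is not how RPG or the system performs the conversion, and the function name `converted_length` is hypothetical. ASCII letters stand in for the EBCDIC bytes of the example string.

```python
SHIFT_OUT = 0x0E
SHIFT_IN = 0x0F

def converted_length(data, treat_as_mixed):
    """Count the characters a conversion would produce (simplified model)."""
    if not treat_as_mixed:
        # Pure SBCS: every byte, including X'0E' and X'0F', is one character.
        return len(data)
    count = 0
    in_dbcs = False
    i = 0
    while i < len(data):
        byte = data[i]
        if not in_dbcs and byte == SHIFT_OUT:
            in_dbcs = True      # shift-out is consumed, not counted
            i += 1
        elif in_dbcs and byte == SHIFT_IN:
            in_dbcs = False     # shift-in is consumed, not counted
            i += 1
        elif in_dbcs:
            count += 1          # two bytes form one double-byte character
            i += 2
        else:
            count += 1          # one byte is one single-byte character
            i += 1
    return count

sample = b"abc" + bytes([SHIFT_OUT]) + b"DDEE" + bytes([SHIFT_IN]) + b"efgh"
sbcs_count = converted_length(sample, treat_as_mixed=False)
mixed_count = converted_length(sample, treat_as_mixed=True)
print(sbcs_count, mixed_count)  # prints "13 9"
```

The difference of four characters comes from the two dropped shift bytes plus the two byte pairs that collapse into single characters.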
But I never have X'0E' in my single-byte alphanumeric data!
Well, hardly ever ...
16 December 2019