A fix is available
APAR status
Closed as new function.
Error description
Make Enterprise COBOL Unicode surrogate pair aware, handle 4 byte characters (COBOL RTE)
Local fix
N/A
Problem summary
****************************************************************
* USERS AFFECTED: Users of Enterprise COBOL V6.2 who want to    *
*                 use intrinsic functions ULENGTH, UPOS,       *
*                 USUBSTR, UWIDTH, and REVERSE to process      *
*                 NATIONAL data items and have them be         *
*                 aware of surrogate pair characters.          *
****************************************************************
* PROBLEM DESCRIPTION: New Function: Users need a way to       *
*                      process NATIONAL data items even when   *
*                      the UTF-16 data contains surrogate pair *
*                      characters. This APAR adds support for  *
*                      NATIONAL data items and includes adding *
*                      surrogate pair awareness to functions   *
*                      ULENGTH, UPOS, USUBSTR, UWIDTH, and     *
*                      REVERSE.                                *
****************************************************************
* RECOMMENDATION: Apply the provided PTF.                      *
****************************************************************

Customers find that they need to create their own user COBOL libraries if they want to process NATIONAL data items using Enterprise COBOL. This APAR adds support in intrinsic functions ULENGTH, UPOS, USUBSTR, UWIDTH, and REVERSE for processing NATIONAL data items.
Problem conclusion
Temporary fix
Comments
The compiler was changed to add support for processing NATIONAL data items with intrinsic functions ULENGTH, UPOS, USUBSTR, UWIDTH, and REVERSE.

+-------------------------------------------------------------+
| Start of changes for:                                       |
| Enterprise COBOL for z/OS Language Reference, SC27-8713-01  |

Chapter 21. Intrinsic functions

UVALID

Change "character string consists of" to "character data item contains", and remove "Unicode" in 2 places in the description of UVALID:

If a character data item contains valid UTF-8 or UTF-16 data, the UVALID function returns the value zero. If a character data item contains invalid UTF-8 or UTF-16 data, the UVALID function returns the index of the first invalid Unicode element.

*** Please insert the examples below after Table 55 in UVALID ***

Example 1
If A is an alphabetic or alphanumeric data item that contains value x'4BC3A4666572' in UTF-8 encoding, the returned value from UVALID(A) is zero.

Example 2
If B is a national data item that contains value x'005400F6006200750072D858DC6B0073' in UTF-16 encoding, the returned value from UVALID(B) is zero.

Example 3
If C is a national data item that contains value x'0054D9C3006200750072D858DC6B0073' in UTF-16 encoding, the returned value from UVALID(C) is two because the high surrogate x'D9C3' is not followed by a low surrogate.

Example 4
If D is a national data item that contains value x'005400F60062DC010072D858DC6B0073' in UTF-16 encoding, the returned value from UVALID(D) is four because the low surrogate x'DC01' is not preceded by a high surrogate.

USUPPLEMENTARY

Change "character string argument that is encoded in UTF-8 or UTF-16" to "character data item that contains UTF-8 or UTF-16 data" in the description of USUPPLEMENTARY:

The USUPPLEMENTARY function returns an integer value that is equal to the index of the first Unicode supplementary character in a character data item argument that is encoded in UTF-8 or UTF-16.

USUBSTR

Change "character string argument that is encoded in UTF-8" to "character data item that contains UTF-8 or UTF-16 data" in the description of USUBSTR:

The USUBSTR function returns a substring of the data in a character data item argument that contains UTF-8 or UTF-16 data.

Add this line:

The function type is alphanumeric or national, depending on the class of argument-1.

Add UTF-16 to the description of argument-1:

argument-1
Must be of class alphabetic, alphanumeric, or national. argument-1 must contain valid UTF-8 or UTF-16 encoded characters:
- If argument-1 is of class alphabetic or alphanumeric, it must contain valid UTF-8 data.
- If argument-1 is of class national, it must contain valid UTF-16 data.

Rewrite the following paragraph to remove "character string" and add UTF-16.

Change:

Suppose argument-2 = n and argument-3 = m, the returned value is an alphanumeric character string that contains m UTF-8 characters in argument-1, starting with the nth character.

To:

Suppose argument-1 is alphabetic or alphanumeric, argument-2 = n and argument-3 = m, the returned value is an alphanumeric item that contains m UTF-8 characters from argument-1, starting with the nth character.
Suppose argument-1 is a national data item, argument-2 = n and argument-3 = m, the returned value is a national item that contains m UTF-16 characters from argument-1, starting with the nth character.

Add a second example:

Example 2
If B is a national item that contains the UTF-16 value x'005400F6006200750072D858DC6B0073', the returned values are as follows:
- USUBSTR(B 1 2) returns x'005400F6'
- USUBSTR(B 2 1) returns x'00F6'
- USUBSTR(B 2 2) returns x'00F60062'
- USUBSTR(B 3 2) returns x'00620075'
- USUBSTR(B 5 2) returns x'0072D858DC6B'
- USUBSTR(B 6 2) returns x'D858DC6B0073'

UPOS

Change "character string argument that is encoded in" to "character data item that contains" and add UTF-16 to the description of UPOS:

The UPOS function returns an integer value that is equal to the index of the nth UTF-8 or UTF-16 character in a character data item argument that contains UTF-8 or UTF-16.

argument-1
Must be of class alphabetic, alphanumeric, or national. argument-1 must contain valid UTF-8 or UTF-16 encoded characters.
- If argument-1 is of class alphabetic or alphanumeric, it must contain valid UTF-8 data.
- If argument-1 is of class national, it must contain valid UTF-16 data.

Suppose argument-1 is alphabetic or alphanumeric and argument-2 = n, the returned value is the byte position of the nth UTF-8 character in argument-1.
Suppose argument-1 is a national data item and argument-2 = n, the returned value is the byte position of the nth UTF-16 character in argument-1.

If argument-2 is not positive or if argument-2 is larger than ULENGTH(argument-1), zero is returned. Otherwise, if argument-2 = n, the returned value is the byte position in argument-1 where the nth UTF-8 or UTF-16 character starts.

Add a second example:

Example 2
If B is a national data item that contains the UTF-16 value x'005400F6006200750072D858DC6B0073', the returned values are as follows:
- UPOS(B 1) returns 1
- UPOS(B 2) returns 3
- UPOS(B 3) returns 5
- UPOS(B 4) returns 7
- UPOS(B 5) returns 9
- UPOS(B 6) returns 11
- UPOS(B 7) returns 15

ULENGTH

Change "character string argument that is encoded in" to "character data item that contains" and add UTF-16 to the description of ULENGTH:

The ULENGTH function returns an integer value that is equal to the number of UTF-8 or UTF-16 characters in a character data item argument that contains UTF-8 or UTF-16 data.

argument-1
Must be of class alphabetic, alphanumeric, or national. The returned value is the number of UTF-8 or UTF-16 characters in argument-1.
- If argument-1 is of class alphabetic or alphanumeric, it must contain valid UTF-8 data.
- If argument-1 is of class national, it must contain valid UTF-16 data.

If argument-1 is a national data item that contains UTF-16 data and argument-1 contains surrogate pairs, each pair of high and low surrogates is counted as one UTF-16 character.

For example, if B is a national item that contains the UTF-16 value x'005400F6006200750072D858DC6B0073', the returned value from ULENGTH(B) is 7. The surrogate pair x'D858DC6B' is counted as one UTF-16 character.

UWIDTH

Change "character string argument that is encoded in" to "character data item that contains" and add UTF-16 to the description of UWIDTH:

The UWIDTH function returns an integer value that is equal to the width in bytes of the nth UTF-8 or UTF-16 character in a character data item argument that is encoded in UTF-8 or UTF-16.

argument-1
Must be of class alphabetic, alphanumeric, or national.
- If argument-1 is of class alphabetic or alphanumeric, it must contain valid UTF-8 data.
- If argument-1 is of class national, it must contain valid UTF-16 data.

argument-2
Must be an integer.

If argument-2 is not positive or if argument-2 is larger than ULENGTH(argument-1), zero is returned. Otherwise, if argument-2 = n, the returned value is the width in bytes of the nth UTF-8 or UTF-16 character in argument-1. The returned value is an integer.

For example, if B is a national data item that contains the UTF-16 value x'005400F6006200750072D858DC6B0073', the returned values are as follows:
- UWIDTH(B 1) returns 2
- UWIDTH(B 2) returns 2
- UWIDTH(B 3) returns 2
- UWIDTH(B 4) returns 2
- UWIDTH(B 5) returns 2
- UWIDTH(B 6) returns 4
- UWIDTH(B 7) returns 2

REVERSE

Change "character string" to "character value" in the description of REVERSE, and add national and UTF-16 details:

The REVERSE function returns a character value of the same length as the argument, whose characters are the same as those specified in the argument except that they are in reverse order. For arguments of type national, character positions are reversed; UTF-16 characters that are surrogate pairs are treated as one character and UTF-16 characters that are not surrogate pairs are treated as one character.

argument-1
Must be of class alphabetic, alphanumeric, or national and must be at least one character in length.
- If argument-1 is of class alphabetic or alphanumeric, it must contain valid UTF-8 data.
- If argument-1 is of class national, it must contain valid UTF-16 data.

Add these examples:

Example 1
If argument-1 is an alphanumeric data item that contains the UTF-8 value x'4BC3A4666572', the returned value is x'726566C3A44B'.

Example 2
If argument-1 is a national data item that contains the UTF-16 value x'0054 00F6 D847DDF3 0062 0075 0072 D858DC6B 0073', the returned value is x'0073 D858DC6B 0072 0075 0062 D847DDF3 00F6 0054'.

| End of changes for:                                         |
| Enterprise COBOL for z/OS Language Reference, SC27-8713-01  |
+-------------------------------------------------------------+
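The UTF-16 examples in the documentation changes above can be cross-checked with a small model. The following Python sketch is not COBOL and is not IBM's implementation; it only mimics the documented results for national (big-endian UTF-16) data supplied as hex strings. The helper names (`units`, `chars`, `uvalid`, `ulength`, `upos`, `uwidth`, `usubstr`, `reverse`) are hypothetical Python functions named after the intrinsic functions, and out-of-range handling in `usubstr` is not described in the source, so it is left unmodeled.

```python
def units(hex_str):
    """Split a hex string into 16-bit UTF-16 code units (spaces ignored)."""
    h = hex_str.replace(" ", "")
    return [int(h[i:i + 4], 16) for i in range(0, len(h), 4)]

def is_high(u):
    return 0xD800 <= u <= 0xDBFF   # high (leading) surrogate range

def is_low(u):
    return 0xDC00 <= u <= 0xDFFF   # low (trailing) surrogate range

def uvalid(hex_str):
    """0 if well-formed, else the 1-based code unit index of the first bad element."""
    us, i = units(hex_str), 0
    while i < len(us):
        if is_high(us[i]):
            if i + 1 < len(us) and is_low(us[i + 1]):
                i += 2             # a complete surrogate pair
                continue
            return i + 1           # high surrogate without a low surrogate
        if is_low(us[i]):
            return i + 1           # low surrogate without a high surrogate
        i += 1
    return 0

def chars(hex_str):
    """Group code units into characters; assumes the input passed uvalid."""
    us, out, i = units(hex_str), [], 0
    while i < len(us):
        n = 2 if is_high(us[i]) else 1
        out.append("".join(f"{u:04X}" for u in us[i:i + n]))
        i += n
    return out

def ulength(h):
    return len(chars(h))

def uwidth(h, n):
    cs = chars(h)
    return len(cs[n - 1]) // 2 if 1 <= n <= len(cs) else 0   # width in bytes

def upos(h, n):
    cs = chars(h)
    if not 1 <= n <= len(cs):
        return 0
    return 1 + sum(len(c) // 2 for c in cs[:n - 1])          # 1-based byte position

def usubstr(h, n, m):
    return "".join(chars(h)[n - 1:n - 1 + m])                # in-range n, m only

def reverse(h):
    return "".join(reversed(chars(h)))

B = "005400F6006200750072D858DC6B0073"
print(ulength(B))                          # 7
print([upos(B, n) for n in range(1, 8)])   # [1, 3, 5, 7, 9, 11, 15]
print(usubstr(B, 5, 2))                    # 0072D858DC6B
```

Grouping the code units into characters once and deriving every function from that list keeps the surrogate pair rule in a single place, which is why the pair x'D858DC6B' shows up as one character with a width of 4 bytes in each result.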
APAR Information
APAR number
PI97434
Reported component name
ENT COBOL FOR Z
Reported component ID
5655EC600
Reported release
620
Status
CLOSED UR1
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2018-05-01
Closed date
2018-05-28
Last modified date
2019-06-04
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Modules/Macros
IGY8RWTU IGYCASMB IGYCCBE IGYCCCRT IGYCCICS IGYCCSRV IGYCDGEN IGYCDIAG IGYCDMAP IGYCDOPT IGYCEN$0 IGYCEN$1 IGYCEN$2 IGYCEN$3 IGYCEN$4 IGYCEN$5 IGYCEN$8 IGYCEN$D IGYCEN$R IGYCFGEN IGYCFREE IGYCINIT IGYCJA$0 IGYCJA$1 IGYCJA$2 IGYCJA$3 IGYCJA$4 IGYCJA$5 IGYCJA$8 IGYCJA$D IGYCJA$R IGYCLIBH IGYCLIBO IGYCLIBR IGYCLSTR IGYCLVL0 IGYCLVL1 IGYCLVL2 IGYCLVL3 IGYCLVL8 IGYCMALL IGYCOB2 IGYCOPI IGYCOPT IGYCOSCN IGYCPGEN IGYCRCTL IGYCRDPR IGYCRDSC IGYCREAL IGYCRWT IGYCSCAN IGYCSIMD IGYCUE$0 IGYCUE$1 IGYCUE$2 IGYCUE$3 IGYCUE$4 IGYCUE$5 IGYCUE$8 IGYCUE$D IGYCUE$R IGYCXREF IGYDRV IGYEQCWI IGYMSGE IGYMSGK IGYMSGT IGYQCBE IGYZQDRV IGYZQENU IGYZQJPN
SC27871301
Fix information
Fixed component name
ENT COBOL FOR Z
Fixed component ID
5655EC600
Applicable component levels
R620 PSY UI56120
UP18/06/01 P F805
R621 PSY UI56121
UP18/06/01 P F805
R622 PSY UI56122
UP18/06/01 P F805
R62H PSY UI56123
UP18/06/01 P F805
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
12 December 2023