IBM Support

PI97434: MAKE ENTERPRISE COBOL UNICODE SURROGATE PAIR AWARE, HANDLE 4 BYTE CHARACTERS (COBOL RTE)

A fix is available

APAR status

  • Closed as new function.

Error description

  • Make Enterprise COBOL Unicode surrogate pair aware, handle 4
    byte characters (COBOL RTE)
    

Local fix

  • N/A
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: Users of Enterprise COBOL V6.2 who want to   *
    *                 use intrinsic functions ULENGTH, UPOS,       *
    *                 USUBSTR, UWIDTH, and REVERSE to process      *
    *                 NATIONAL data items and have them be         *
    *                 aware of surrogate pair characters.          *
    ****************************************************************
    * PROBLEM DESCRIPTION: New Function: Users need a way to       *
    *                      process NATIONAL data items even when   *
    *                      the UTF-16 data contains surrogate pair *
    *                      characters. This APAR adds support for  *
    *                      NATIONAL data items and includes adding *
    *                      surrogate pair awareness to functions   *
    *                      ULENGTH, UPOS, USUBSTR, UWIDTH, and     *
    *                      REVERSE.                                *
    ****************************************************************
    * RECOMMENDATION: Apply the provided PTF.                      *
    ****************************************************************
    Customers find that they need to create their own user COBOL
    libraries if they want to process NATIONAL data items using
    Enterprise COBOL. This APAR provides added support in intrinsic
    functions ULENGTH, UPOS, USUBSTR, UWIDTH, and REVERSE for
    processing NATIONAL data items.
    

Problem conclusion

Temporary fix

Comments

  • The compiler was changed to add support for processing NATIONAL
    data items with intrinsic functions ULENGTH, UPOS, USUBSTR,
    UWIDTH, and REVERSE.
    +-------------------------------------------------------------+
    | Start of changes for:                                       |
    | Enterprise COBOL for z/OS Language Reference, SC27-8713-01  |
    Chapter 21. Intrinsic functions
    UVALID:
    Change "character string consists of" to
    "character data item contains", and remove "Unicode" in 2
    places in the description of UVALID:
    If a character data item contains valid UTF-8 or UTF-16 data,
    the UVALID function returns the value zero. If a character
    data item contains invalid UTF-8 or UTF-16  data, the UVALID
    function returns the index of the first invalid Unicode
    element.
    *** Please insert the examples below after Table 55 in
    UVALID ***
    Example 1
    If A is an alphabetic or alphanumeric data item that
    contains value x'4BC3A4666572' in UTF-8 encoding, the returned
    value from UVALID(A) is zero.
    Example 2
    If B is a national data item that contains value
    x'005400F6006200750072D858DC6B0073' in UTF-16 encoding, the
    returned value from UVALID(B) is zero.
    Example 3
    If C is a national data item that contains value
    x'0054D9C3006200750072D858DC6B0073' in UTF-16 encoding, the
    returned value from UVALID(C) is two because x'D9C3' is a
    high surrogate that is not followed by a low surrogate.
    Example 4
    If D is a national data item that contains value
    x'005400F60062DC010072D858DC6B0073' in UTF-16 encoding,
    the returned value from UVALID(D) is four because x'DC01'
    is a low surrogate that is not preceded by a high surrogate.
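The surrogate pairing rule behind Examples 2 through 4 can be modeled outside COBOL. The following Python sketch is an illustration only, not the COBOL runtime's implementation. The name uvalid_utf16 is hypothetical, the input is assumed to be big-endian UTF-16 with an even number of bytes, and the returned index is assumed to count 2-byte elements, which is consistent with the values two and four in Examples 3 and 4.

```python
def uvalid_utf16(data: bytes) -> int:
    """Model of UVALID for UTF-16BE input (hypothetical helper).

    Returns 0 when every surrogate is correctly paired, otherwise the
    1-based index of the first offending 2-byte element.
    """
    # Split the bytes into 16-bit code units (assumes an even byte count).
    units = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: the next unit must be a low surrogate.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2
                continue
            return i + 1
        if 0xDC00 <= unit <= 0xDFFF:
            # Low surrogate that was not consumed as part of a pair.
            return i + 1
        i += 1
    return 0


print(uvalid_utf16(bytes.fromhex("005400F6006200750072D858DC6B0073")))  # 0
print(uvalid_utf16(bytes.fromhex("0054D9C3006200750072D858DC6B0073")))  # 2
print(uvalid_utf16(bytes.fromhex("005400F60062DC010072D858DC6B0073")))  # 4
```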
    USUPPLEMENTARY
    Change "character string argument that is encoded in UTF-8 or
    UTF-16" to "character data item that contains UTF-8 or UTF-16
    data" in the description of USUPPLEMENTARY:
    The USUPPLEMENTARY function returns an integer value that is
    equal to the index of the first Unicode supplementary
    character in a character data item argument
    that contains UTF-8 or UTF-16 data.
    USUBSTR:
    Change "character string argument that is encoded in UTF-8"
    to "character data item that contains UTF-8 or UTF-16 data" in
    the description of USUBSTR:
    The USUBSTR function returns a substring of the data in a
    character data item argument that contains UTF-8 or
    UTF-16 data.
    Add this line:
    The function type is alphanumeric or national, depending
    on the class of argument-1.
    Add UTF-16 to description of argument-1:
    argument-1
    Must be of class alphabetic, alphanumeric, or national.
    argument-1 must contain valid
    UTF-8 or UTF-16 encoded characters:
    - If argument-1 is of class alphabetic or alphanumeric,
    it must contain valid UTF-8 data.
    - If argument-1 is of class national, it must contain
    valid UTF-16 data.
    Rewrite the following paragraph to remove character string
    and add UTF-16:
    Change:
    Suppose argument-2 = n and argument-3 = m, the returned
    value is an alphanumeric character string that contains
    m UTF-8 characters in argument-1, starting with the
    nth character.
    To:
    Suppose argument-1 is alphabetic or alphanumeric,
    argument-2 = n and argument-3 = m, the returned value is
    an alphanumeric item that contains m UTF-8 characters
    from argument-1, starting with the nth character.
    Suppose argument-1 is a national data item, argument-2 = n
    and argument-3 = m, the returned value is a national item
    that contains m UTF-16 characters from argument-1,
    starting with the nth character.
    Add a second example:
    Example 2
    If B is a national item that contains the UTF-16
    value x'005400F6006200750072D858DC6B0073',
    the returned values are as follows:
    - USUBSTR(B 1 2) returns x'005400F6'
    - USUBSTR(B 2 1) returns x'00F6'
    - USUBSTR(B 2 2) returns x'00F60062'
    - USUBSTR(B 3 2) returns x'00620075'
    - USUBSTR(B 5 2) returns x'0072D858DC6B'
    - USUBSTR(B 6 2) returns x'D858DC6B0073'
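The character-based indexing in Example 2 can be reproduced with a short Python sketch. This is a model of the documented results, not IBM's implementation: usubstr_utf16 is a hypothetical name, the data is assumed to be valid big-endian UTF-16, and the behavior for out-of-range values of argument-2 or argument-3 is not modeled.

```python
def usubstr_utf16(data: bytes, n: int, m: int) -> bytes:
    """Model of USUBSTR for UTF-16BE input (hypothetical helper)."""
    # Decoding yields one str element per Unicode character, so a
    # surrogate pair occupies a single position in `text`.
    text = data.decode("utf-16-be")
    # Take m characters starting at the nth character (1-based).
    return text[n - 1:n - 1 + m].encode("utf-16-be")


B = bytes.fromhex("005400F6006200750072D858DC6B0073")
print(usubstr_utf16(B, 1, 2).hex().upper())  # 005400F6
print(usubstr_utf16(B, 5, 2).hex().upper())  # 0072D858DC6B
print(usubstr_utf16(B, 6, 2).hex().upper())  # D858DC6B0073
```

Because the slice is taken over characters rather than 2-byte elements, the surrogate pair x'D858DC6B' is never split.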
    UPOS
    Change "character string argument that is encoded in"
    to "character data item that contains" and add UTF-16
    to the description of UPOS:
    The UPOS function returns an integer value that is equal
    to the index of the nth UTF-8 or UTF-16 character in a
    character data item argument that contains UTF-8 or UTF-16 data.
    argument-1
    Must be of class alphabetic, alphanumeric, or national.
    argument-1 must contain valid
    UTF-8 or UTF-16 encoded characters.
    - If argument-1 is of class alphabetic or alphanumeric,
    it must contain valid UTF-8 data.
    - If argument-1 is of class national, it must contain
    valid UTF-16 data.
    Suppose argument-1 is alphabetic or alphanumeric and
    argument-2=n, the returned value is the byte position
    of the nth UTF-8 character in argument-1.
    Suppose argument-1 is a national data item and
    argument-2=n, the returned value is the byte
    position of the nth UTF-16 character in argument-1.
    If argument-2 is not positive or if argument-2
    is larger than ULENGTH(argument-1),
    zero is returned. Otherwise, if argument-2=n,
    the returned value is the byte position
    in argument-1 where the nth UTF-8 or
    UTF-16 character starts.
    Add a second example:
    Example 2
    If B is a national data item that contains the UTF-16
    value x'005400F6006200750072D858DC6B0073',
    the returned values are as follows:
    - UPOS(B 1) returns 1
    - UPOS(B 2) returns 3
    - UPOS(B 3) returns 5
    - UPOS(B 4) returns 7
    - UPOS(B 5) returns 9
    - UPOS(B 6) returns 11
    - UPOS(B 7) returns 15
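The byte positions in Example 2 follow from adding the widths of the preceding characters. A minimal Python sketch of that arithmetic is given here, assuming valid big-endian UTF-16 input. The name upos_utf16 is hypothetical and the code is a model of the documented results, not the COBOL runtime.

```python
def upos_utf16(data: bytes, n: int) -> int:
    """Model of UPOS for UTF-16BE input (hypothetical helper)."""
    text = data.decode("utf-16-be")  # one element per Unicode character
    if n <= 0 or n > len(text):
        # Not positive, or larger than the character count: return zero.
        return 0
    # 1-based byte position = 1 + bytes occupied by the first n-1 characters.
    return 1 + len(text[:n - 1].encode("utf-16-be"))


B = bytes.fromhex("005400F6006200750072D858DC6B0073")
print([upos_utf16(B, n) for n in range(1, 8)])  # [1, 3, 5, 7, 9, 11, 15]
print(upos_utf16(B, 8))                         # 0
```

The jump from 11 to 15 reflects the 4-byte surrogate pair that forms the sixth character.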
    ULENGTH
    Change "character string argument that is encoded in" to
    "character data item that contains" and add UTF-16 to the
    description of ULENGTH:
    The ULENGTH function returns an integer value that is equal to
    the number of UTF-8 or UTF-16 characters in a character data
    item argument that contains UTF-8 or UTF-16 data.
    argument-1
    Must be of class alphabetic, alphanumeric, or national.
    The returned value is the number of UTF-8 or UTF-16 characters
    in argument-1.
    - If argument-1 is of class alphabetic or alphanumeric, it must
    contain valid UTF-8 data.
    - If argument-1 is of class national, it must contain valid
    UTF-16 data.
    If argument-1 is a national data item that contains UTF-16 data
    and argument-1 contains surrogate pairs, each pair of high and
    low surrogates is counted as one UTF-16 character. For
    example, if B is a national item that contains the UTF-16
    value x'005400F6006200750072D858DC6B0073', the returned value
    from ULENGTH(B) is 7. The character x'D858DC6B' is counted
    as 1 UTF-16 character.
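The difference between counting 2-byte elements and counting characters can be sketched in Python. The function name ulength_utf16 is hypothetical and the input is assumed to be valid big-endian UTF-16 with an even byte count; this models the documented result and is not the COBOL runtime's code.

```python
def ulength_utf16(data: bytes) -> int:
    """Model of ULENGTH for valid UTF-16BE input (hypothetical helper)."""
    units = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    # In valid UTF-16 every character begins with a unit that is not a
    # low surrogate, so skipping low surrogates counts each pair once.
    return sum(1 for unit in units if not 0xDC00 <= unit <= 0xDFFF)


B = bytes.fromhex("005400F6006200750072D858DC6B0073")
print(len(B) // 2)       # 8 two-byte elements
print(ulength_utf16(B))  # 7 characters
```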
    UWIDTH
    Change "character string argument that is encoded in" to
    "character data item that contains" and add UTF-16 to the
    description of UWIDTH:
    The UWIDTH function returns an integer value that is equal to
    the width in bytes of the nth UTF-8 or UTF-16 character in a
    character data item argument that contains UTF-8
    or UTF-16 data.
    argument-1
    Must be of class alphabetic, alphanumeric, or national.
    - If argument-1 is of class alphabetic or alphanumeric,
    it must contain valid UTF-8 data.
    - If argument-1 is of class national, it must contain valid
    UTF-16 data.
    argument-2
    Must be an integer.
    If argument-2 is not positive or if argument-2 is larger than
    ULENGTH(argument-1), zero is returned. Otherwise, if
    argument-2=n, the returned value is the width in bytes of the
    nth UTF-8 or UTF-16 character in argument-1.
    The returned value is an integer.
    For example, if B is a national data item that contains the
    UTF-16 value x'005400F6006200750072D858DC6B0073', the returned
    values are as follows:
    - UWIDTH(B 1) returns 2
    - UWIDTH(B 2) returns 2
    - UWIDTH(B 3) returns 2
    - UWIDTH(B 4) returns 2
    - UWIDTH(B 5) returns 2
    - UWIDTH(B 6) returns 4
    - UWIDTH(B 7) returns 2
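The widths listed above can be checked with a small Python sketch. It is an illustrative model under stated assumptions (valid big-endian UTF-16 input), not IBM's implementation, and uwidth_utf16 is a hypothetical name.

```python
def uwidth_utf16(data: bytes, n: int) -> int:
    """Model of UWIDTH for UTF-16BE input (hypothetical helper)."""
    text = data.decode("utf-16-be")  # one element per Unicode character
    if n <= 0 or n > len(text):
        # Not positive, or larger than the character count: return zero.
        return 0
    # Re-encode only the nth character and measure it in bytes.
    return len(text[n - 1].encode("utf-16-be"))


B = bytes.fromhex("005400F6006200750072D858DC6B0073")
print([uwidth_utf16(B, n) for n in range(1, 8)])  # [2, 2, 2, 2, 2, 4, 2]
print(uwidth_utf16(B, 8))                         # 0
```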
    REVERSE
    Change "character string" to "character value" in the
    description of REVERSE, and add national and UTF-16 details:
    The REVERSE function returns a character value of the same
    length as the argument, whose characters are the same as those
    specified in the argument except that they are in reverse order.
    For arguments of type national, character positions are
    reversed; each surrogate pair is treated as one character,
    as is each UTF-16 character that is not part of a
    surrogate pair.
    argument-1
    Must be class alphabetic, alphanumeric, or national and must be
    at least one character in length.
    - If argument-1 is of class alphabetic or alphanumeric, it must
    contain valid UTF-8 data.
    - If argument-1 is of class national, it must contain valid
    UTF-16 data.
    Add these examples:
    Example 1
    If argument-1 is an alphanumeric data item that contains the
    UTF-8 value x'4BC3A4666572', the returned value is
    x'726566C3A44B'.
    Example 2
    If argument-1 is a national data item that contains the UTF-16
    value x'0054 00F6 D847DDF3 0062 0075 0072 D858DC6B 0073',
    the returned value is
    x'0073 D858DC6B 0072 0075 0062 D847DDF3 00F6 0054'.
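Both REVERSE examples can be reproduced with a Python sketch that reverses whole characters instead of bytes. The function reverse_chars and its encoding parameter are hypothetical; the sketch assumes valid input and big-endian UTF-16 for national data, and it models the documented results without claiming to match the runtime's implementation.

```python
def reverse_chars(data: bytes, encoding: str) -> bytes:
    """Model of REVERSE on encoded text (hypothetical helper).

    Decoding first means multi-byte UTF-8 sequences and UTF-16
    surrogate pairs move as single characters when reversed.
    """
    return data.decode(encoding)[::-1].encode(encoding)


# Example 1: UTF-8 data (bytes.fromhex ignores the spaces).
utf8_in = bytes.fromhex("4B C3A4 66 65 72")
print(reverse_chars(utf8_in, "utf-8").hex().upper())  # 726566C3A44B

# Example 2: UTF-16 data with two surrogate pairs.
utf16_in = bytes.fromhex("0054 00F6 D847DDF3 0062 0075 0072 D858DC6B 0073")
print(reverse_chars(utf16_in, "utf-16-be").hex().upper())
# 0073D858DC6B007200750062D847DDF300F60054
```

A plain byte-wise or element-wise reversal would instead emit the low surrogate before the high surrogate and produce invalid UTF-16.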
    | End of changes for:                                         |
    | Enterprise COBOL for z/OS Language Reference, SC27-8713-01  |
    +-------------------------------------------------------------+
    

APAR Information

  • APAR number

    PI97434

  • Reported component name

    ENT COBOL FOR Z

  • Reported component ID

    5655EC600

  • Reported release

    620

  • Status

    CLOSED UR1

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2018-05-01

  • Closed date

    2018-05-28

  • Last modified date

    2019-06-04

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Modules/Macros

  • IGY8RWTU IGYCASMB IGYCCBE  IGYCCCRT IGYCCICS IGYCCSRV IGYCDGEN
    IGYCDIAG IGYCDMAP IGYCDOPT IGYCEN$0 IGYCEN$1 IGYCEN$2 IGYCEN$3
    IGYCEN$4 IGYCEN$5 IGYCEN$8 IGYCEN$D IGYCEN$R IGYCFGEN IGYCFREE
    IGYCINIT IGYCJA$0 IGYCJA$1 IGYCJA$2 IGYCJA$3 IGYCJA$4 IGYCJA$5
    IGYCJA$8 IGYCJA$D IGYCJA$R IGYCLIBH IGYCLIBO IGYCLIBR IGYCLSTR
    IGYCLVL0 IGYCLVL1 IGYCLVL2 IGYCLVL3 IGYCLVL8 IGYCMALL IGYCOB2
    IGYCOPI  IGYCOPT  IGYCOSCN IGYCPGEN IGYCRCTL IGYCRDPR IGYCRDSC
    IGYCREAL IGYCRWT  IGYCSCAN IGYCSIMD IGYCUE$0 IGYCUE$1 IGYCUE$2
    IGYCUE$3 IGYCUE$4 IGYCUE$5 IGYCUE$8 IGYCUE$D IGYCUE$R IGYCXREF
    IGYDRV   IGYEQCWI IGYMSGE  IGYMSGK  IGYMSGT  IGYQCBE  IGYZQDRV
    IGYZQENU IGYZQJPN
    

Publications Referenced
SC27871301    

Fix information

  • Fixed component name

    ENT COBOL FOR Z

  • Fixed component ID

    5655EC600

Applicable component levels

  • R620 PSY UI56120

       UP18/06/01 P F805

  • R621 PSY UI56121

       UP18/06/01 P F805

  • R622 PSY UI56122

       UP18/06/01 P F805

  • R62H PSY UI56123

       UP18/06/01 P F805

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
12 December 2023