IBM Support

OA67537: TCPIP LOOP RUNNING INBOUND SRB WHEN ISTLLCIE RETURNS TO INCORRECT CALLER

A fix is available


APAR status

  • Closed as program error.

Error description

  • Inbound packet processing running under an SRB in the TCPIP
    address space abends.  The ISTZRM01 FRR gets control and
    releases the CPU lock.  The R13 register save area (RSA) address
    is extracted and a retry is done to IERETRY in ISTLLCIE.  The
    IERETRY routine uses the extracted RSA to obtain the caller's
    RSA address, restores the caller's registers, and branches to
    R14 to return to the caller.
    When the inbound SRB started, it obtained the CPU lock and used
    the DWA associated with the CPU number as its RSA.  When the FRR
    released the CPU lock, the SRB became interruptible.  Before
    IERETRY obtained the caller's RSA address, an interrupt occurred
    on the CPU for a VTAM managed network device, causing the DWA to
    be used for the disabled interrupt code path.  This caused the
    RSA to be updated with the registers at the time the VTAM module
    was called to process the interrupt.  The RSA in the DWA now has
    an R14 value that points back to the SLIH (second level
    interrupt handler), and the interrupted unit of work, to which
    the SLIH will return control, is the original inbound SRB that
    abended.
    The resulting symptoms from using the DWA RSA after releasing
    the CPU lock are unpredictable.  The reported problem resulted
    in an enabled loop: the inbound SRB branched to the R14 address
    in the SLIH, and the SLIH restored the interrupted PSW, so the
    code was stuck in a cycle of loading the interrupted PSW and
    giving control to code that loaded the interrupted PSW again.
    Interrupts were able to run because the SRB was enabled at this
    point, but the interrupted PSW address was always the same, so
    the loop resumed as soon as each interrupt exited.
    
    VERIFICATION STEPS:
    If the loop occurs, a dump that includes the system trace table
    will show an enabled loop for an SRB with TCPIP as the home
    address space and the PSW in IEAVEIO.
    
    ADDITIONAL SYMPTOMS:
    Because of the loop, work may be delayed if the LPAR is capped.
    OMPROUTE may be delayed in sending HELLO packets, and the
    dispatching delay may cause loss of adjacency and lost routes.
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All users of the IBM Communications Server for z/OS 2.5 and  *
    * 3.1                                                          *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * Loop in the second level interrupt handler occurs when an    *
    * abend processing an OSA incorrectly releases the CPU lock.   *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply PTF                                                    *
    ****************************************************************
    ISTLLCIE is running for inbound OSA packet processing and an
    abend occurs.  The recovery routine, ISTZRM01, gets control,
    releases the CPU lock, and passes the R13 value back to the
    retry routine in ISTLLCIE.  The retry routine accesses the
    storage addressed by R13, which is only valid while the CPU
    lock is held, and uses it to restore the caller's registers.
    By the time that storage is accessed, it has been reused by the
    second level interrupt handler (SLIH), resulting in the SLIH
    getting control with the CPU lock released.
    

Problem conclusion

  • ISTZRM01 has been amended to extract the caller's R13 address
    before releasing the CPU lock and to pass the extracted value
    to the retry routine in ISTLLCIE.
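
The ordering change can be illustrated with a toy model. This is a minimal sketch, not the VTAM code: the names PerCpuWorkArea, Cpu, recovery_old_order, recovery_new_order and run are all hypothetical, and the interrupt is simulated as something that runs as soon as the lock is released and overwrites the per-CPU area. It only shows why a value held in lock-protected storage should be copied out before the lock is released.

```python
from dataclasses import dataclass, field


@dataclass
class PerCpuWorkArea:
    """Hypothetical stand-in for the DWA used as a register save area."""
    caller_rsa: str = ""


@dataclass
class Cpu:
    """Hypothetical CPU whose work area is only stable while the lock is held."""
    lock_held: bool = False
    interrupt_pending: bool = False
    dwa: PerCpuWorkArea = field(default_factory=PerCpuWorkArea)

    def obtain_lock(self) -> None:
        self.lock_held = True

    def release_lock(self) -> None:
        self.lock_held = False
        # Assumption for the model: once the lock is released, a pending
        # interrupt runs at once and reuses the work area for its own save area.
        if self.interrupt_pending:
            self.interrupt_pending = False
            self.dwa.caller_rsa = "SLIH"


def recovery_old_order(cpu: Cpu) -> str:
    """Release the lock first, then read the work area (the failing order)."""
    cpu.release_lock()
    return cpu.dwa.caller_rsa


def recovery_new_order(cpu: Cpu) -> str:
    """Copy the caller's value while the lock is held, then release the lock."""
    saved = cpu.dwa.caller_rsa
    cpu.release_lock()
    return saved


def run(recovery) -> str:
    """Set up an inbound unit of work, queue an interrupt, and run recovery."""
    cpu = Cpu()
    cpu.obtain_lock()
    cpu.dwa.caller_rsa = "inbound-caller"
    cpu.interrupt_pending = True
    return recovery(cpu)


if __name__ == "__main__":
    print("old order returns to:", run(recovery_old_order))
    print("new order returns to:", run(recovery_new_order))
```

In this model the old order hands back the interrupt handler's value, while the new order hands back the value stored by the original caller.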
    

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

APAR Information

  • APAR number

    OA67537

  • Reported component name

    VTAM MVS/ESA

  • Reported component ID

    569511701

  • Reported release

    250

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2025-02-19

  • Closed date

    2025-05-02

  • Last modified date

    2025-06-02

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UJ97105 UJ97106 OA68080

Modules/Macros

  • ISTZRM01 ISTLLCFB ISTLLCIE ISTS2RMQ ISTSRRMQ IUTLLRTP ISTLLCAD
    ISTLLCOO ISTLLCRB ISTSRIS8 ISTTSCCN
    

Fix information

  • Fixed component name

    VTAM MVS/ESA

  • Fixed component ID

    569511701

Applicable component levels

  • R310 PSY UJ97106

       UP25/05/28 P F505

  • R250 PSY UJ97105

       UP25/05/28 P F505

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.


Document Information

Modified date:
17 June 2025