IBM Support

PI85691: WMQ 900 RESILIENCY IMPROVEMENT TO DETECT LOOPING WITHIN STORAGE MANAGEMENT FOR MQ

A fix is available

APAR status

  • Closed as program error.

Error description

  • WMQ 900 RESILIENCY IMPROVEMENT TO DETECT LOOPING WITHIN
    STORAGE MANAGEMENT for MQ
    .
    Additional Symptom(s) Search Keyword(s):
    Csect CSQSCON2 Lmod CSQSLD1 storage contraction loop
    Performance High CPU
    Thread 006.SMFACL02 SMFACL02 Latch SMCPVT SMCPHB
    Other threads hang hung hanging wait waiting
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of IBM MQ for z/OS Version 9       *
    *                 Release 0 Modification 0.                    *
    ****************************************************************
    * PROBLEM DESCRIPTION: Following an overlay of MQ owned common *
    *                      storage, CSQSCON2 loops running a SHB   *
    *                      chain while holding the PHB latch.      *
    *                      As the latch is never released, this    *
    *                      causes other MQ tasks to hang waiting   *
    *                      on the latch, leading to application    *
    *                      hangs and timeouts.                     *
    ****************************************************************
    An error in another product or application running on the same
    system overlaid MQ-owned common storage. The overlay introduced
    a loop into one of the SHB chains that MQ uses for control
    blocks in common storage.
    
    Storage contraction runs every 10 minutes for common storage to
    free any ECSA that the queue manager no longer requires. When it
    was next invoked, CSQSCON2 obtained the PHB latch and began
    evaluating each SHB for control blocks in common storage. On
    reaching the overlaid SHB, CSQSCON2 followed the looping chain
    indefinitely while still holding the latch.
    Because other MQ tasks commonly require this latch (both
    internal tasks and tasks for applications connected to MQ), MQ
    and its applications experienced hangs and timeouts. These
    continued until an operator identified the queue manager as the
    source of the hanging tasks and canceled it.
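
The failure mode described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `Block`, `walk_holding_latch`, and `demo` are hypothetical names, a Python `threading.Lock` stands in for the PHB latch, and the `stop` event exists solely so that the demonstration can end. None of this reflects the actual CSQSCON2 code.

```python
import threading


class Block:
    """Hypothetical stand-in for one chained control block."""

    def __init__(self, name):
        self.name = name
        self.next = None


def walk_holding_latch(head, latch, holding, stop):
    """Naive chain walk that ends only at a None link, holding the latch throughout."""
    with latch:
        holding.set()  # tell the demo that the latch is now held
        node = head
        # `stop` is a demo-only escape hatch; the real loop had no such exit.
        while node is not None and not stop.is_set():
            node = node.next


def demo(timeout=0.2):
    # Build a four-block chain, then simulate the overlay by pointing
    # the last block back into the middle of the chain.
    blocks = [Block(f"blk{i}") for i in range(4)]
    for current, following in zip(blocks, blocks[1:]):
        current.next = following
    blocks[3].next = blocks[1]

    latch = threading.Lock()
    holding = threading.Event()
    stop = threading.Event()
    walker = threading.Thread(
        target=walk_holding_latch,
        args=(blocks[0], latch, holding, stop),
        daemon=True,
    )
    walker.start()
    holding.wait()

    # Another task that needs the same latch times out while the walker loops.
    acquired = latch.acquire(timeout=timeout)
    if acquired:
        latch.release()

    stop.set()
    walker.join()
    return acquired
```

In this sketch `demo()` returns `False`: the second task cannot obtain the latch for as long as the walker keeps circling the corrupted chain.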
    

Problem conclusion

  • CSQSCON2 is changed to be more resilient to storage overlays
    that previously caused indefinite loops. While contracting
    storage, CSQSCON2 now detects chains that are looping and, if
    one is found, terminates the queue manager with REASON=00E2000E.
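
The APAR does not state how CSQSCON2 detects a looping chain. As one possible illustration, the sketch below uses Floyd's two-pointer cycle check, a standard technique chosen here as an assumption and not taken from the fix itself. `Block`, `chain_has_loop`, `contract_chain`, and `ChainLoopDetected` are hypothetical names; raising the exception stands in for terminating the queue manager.

```python
class Block:
    """Hypothetical stand-in for one chained control block."""

    def __init__(self, name):
        self.name = name
        self.next = None


class ChainLoopDetected(Exception):
    """Raised in place of terminating the queue manager (illustration only)."""


def chain_has_loop(head):
    """Floyd's check: a fast pointer catches a slow one only if the chain loops."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            return True
    return False


def contract_chain(head):
    """Walk the chain only after confirming that it terminates."""
    if chain_has_loop(head):
        raise ChainLoopDetected("looping chain detected; abandoning contraction")
    visited = 0
    node = head
    while node is not None:
        visited += 1  # a real contraction pass would evaluate the block here
        node = node.next
    return visited


def build_chain(count):
    """Build a well-formed chain of `count` blocks and return them as a list."""
    blocks = [Block(f"blk{i}") for i in range(count)]
    for current, following in zip(blocks, blocks[1:]):
        current.next = following
    return blocks
```

With a well-formed chain from `build_chain(3)`, `contract_chain` visits all three blocks. If the last block's `next` is pointed back at an earlier block, the check fails fast and `ChainLoopDetected` is raised instead of the walk spinning while a latch is held.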
    

Temporary fix

Comments

APAR Information

  • APAR number

    PI85691

  • Reported component name

    MQ Z/OS V9

  • Reported component ID

    5655MQ900

  • Reported release

    000

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2017-08-09

  • Closed date

    2017-09-28

  • Last modified date

    2017-11-01

  • APAR is sysrouted FROM one or more of the following:

    PI85379

  • APAR is sysrouted TO one or more of the following:

    UI50701

Modules/Macros

  • CSQSCON2
    

Fix information

  • Fixed component name

    MQ Z/OS V9

  • Fixed component ID

    5655MQ900

Applicable component levels

  • R000 PSY UI50701

       UP17/10/12 P F710

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
01 November 2017