IBM Support

PH12062: MQ: DIAGNOSTIC APAR FOR A HANG AFTER A ROB PROTOCOL VIOLATION

A fix is available

APAR status

  • Closed as program error.

Error description

  • MQ threads hung due to latch contention where the latch was no
    longer held, but the waiter was not correctly resumed.
    
    Symptoms of the problem vary depending on which threads are
    waiting for the latch.  For the reported case, messages built
    up on SYSTEM.CLUSTER.TRANSMIT.QUEUE because the channel was the
    latch waiter that was not properly resumed.
    
    The ROB control block for the waiting task had evidence that
    the resume had been attempted. The MVS RELEASE may have been
    done for a different Pause Element Token (PET) than was used
    for the pause. Unfortunately, the problem was detected long
    after it occurred, which makes it difficult to investigate
    fully.
    
    This is similar to the situation in DB2 APAR PI55964, which
    provides additional diagnostic information to detect errors in
    the suspend and resume protocol. This APAR ports that code to
    MQ.
    
    Additional Symptom(s) Search Keyword(s):
    wait waiting hanging
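
The token mismatch suspected above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Rob`, `PauseService`, and their fields are hypothetical names invented for illustration, and the code does not reflect the real ROB layout or the MVS pause and release services. It shows how a release issued against a different token leaves "resume attempted" evidence on the control block while the waiter stays paused.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Rob:
    """Hypothetical stand-in for a task's control block (not the real ROB layout)."""
    task: str
    paused_on: Optional[int] = None   # token the task is currently paused on
    resume_attempted: bool = False    # evidence that some task tried to resume it


class PauseService:
    """Toy token-keyed pause/release service (an assumption, not the MVS API)."""

    def __init__(self) -> None:
        self._next_token = 1
        self._waiters: Dict[int, Rob] = {}   # token -> ROB paused on that token

    def allocate_token(self) -> int:
        token = self._next_token
        self._next_token += 1
        return token

    def pause(self, rob: Rob, token: int) -> None:
        rob.paused_on = token
        self._waiters[token] = rob

    def release(self, rob: Rob, token: int) -> bool:
        """Record the attempt on the ROB, then wake whoever is paused on `token`."""
        rob.resume_attempted = True
        waiter = self._waiters.pop(token, None)
        if waiter is None:
            return False          # nobody is paused on this token
        waiter.paused_on = None
        return True


svc = PauseService()
channel = Rob(task="channel")
stale_token = svc.allocate_token()
current_token = svc.allocate_token()

svc.pause(channel, current_token)
woke = svc.release(channel, stale_token)   # resumer uses the wrong token

print(woke, channel.resume_attempted, channel.paused_on == current_token)
```

In this model the release returns `False`, yet the control block records that a resume was attempted and the channel task remains paused on its current token, which matches the shape of the reported symptom.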
    

Local fix

  • There is no preventive workaround. The problem is cleared by
    recycling the queue manager.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of IBM MQ for z/OS Version 9       *
    *                 Release 0 Modification 0 and Release 1       *
    *                 Modification 0.                              *
    ****************************************************************
    * PROBLEM DESCRIPTION: An MQ task that had been suspended by   *
    *                      CSQVSUSP processing was not resumed     *
    *                      as expected by CSQVRESM, causing the    *
    *                      task to hang.                           *
    *                      Dump analysis was not able to determine *
    *                      the cause of the hang.                  *
    *                                                              *
    *                      Additionally, in some situations        *
    *                      diagnostic dumps requested by MQ are    *
    *                      not successfully captured.              *
    ****************************************************************
    A task issued CSQVSUSP to suspend processing until resumed by
    another task.
    CSQVRESM was issued for the associated ROB; however, the
    suspended task remained paused.
    A dump taken when the hang was observed did not give sufficient
    information to determine why the resume request had not released
    the suspended task.
    
    Additional keywords: CSQWDSDM 0C4 S0C4
    

Problem conclusion

  • CSQVSRX is enhanced to check for unexpected changes in the state
    of a ROB, such as would be seen if a ROB was invalidly reused.
    If an invalid state is seen, a diagnostic dump is taken by
    CSQVSRX with abend code 5C6-00E5xxA1, where xx depends on the
    invalid state detected.
    
    In addition, an error in recovery processing that prevented
    some MQ dumps, including those added by this APAR, from being
    captured is fixed, allowing such diagnostic dumps to be
    produced.
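
The idea of checking ROB state at each suspend and resume, so that a violation is reported when it happens rather than when the hang is noticed, might be sketched as follows. This is a minimal illustration, not the CSQVSRX implementation: the two states, the `CheckedRob` class, and the `ProtocolViolation` exception are assumptions made for the example, and the real checks and the values of xx in the abend reason code are not described in this APAR.

```python
from enum import Enum
from typing import List, Tuple


class RobState(Enum):
    # Assumed states for this sketch; the real ROB state model is not documented here.
    IDLE = "idle"
    SUSPENDED = "suspended"


class ProtocolViolation(Exception):
    """Raised at the moment an unexpected state change is seen."""


class CheckedRob:
    def __init__(self, owner: str) -> None:
        self.owner = owner
        self.state = RobState.IDLE
        self.history: List[Tuple[str, str]] = []   # (operation, requester)

    def _expect(self, expected: RobState, op: str, requester: str) -> None:
        # Fail immediately, carrying the history, instead of hanging later.
        if self.state is not expected:
            raise ProtocolViolation(
                f"{op} by {requester} on ROB owned by {self.owner}: "
                f"state is {self.state.value}, expected {expected.value}; "
                f"history={self.history}"
            )
        self.history.append((op, requester))

    def suspend(self, requester: str) -> None:
        self._expect(RobState.IDLE, "suspend", requester)
        self.state = RobState.SUSPENDED

    def resume(self, requester: str) -> None:
        self._expect(RobState.SUSPENDED, "resume", requester)
        self.state = RobState.IDLE


rob = CheckedRob(owner="channel")
rob.suspend("channel")
try:
    rob.suspend("other-task")      # invalid reuse of a ROB that is still suspended
    violation = None
except ProtocolViolation as exc:
    violation = str(exc)

print(violation)
```

Raising at the point of the invalid transition plays the role that the diagnostic dump plays in the fix: the evidence is captured while the offending request is still identifiable.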
    

Temporary fix

Comments

APAR Information

  • APAR number

    PH12062

  • Reported component name

    IBM MQ Z/OS V9

  • Reported component ID

    5655MQ900

  • Reported release

    000

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2019-05-14

  • Closed date

    2019-06-28

  • Last modified date

    2019-08-01

  • APAR is sysrouted FROM one or more of the following:

    PI96988

  • APAR is sysrouted TO one or more of the following:

    UI63973 UI63974

Modules/Macros

  • CSQVSRRX CSQVSRX  CSQWDSD0 CSQWDSDM
    

Fix information

  • Fixed component name

    IBM MQ Z/OS V9

  • Fixed component ID

    5655MQ900

Applicable component levels

  • R000 PSY UI63973

       UP19/07/24 P F907

  • R100 PSY UI63974

       UP19/07/23 P F907

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
01 August 2019