IBM Support

PM73428: QUEUE-MANAGERS HUNG IN A QSG.

A fix is available

APAR status

  • Closed as program error.

Error description

  • The problem occurs when a serialized application (e.g. a
    shared channel) puts to a topic (in this case via a qalias)
    which has a subscriber using a shared queue. If there are
    multiple messages put in the same unit of work, then there
    will be multiple eSAL control blocks added to the CSQ_ADMIN
    structure (this is incorrect) and only one of these will be
    deleted when the unit of work is committed.
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of WebSphere MQ for z/OS Version 7 *
    *                 Release 0 Modification 1 and Release 1       *
    *                 Modification 0.                              *
    ****************************************************************
    * PROBLEM DESCRIPTION: A serialised application (for example,  *
    *                      a shared receiver channel) loops while  *
    *                      accessing the Admin structure due to    *
    *                      orphaned SALE control blocks.           *
    *                      The application will experience high    *
    *                      cpu, and queue managers in the queue    *
    *                      sharing group can hang.                 *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    A SALE control block is added to the admin structure for each
    queue used for put/get by a serialized application. When the
    queue is the destination for a subscription, the put takes place
    within a nested unit of work, and so any SALE is associated with
    the nested uow rather than the application's uow. When the work
    performed under the nested uow completes successfully, any
    resources associated with it should be moved to be associated
    with the application's uow. This does not occur correctly for
    the SALE control blocks, resulting in the SALEs being orphaned.
    
    If CSQESAPP is called to check for existing SALEs, it will
    attempt to read the SALEs from the admin structure. However,
    if there are more SALEs with any particular key than can
    fit in the buffer (for example, if there are multiple orphaned
    SALEs due to publications to shared queues), an error in the
    retry logic leads to CSQESAPP looping indefinitely.
    The loop occurs while a latch is held on the application's ethr,
    causing any other tasks requiring this latch to hang. This can
    lead to checkpoint processing hanging.
    
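The looping condition described above can be illustrated with a minimal Python sketch. It is not the CSQESAPP code, and the APAR does not say exactly how the retry logic failed. The names `AdminStructure`, `read_batch` and `read_all` are hypothetical, and the admin structure is modelled as a plain dictionary. The sketch only shows the general requirement: when more entries share a key than fit in the buffer, each retry must resume from where the previous read stopped.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for the admin structure: key -> list of entries.
AdminStructure = Dict[str, List[str]]


def read_batch(structure: AdminStructure, key: str, start: int,
               buffer_size: int) -> Tuple[List[str], Optional[int]]:
    """Return at most buffer_size entries for key, plus the position to
    resume from, or None when nothing is left to read."""
    entries = structure.get(key, [])
    batch = entries[start:start + buffer_size]
    next_start = start + len(batch)
    return batch, (next_start if next_start < len(entries) else None)


def read_all(structure: AdminStructure, key: str,
             buffer_size: int) -> List[str]:
    """Read every entry for key in buffer-sized batches.

    The loop terminates because each retry resumes from the position
    returned by the previous read. A retry that did not advance (for
    example, one that always restarted from position 0) would never
    finish once the entries outnumbered the buffer.
    """
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    result: List[str] = []
    start: Optional[int] = 0
    while start is not None:
        batch, start = read_batch(structure, key, start, buffer_size)
        result.extend(batch)
    return result
```

With five entries under one key and a buffer of two, `read_all` performs three reads and returns all five entries.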

Problem conclusion

  • CSQERSAV is changed to correctly change any SALEs associated
    with a publish/subscribe nested unit of work to be associated
    with the application unit of work, and so prevent the SALEs
    being orphaned.
    
    CSQESAPP is changed to correct the retry logic so that the loop
    will no longer occur when there are more SALEs with the same
    key than will fit in its buffer.
    010Y
    100Y
    CSQERSAV
    CSQESAPP
    CSQE197M
    
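The re-association that the CSQERSAV change is described as performing can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the product code: the `UnitOfWork` class, its fields and its method names are all hypothetical, and a SALE is represented simply by the name of the queue it covers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UnitOfWork:
    """Hypothetical model of a unit of work that owns SALE entries."""
    name: str
    parent: Optional["UnitOfWork"] = None
    sales: List[str] = field(default_factory=list)  # queue names with a SALE

    def add_sale(self, queue: str) -> None:
        # One SALE per queue for this unit of work, not one per message.
        if queue not in self.sales:
            self.sales.append(queue)

    def complete_nested(self) -> None:
        """On success of a nested uow, hand its SALEs to the parent uow."""
        if self.parent is None:
            raise ValueError("not a nested unit of work")
        for queue in self.sales:
            self.parent.add_sale(queue)
        self.sales.clear()

    def commit(self) -> List[str]:
        """Release every SALE owned by this uow and return what was released."""
        released, self.sales = self.sales, []
        return released
```

In this model, three publications to the same shared queue within one application unit of work each run under their own nested unit of work. After each nested unit of work completes, the application unit of work holds a single SALE for that queue, and committing it releases that SALE so that nothing is left orphaned.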

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

APAR Information

  • APAR number

    PM73428

  • Reported component name

    WMQ Z/OS V7

  • Reported component ID

    5655R3600

  • Reported release

    010

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2012-09-21

  • Closed date

    2012-11-29

  • Last modified date

    2013-02-04

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UK83853 UK83854

Modules/Macros

  • CSQERSAV CSQESAPP CSQE197M
    

Fix information

  • Fixed component name

    WMQ Z/OS V7

  • Fixed component ID

    5655R3600

Applicable component levels

  • R010 PSY UK83853

       UP13/01/16 P F301

  • R100 PSY UK83854

       UP13/01/16 P F301

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
04 February 2013