IBM Support

PM76041: AFTER A CANCEL OF RRS, WEBSPHERE MQ RETURNS MESSAGE CSQ5026E. DETECTION OF RRS FAILURE TAKES LONGER THAN EXPECTED

A fix is available

APAR status

  • Closed as program error.

Error description

  • After RRS is canceled, MQ issues message
    CSQ5026E UNABLE TO ACCESS DB2, RRS IS NOT AVAILABLE.
    DIS CFSTATUS returns CSQM153E DB2 NOT AVAILABLE.
    CSQX483E DB2 NOT AVAILABLE is generated multiple times.
    .
    
    According to Level 3, when RRS is canceled there are two ways
    in which MQ's DB2 tasks can detect this:
     1) An MQ exit is driven when RRS has terminated. A flag is set
        to record that RRS is not available. On the master DB2
        thread in MQ a monitor task runs every 5 seconds to check
        the state of the DB2 connection and of other DB2 tasks. When
        this task is run, if the flag is on to show that RRS is
        unavailable then the DB2 tasks are disconnected from DB2 and
        attempt to reconnect.
        When RRS starts again, the flag is reset.
        If RRS stops and restarts within 5 seconds then the monitor
        task may not have run while the flag is on, and the DB2
        tasks do not notice that RRS has been restarted.
     2) If a DB2 request processed on one of the DB2 worker tasks
        receives a return code which indicates that RRS is
        unavailable, the worker task notifies the monitor task to
        disconnect and reconnect.
    .
    The exact return code given by DB2 depends on the state of the
    connection and whether RRS has restarted. In some cases the
    return code given does not indicate an RRS error so MQ does
    not detect that the DB2 connection needs to be re-created.
    This APAR has been taken to determine whether this type of
    failure detection can be made more reliable.
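
The timing window in detection method 1 can be illustrated with a small model. This is only a sketch of the behavior described above, not MQ code: the class `PollingMonitor`, the function `simulate`, and the whole-second time steps are hypothetical, and the monitor is assumed to run at t = 0, 5, 10, and so on.

```python
from dataclasses import dataclass, field

MONITOR_INTERVAL = 5  # seconds between monitor runs, per the description


@dataclass
class PollingMonitor:
    """Toy model: a flag that is sampled by a periodic monitor task."""
    rrs_unavailable: bool = False              # set by the modelled RRS exit
    reconnects: list = field(default_factory=list)

    def rrs_stopped(self):
        self.rrs_unavailable = True

    def rrs_started(self):
        self.rrs_unavailable = False           # flag is reset when RRS returns

    def monitor_tick(self, now):
        # The monitor only sees the value of the flag at the moment it runs.
        if self.rrs_unavailable:
            self.reconnects.append(now)        # disconnect and try to reconnect


def simulate(stop_at, start_at, duration=20):
    """Step through `duration` seconds and return the reconnect times."""
    monitor = PollingMonitor()
    for now in range(duration + 1):
        if now == stop_at:
            monitor.rrs_stopped()
        if now == start_at:
            monitor.rrs_started()
        if now % MONITOR_INTERVAL == 0:
            monitor.monitor_tick(now)
    return monitor.reconnects


# RRS is down from t=6 to t=9, between the monitor runs at t=5 and t=10.
print(simulate(stop_at=6, start_at=9))     # -> []
# RRS is down from t=6 to t=12, so the monitor run at t=10 sees the flag.
print(simulate(stop_at=6, start_at=12))    # -> [10]
```

In the first case no reconnect is ever attempted, which corresponds to the situation where the DB2 tasks keep using a stale connection after RRS has restarted.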
    

Local fix

  • N/A
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of WebSphere MQ for z/OS version 7 *
    *                 Release 0 Modification 1 and Release 1       *
    *                 Modification 0 with Queue Sharing Groups.    *
    ****************************************************************
    * PROBLEM DESCRIPTION: After a failure of RRS, the DISPLAY     *
    *                      CFSTATUS command can fail with message  *
    *                      CSQM153E DB2 NOT AVAILABLE.             *
    *                      This error may continue even after RRS  *
    *                      has restarted.                          *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    When RRS abends, the DB2 connections used by MQ to access
    QSG-related information may become stale, resulting in all DB2
    requests failing.
    
    Normally, MQ detects this situation and reconnects to DB2,
    as indicated by messages CSQ5026E, CSQ5019I and CSQ5001I. In
    some cases MQ does not detect the failure of RRS promptly,
    so the reconnect does not occur immediately. The reconnect may
    happen after an extended period or may not happen at all.
    
    While in this state, requests for QSG data stored in MQ's DB2
    tables will fail, reporting DB2 errors. This can impact commands
    and applications accessing shared queues and CF structures.
    
    For example:
    
    DISPLAY CFSTATUS(*)
    fails with
    CSQM153E xxxx CSQMDSCF DB2 NOT AVAILABLE
    
    DISPLAY QLOCAL(*) QSGDISP(SHARED)
    fails with
    CSQM294I xxxx CSQMDRTC CANNOT GET INFORMATION FROM DB2
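
A queue manager log could be checked for this state along the following lines. This is a rough sketch under stated assumptions: the function `scan` is hypothetical, the log is assumed to be available as plain text lines, and CSQ5001I appearing after CSQ5026E is assumed to mark a completed reconnect (the summary names the messages but does not spell out their order). The CSQ5001I sample line omits the message text because it is not quoted in this APAR.

```python
import re

RRS_LOST = "CSQ5026E"
RECONNECT_DONE = "CSQ5001I"    # assumption: marks the completed reconnect
DB2_ERRORS = {"CSQM153E", "CSQM294I", "CSQX483E"}

# Matches message IDs such as CSQ5026E or CSQM153E, but not CSECT names
# such as CSQMDSCF, which have no three-digit number.
MSG_ID = re.compile(r"\bCSQ[0-9A-Z]\d{3}[EI]\b")


def scan(lines):
    """Return (waiting, errors).

    waiting: True if the last CSQ5026E has not been followed by CSQ5001I.
    errors:  DB2 error messages counted after the last CSQ5026E while no
             reconnect had been seen.
    """
    waiting = False
    errors = 0
    for line in lines:
        for msg in MSG_ID.findall(line):
            if msg == RRS_LOST:
                waiting, errors = True, 0
            elif msg == RECONNECT_DONE:
                waiting = False
            elif waiting and msg in DB2_ERRORS:
                errors += 1
    return waiting, errors


sample = [
    "CSQ5026E UNABLE TO ACCESS DB2, RRS IS NOT AVAILABLE",
    "CSQM153E xxxx CSQMDSCF DB2 NOT AVAILABLE",
    "CSQM294I xxxx CSQMDRTC CANNOT GET INFORMATION FROM DB2",
]
print(scan(sample))                  # -> (True, 2)
print(scan(sample + ["CSQ5001I"]))   # -> (False, 2)
```

A result of `(True, n)` with a growing `n` would be consistent with the stale-connection state described above.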
    

Problem conclusion

  • Changes have been made to ensure that when MQ detects an RRS
    failure, the DB2 tasks in MQ are always notified so that DB2
    reconnect processing is performed promptly.
    010Y
    100Y
    CSQARIB
    CSQ3RRSI
    CSQ3RRSM
    CSQ3RRSX
    CSQ5CONN
    CSQ5MONR
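
The APAR does not say how the notification was made reliable. As one illustrative possibility only, a notification that is latched until the monitor handles it cannot be lost when RRS restarts between monitor runs. The names `LatchedMonitor` and `simulate` are hypothetical, and the model ignores whether a reconnect attempt succeeds while RRS is still down.

```python
MONITOR_INTERVAL = 5  # seconds between monitor runs


class LatchedMonitor:
    """Toy model: each RRS failure stays pending until the monitor handles it."""

    def __init__(self):
        self.failures_signalled = 0   # incremented on every RRS termination
        self.failures_handled = 0     # last failure count the monitor acted on
        self.reconnects = []

    def rrs_stopped(self):
        self.failures_signalled += 1

    def rrs_started(self):
        pass                          # a restart does not erase the pending failure

    def monitor_tick(self, now):
        if self.failures_handled != self.failures_signalled:
            self.failures_handled = self.failures_signalled
            self.reconnects.append(now)   # disconnect and try to reconnect


def simulate(stop_at, start_at, duration=20):
    """Step through `duration` seconds and return the reconnect times."""
    monitor = LatchedMonitor()
    for now in range(duration + 1):
        if now == stop_at:
            monitor.rrs_stopped()
        if now == start_at:
            monitor.rrs_started()
        if now % MONITOR_INTERVAL == 0:
            monitor.monitor_tick(now)
    return monitor.reconnects


# RRS is down only from t=6 to t=9, yet the monitor run at t=10 still reacts.
print(simulate(stop_at=6, start_at=9))     # -> [10]
print(simulate(stop_at=6, start_at=12))    # -> [10]
```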
    

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

APAR Information

  • APAR number

    PM76041

  • Reported component name

    WMQ Z/OS V7

  • Reported component ID

    5655R3600

  • Reported release

    010

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2012-10-29

  • Closed date

    2012-12-13

  • Last modified date

    2013-02-04

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UK90289 UK90290

Modules/Macros

  • CSQARIB  CSQ3RRSI CSQ3RRSM CSQ3RRSX CSQ5CONN
    CSQ5MONR
    

Fix information

  • Fixed component name

    WMQ Z/OS V7

  • Fixed component ID

    5655R3600

Applicable component levels

  • R010 PSY UK90289

       UP13/01/16 P F301

  • R100 PSY UK90290

       UP13/01/16 P F301

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
04 February 2013