IBM Support

PM88682: INCONSISTENCY BETWEEN THE CF AND DB2 AS REPORTED BY CSQI033E WHICH CAUSES STRUCTURE TO BE MARKED AS FAILED.

A fix is available

APAR status

  • Closed as unreproducible in next release.

Error description

  • CSQI033E was issued because a BLOB was missing from DB2 for a
    message put to a queue. The message was persistent and was being
    put by a receiver channel connected to the queue manager on
    which CSQI033E was issued. At the time the message was being
    put, the queue manager ran out of active log data sets
    (CSQJ111A) because of problems allocating archive log data sets
    (virtual tape issues in this case). Put processing was therefore
    suspended after the BLOB had been written to DB2 and the IRH7
    to the CF structure, because it could not continue until the
    log data had been written to the active log. While put
    processing was suspended, the CHIN was cancelled, so recovery
    processing tried to back out the put of the message. The BLOB
    was deleted from DB2, but during the processing for deleting the
    IRH7 from the CF structure, CSQETHDP issued abend 5C6-00C5101D
    (because the TROP and TRQS had already been deleted, as the
    MQPUT was out of syncpoint). This left the CF structure and DB2
    in an inconsistent state, which resulted in the subsequent
    CSQI033E and the structure being marked as failed.
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of WebSphere MQ for z/OS Version 7 *
    *                 Release 0 Modification 1.                    *
    ****************************************************************
    * PROBLEM DESCRIPTION: Abend 5C6-00C5101D in CSQETHDP occurs   *
    *                      when an application is canceled while   *
    *                      putting a large (>511k) persistent      *
    *                      message to a shared queue out of        *
    *                      syncpoint.                              *
    *                                                              *
    *                      A subsequent attempt to get this        *
    *                      message leads to abend 5C6-00C9FEEE     *
    *                      occurring in CSQIMGES. This also causes *
    *                      CSQI033E to be issued, reporting an     *
    *                      inconsistency between DB2 and the CF,   *
    *                      and causes the application structure    *
    *                      containing the shared queue to be       *
    *                      failed.                                 *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    During the put of a large DB2 message, CSQEMPUT was called to
    put the last BLOB segment and CF list entry for the message.
    After the CF list entry was successfully written, the TROP
    representing the operation was deleted (because the message was
    put out of syncpoint), the corresponding TRQS were freed, and
    the log entries were then forced to the logs.
    While the log force was in progress, the application was
    canceled and CSQIMPUS's prr was called.
    The prr referenced the TRQS to determine what recovery actions
    were required and, because the TRQS had by then been reused,
    it incorrectly determined that the DB2 BLOBs should be deleted
    and the TROP released.
    The attempt to free the TROP for the second time led to the
    5C6-00C5101D abend occurring.
    After the prr had completed, the message was left in an
    inconsistent state, with the CF list entry still existing
    and the corresponding data in DB2 deleted.
    When an attempt is made to get the message, this inconsistency
    causes the subsequent 5C6-00C9FEEE abend in CSQIMGES and
    causes the structure to be failed.
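
The sequence described above can be modelled with a small sketch. This is not MQ code: the names Trqs, Pool and simulate_cancelled_put are hypothetical, and the one-slot pool is an assumption used only to show how a freed and reused tracking block can mislead a recovery routine that still holds a reference to it.

```python
from dataclasses import dataclass


@dataclass
class Trqs:
    """Hypothetical stand-in for a TRQS tracking block."""
    owner: str
    in_progress: bool = True


class Pool:
    """One-slot pool: a freed block is handed straight back to the next caller."""

    def __init__(self):
        self._slot = Trqs(owner="", in_progress=False)

    def allocate(self, owner):
        self._slot.owner = owner
        self._slot.in_progress = True
        return self._slot

    def free(self, block):
        block.in_progress = False


def simulate_cancelled_put():
    pool = Pool()
    db2_blobs = {"msg1"}    # BLOB segments already written to DB2
    cf_entries = {"msg1"}   # CF list entry already written
    trop_live = True
    events = []

    trqs = pool.allocate("put-msg1")

    # Put is out of syncpoint: the TROP is deleted and the TRQS freed
    # before the log force completes.
    trop_live = False
    pool.free(trqs)

    # Assumption: other work reuses the block while the log force is pending.
    pool.allocate("other-work")

    # Cancel arrives; the recovery routine consults its stale reference.
    if trqs.in_progress:
        db2_blobs.discard("msg1")          # BLOB deleted in error
        if not trop_live:
            events.append("abend: TROP released twice")
        else:
            trop_live = False

    return db2_blobs, cf_entries, events
```

In this model the CF entry for msg1 survives while its DB2 data is gone, which corresponds to the mismatch that CSQI033E reports.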
    

Problem conclusion

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

  • CSQIMPUS's prr is corrected so that it no longer attempts to use
    the TRQS after it has been freed. This prevents it from
    incorrectly deleting the DB2 BLOBs and releasing the TROP for a
    second time in this situation.
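
The APAR does not describe how the correction is implemented. As a minimal sketch of one way a recovery routine can avoid touching a freed block, the hypothetical PutContext below drops its reference at the moment the block is freed, and the hypothetical recover function treats a missing reference as "nothing to back out". All names here are assumptions for illustration only.

```python
class PutContext:
    """Hypothetical per-put state that a recovery routine can inspect."""

    def __init__(self, trqs):
        self.trqs = trqs        # reference to the tracking block
        self.trop_live = True

    def complete_out_of_syncpoint(self):
        # The TROP is deleted and the tracking block freed; drop the
        # reference so that recovery can never follow it afterwards.
        self.trop_live = False
        self.trqs = None


def recover(ctx, db2_blobs, msg_id):
    """Back out the put only if the tracking block is still owned."""
    if ctx.trqs is None:
        # The put had already completed: leave the DB2 data and TROP alone.
        return "no-backout"
    db2_blobs.discard(msg_id)
    ctx.trop_live = False
    ctx.trqs = None
    return "backed-out"
```

With this arrangement, a cancel that arrives after the out-of-syncpoint put has completed leaves the DB2 data in place, while a cancel that arrives earlier still backs the put out.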
    

APAR Information

  • APAR number

    PM88682

  • Reported component name

    WMQ Z/OS V7

  • Reported component ID

    5655R3600

  • Reported release

    010

  • Status

    CLOSED UR1

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2013-05-08

  • Closed date

    2013-05-17

  • Last modified date

    2013-07-03

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UK94393

Modules/Macros

  • CSQEMPU1 CSQIMPUS
    

Fix information

  • Fixed component name

    WMQ Z/OS V7

  • Fixed component ID

    5655R3600

Applicable component levels

  • R010 PSY UK94393

       UP13/06/14 P F306

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
03 July 2013