IBM Support

PH33239: ABN=5C6-00C5105B OCCURS IN CSQEDSS2 DUE TO AN INCONSISTENCY BETWEEN THE CF CONTENTS AND THE SMDS SPACE MAP

A fix is available


APAR status

  • Closed as program error.

Error description

  • Two queue managers in the queue sharing group (QSG), CSQ2 and
    CSQ1, failed with the following messages:
    
    Queue manager CSQ2:
    ------------------
    CSQV086E CSQ2     QUEUE MANAGER ABNORMAL TERMINATION
    REASON=00E50702
    
    IEA794I SVC DUMP HAS CAPTURED:
      DUMPID=002 REQUESTED BY JOB (CSQ2MSTR)
      DUMP TITLE=CSQ2,ABN=5C6-00C5105B,U=SYSOPR  ,C=MQ900.900.CFM
      -CSQEDSS2,M=CSQGFRCV,LOC=CSQELPLM.CSQEDSS2+000018FA
    
    CSQEDSS2+18FA is in routine free_to_map, which attempts to free
    the SMDS block associated with a deleted message. This message
    is likely to be the one that was being processed and led to the
    5C6-00C94522 abend that occurred on another queue manager,
    which would have queued the message for recovery processing by
    CSQ2 (the owner of the SMDS containing the blocks for that
    message).
    
    Queue manager CSQ1:
    ------------------
    CSQY291E CSQWDSDM SDUMPX FAILED,
    RC=00000B08,CSQ1,ABN=5C6-00C5105B,
    LOC=CSQELPLM.CSQEDSS2+000018FA
    
    CSQV086E CSQ1     QUEUE MANAGER ABNORMAL TERMINATION
    REASON=00E50702
    
    After the queue manager restarted, it failed with:
    
    CSQE252I CSQ1 CSQEDSS4 SMDS(CSQ1) CFSTRUCT(APPLICATION1)
      data set <SMDS name> space map will be rebuilt by
      scanning the structure
    
    IXL016I CONNECTOR CSQEQSG1CSQ102 TO STRUCTURE QSG1CSQ_ADMIN
      TERMINATING:
      JOB CSQ1MSTR ASID 0133 REQUESTED DISCONNECT REASON=FAILURE.
    
    CSQV086E CSQ1     QUEUE MANAGER ABNORMAL TERMINATION
    REASON=00C94510
    
    IEA794I SVC DUMP HAS CAPTURED:
      DUMPID=006 REQUESTED BY JOB (CSQ1MSTR)
      DUMP TITLE=CSQ1,ABN=5C6-00C5105B,U=SYSOPR  ,C=MQ900.900.CFM
      -CSQEDSS2,M=CSQGFRCV,LOC=CSQELPLM.CSQEDSS2+000015E0
    
    CSQEDSS2+15E0 is in routine map_to_free.
    
    To get the CSQ1 queue manager to restart, an MQ RESET command
    was issued on another queue manager in the QSG:
      /cpf RESET CFSTRUCT(structure-name) ACTION(FAIL)
    where "cpf" is the command prefix for the queue manager, and
    "structure-name" is the name of the application structure
    listed in the CSQE252I message.
    
    Logrec shows that a 5C6-00C94522 abend
    (CSQI_LARGE_MESSAGE_ERROR) occurred on other queue managers
    (CSQ3 and CSQ4) in the QSG at the same time.
    
    The 5C6-00C94522 dumps were not available in the reported case,
    so the underlying cause of those abends cannot be determined.
    
    This APAR corrects the recovery action to try to avoid
    terminating the queue manager in this situation.
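
The dump titles quoted above carry the abend code, reason code and failing location in a regular layout, which makes them easy to pull out when comparing dumps from several queue managers. The following is a minimal Python sketch of that extraction. The helper name `parse_dump_title` and the field names (`load_module`, `csect`, and so on) are hypothetical, and the pattern is inferred only from the two titles shown in this APAR, not from a documented title format.

```python
import re

# Hypothetical pattern inferred from the dump titles quoted in this APAR.
# It is an illustration, not a documented IBM MQ title grammar.
TITLE_RE = re.compile(
    r"ABN=(?P<abend>[0-9A-F]{3})-(?P<reason>[0-9A-F]{8})"
    r".*?LOC=(?P<load_module>\w+)\.(?P<csect>\w+)\+(?P<offset>[0-9A-F]+)"
)


def parse_dump_title(title):
    """Return the abend, reason and location fields of a dump title, or None."""
    # Titles wrap across console lines, so join the pieces before matching.
    flat = "".join(part.strip() for part in title.splitlines())
    match = TITLE_RE.search(flat)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the hexadecimal CSECT offset to an integer for comparisons.
    fields["offset"] = int(fields["offset"], 16)
    return fields


csq2_title = """DUMP TITLE=CSQ2,ABN=5C6-00C5105B,U=SYSOPR  ,C=MQ900.900.CFM
  -CSQEDSS2,M=CSQGFRCV,LOC=CSQELPLM.CSQEDSS2+000018FA"""
csq2_fields = parse_dump_title(csq2_title)
```

For the CSQ2 title, the `offset` field is the integer 0x18FA, which can then be compared with the 0x15E0 offset seen in the CSQ1 restart dump.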
    

Local fix

  • N/A
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of IBM MQ for z/OS Version 9       *
    *                 Release 0 Modification 0,                    *
    *                 Release 1 Modification 0, and                *
    *                 Release 2 Modification 0.                    *
    ****************************************************************
    * PROBLEM DESCRIPTION: Abend 5C6-00C5105B occurs in CSQEDSS2   *
    *                      when a SMDS contains a damaged message, *
    *                      and is followed by abnormal queue       *
    *                      manager termination 6C6.                *
    *                                                              *
    *                      When the damaged message is found       *
    *                      during backup cfstruct or get           *
    *                      processing the abend is preceded by     *
    *                      CSQI037I and abend 5C6-00C94522 on the  *
    *                      queue manager discovering the message   *
    *                      (this can be a different queue manager  *
    *                      to the owning queue manager that        *
    *                      terminates).                            *
    *                                                              *
    *                      The 5C6-00C5105B can also occur during  *
    *                      queue manager startup, causing abnormal *
    *                      termination and preventing the queue    *
    *                      manager from starting.                  *
    *                                                              *
    *                      MQSMDS/K                                *
    ****************************************************************
    When a 'damaged' message (one where the CF and SMDS entries
    are inconsistent with each other) is identified during get
    processing, the queue manager that finds it abends with
    5C6-00C94522 and attempts to handle the damaged message.
    
    If the message is persistent the structure will be failed,
    allowing any persistent messages to be recovered from the logs.
    
    If the message is non-persistent, it is deleted. However, an
    error in this deletion processing causes the SMDS blocks to
    be released twice. The owning queue manager (that is, the
    queue manager where the message was put to the queue) detects
    this, abends with 5C6-00C5105B, and terminates.
    
    If a queue manager terminates while any such 'damaged' messages
    exist on its SMDS, abend 5C6-00C5105B occurs during startup
    while the space map is being rebuilt, so startup fails and
    the queue manager terminates again.
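
The double release described above can be pictured with a toy space map that refuses to free a block that is already free. This is only an illustrative Python sketch: the class `SpaceMap`, the exception `DoubleFreeError` and the one-flag-per-block layout are assumptions made for the example and do not describe the actual CSQEDSS2 data structures.

```python
class DoubleFreeError(Exception):
    """Raised when a block that is already free is released again."""


class SpaceMap:
    """Toy stand-in for a space map: one in-use flag per block (assumed layout)."""

    def __init__(self, block_count):
        self.in_use = [False] * block_count

    def allocate(self, blocks):
        # Check every block first so a failed call leaves the map unchanged.
        for block in blocks:
            if self.in_use[block]:
                raise ValueError(f"block {block} is already allocated")
        for block in blocks:
            self.in_use[block] = True

    def free(self, blocks):
        # A second release of the same blocks is detected here, which is the
        # kind of inconsistency the owning queue manager reports.
        for block in blocks:
            if not self.in_use[block]:
                raise DoubleFreeError(f"block {block} is already free")
        for block in blocks:
            self.in_use[block] = False


space_map = SpaceMap(8)
space_map.allocate([3, 4])   # message data written to blocks 3 and 4
space_map.free([3, 4])       # first release succeeds
try:
    space_map.free([3, 4])   # second release of the same blocks
    second_free_detected = False
except DoubleFreeError:
    second_free_detected = True
```

In this sketch the second `free` call raises instead of silently corrupting the map, which mirrors why the owner of the SMDS is the queue manager that detects the problem.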
    

Problem conclusion

  • CSQIMGES is changed so that, when it detects a damaged
    non-persistent message and abends 5C6-00C94522, it deletes only
    the CF entry. This prevents the queue manager that owns the
    message from abending 5C6-00C5105B and terminating abnormally.
    
    CSQEDSS4 is changed to retry 5C6-00C5105B abends (although a
    dump will still be captured) when rebuilding the SMDS space map,
    causing the SMDS to be marked as failed rather than terminating
    the queue manager. The structure can then be failed and
    recovered, restoring any persistent messages from the logs.
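
The change in rebuild behaviour can be sketched as "catch the inconsistency, record diagnostics, mark the data set failed, keep running". The Python below is a hedged illustration of that control flow only. The function names, the `"AVAILABLE"`/`"FAILED"` states and the specific inconsistency used (two messages claiming the same block) are all assumptions for the example; the APAR does not describe the internal checks CSQEDSS4 performs.

```python
class SpaceMapInconsistency(Exception):
    """Stands in for the 5C6-00C5105B condition in this sketch."""


def rebuild_space_map(entries, block_count):
    """Rebuild block ownership from (message_id, blocks) pairs."""
    owner = [None] * block_count
    for message_id, blocks in entries:
        for block in blocks:
            if owner[block] is not None:
                # Assumed inconsistency: two messages claim the same block.
                raise SpaceMapInconsistency(
                    f"block {block} claimed by {owner[block]} and {message_id}"
                )
            owner[block] = message_id
    return owner


def start_smds(entries, block_count):
    """Return (state, detail) instead of letting the error end the process."""
    try:
        return "AVAILABLE", rebuild_space_map(entries, block_count)
    except SpaceMapInconsistency as exc:
        # The diagnostics string stands in for the dump that is still captured.
        return "FAILED", str(exc)


good_state, good_map = start_smds([("m1", [0, 1]), ("m2", [2])], 4)
bad_state, bad_detail = start_smds([("m1", [0, 1]), ("m2", [1])], 4)
```

With consistent entries the sketch reports the data set as available. With the conflicting entries it reports it as failed and returns the diagnostic text, and the caller continues rather than terminating.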
    

Temporary fix

Comments

APAR Information

  • APAR number

    PH33239

  • Reported component name

    IBM MQ Z/OS V9

  • Reported component ID

    5655MQ900

  • Reported release

    000

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2021-01-11

  • Closed date

    2021-06-09

  • Last modified date

    2021-08-02

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UI75778 UI75779 UI75780

Modules/Macros

  • CSQEDSS4 CSQIMGES
    

Fix information

  • Fixed component name

    IBM MQ Z/OS V9

  • Fixed component ID

    5655MQ900

Applicable component levels

  • R000 PSY UI75780

       UP21/07/19 P F107 ¢

  • R100 PSY UI75779

       UP21/07/19 P F107 ¢

  • R200 PSY UI75778

       UP21/07/19 P F107 ¢

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.


Document Information

Modified date:
03 August 2021