IBM Support

IT23692: Queue manager hang with FDCs from amqzlaa0 with ProbeIds XC130004 AQ051000 XC307100 when using async consumers


APAR status

  • Closed as program error.

Error description

  • The FDCs show a pattern similar to the following:
    
    amqzlaa0 XC130004 xehExceptionHandler    STOP                   OK
    amqzlaa0 AQ051000 aqsStartQOp            STOP_ALL               OK
    amqzlaa0 XC307100 xlsRequestMutex        xecL_W_LONG_LOCK_WAIT  OK
    
    The initial FDC shows a back trace as follows:
    
    | Program Name      :- amqzlaa0
    :
    | Process           :- <pid>
    | Process(Thread)   :- <pid-tid>
    | Thread            :- <tid>    SharedAgent
    | QueueManager      :- <qmg-name>
    | UserApp           :- FALSE
    :
    | Major Errorcode   :- STOP
    | Minor Errorcode   :- OK
    | Probe Type        :- HALT6109
    | Probe Severity    :- 1
    | Probe Description :- AMQ6109: An internal WebSphere MQ error has occurred.
    | FDCSequenceNumber :- 0
    | Arith1            :- 11 (0xb)
    | Comment1          :- SIGSEGV: invalid address permissions(0x7fa92bfff000)
    
    
    MQM Function Stack
    zlaMainThread
    zlaProcessMessage
    zlaProcessMQIRequest
    zlaMQGETM
    zsqMQGETM
    kpiMQGETM
    kqiWaitForMessage
    kqiExpiration
    kqiPut1Report
    kqiPutIt
    kqiPutMsgSegments
    apiPutMessage
    aqmPutMessage
    aqhPutMessage
    aqqWriteMsg
    aqqWriteMsgData
    aqlLogPutPart
    aqlLogPutData
    almLogIt
    hlgWriteLogRecord
    mqlWriteLogRecord
    xcsFFST
    
    If this problem occurs, the error log might also indicate:
    
    AMQ7472: Object xyz, type queue damaged.
    

Local fix

Problem summary

  • ****************************************************************
    USERS AFFECTED:
    Users of IBM MQ running asynchronous message consumers which
    encounter large expired messages and have requested report
    messages.
    
    
    Platforms affected:
    MultiPlatform
    
    ****************************************************************
    PROBLEM DESCRIPTION:
    An FDC with ProbeId XC130004 for a SIGSEGV event can occur when
    writing a report message during asynchronous handling of large
    expired messages.
    
    A typical FDC footprint is:
    
      Probe Id          :- XC130004
      Component         :- xehExceptionHandler
      Effective UserID  :- <userid> (mqm)
      Real UserID       :- <realUserid> (<realUseridName>)
      Program Name      :- amqzlaa0
      Thread            :- nn    SharedAgent
      QueueManager      :- <QMNAME>
      Major Errorcode   :- STOP
      Minor Errorcode   :- OK
      Probe Description :- AMQ6109: An internal WebSphere MQ error has occurred.
      FDCSequenceNumber :- 0
      Arith1            :- 11 (0xb)
      Comment1          :- SIGSEGV: invalid address permissions(0x7fa92bfff000)
    
      O/S Call Stack for current thread
      ...
    
      MQM Function Stack
      zlaMainThread
      zlaProcessMessage
      zlaProcessMQIRequest
      zlaMQGETM
      zsqMQGETM
      kpiMQGETM
      kqiWaitForMessage
      kqiExpiration
      kqiPut1Report
      kqiPutIt
      kqiPutMsgSegments
      apiPutMessage
      aqmPutMessage
      aqhPutMessage
      aqqWriteMsg
      aqqWriteMsgData
      aqlLogPutPart
      aqlLogPutData
      almLogIt
      hlgWriteLogRecord
      mqlWriteLogRecord
      xcsFFST
    
    Other related FDCs can also occur, such as:
    
    amqzlaa0 AQ051000 aqsStartQOp         STOP_ALL              OK
    amqzmuc0 XC307100 xlsRequestMutex     xecL_W_LONG_LOCK_WAIT OK
    
    The queue manager can become unresponsive.
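
    As a first filter when triaging a hang, the FDC summary lines can be
    checked for the three ProbeIds named in this APAR. The sketch below is
    illustrative only: the function names are hypothetical, and it assumes
    each summary line has the layout shown above (program, ProbeId,
    component, major error code, minor error code). A match is a hint, not
    a diagnosis; the MQM Function Stack of the XC130004 FDC should still be
    compared with the one listed in this APAR.

```python
import re

# ProbeIds listed in this APAR.
APAR_PROBE_IDS = {"XC130004", "AQ051000", "XC307100"}

# Assumed summary layout: <program> <ProbeId> <component> <major> [<minor>]
SUMMARY_RE = re.compile(r"^\s*(\S+)\s+([A-Z]{2}\d{6})\s+(\S+)")


def find_apar_probes(summary_lines):
    """Return {probe_id: (program, component)} for APAR ProbeIds seen."""
    found = {}
    for line in summary_lines:
        match = SUMMARY_RE.match(line)
        if match and match.group(2) in APAR_PROBE_IDS:
            program, probe_id, component = match.groups()
            # Keep the first occurrence of each ProbeId.
            found.setdefault(probe_id, (program, component))
    return found


def has_full_pattern(summary_lines):
    """True only when all three ProbeIds from the APAR are present."""
    return set(find_apar_probes(summary_lines)) == APAR_PROBE_IDS
```

    Requiring all three ProbeIds keeps the filter conservative, because
    XC130004 on its own is reported for any SIGSEGV caught by
    xehExceptionHandler.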
    

Problem conclusion

  • The IBM MQ code has been changed to correctly handle large
    expired messages processed by an asynchronous consumer.
    
    ---------------------------------------------------------------
    The fix is targeted for delivery in the following PTFs:
    
    Version    Maintenance Level
    v8.0       8.0.0.9
    v9.0 CD    9.0.5
    v9.0 LTS   9.0.0.4
    
    The latest available maintenance can be obtained from
    'WebSphere MQ Recommended Fixes'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037
    
    If the maintenance level is not yet available, information on
    its planned availability can be found in 'WebSphere MQ
    Planned Maintenance Release Dates'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
    ---------------------------------------------------------------
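
    The table above can be turned into a simple check of an installed
    level, for example the version reported by dspmqver. This is a minimal
    sketch under stated assumptions: the function name is hypothetical,
    9.0.0.x is treated as the LTS stream and 9.0.m (m >= 1) as the CD
    stream, and any release not listed in the table returns None rather
    than a guess.

```python
def includes_it23692_fix(version):
    """Return True/False for levels covered by the APAR table, else None.

    `version` is a dotted string such as "8.0.0.9" or "9.0.5".
    """
    parts = version.strip().split(".")
    if not 3 <= len(parts) <= 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unrecognised version string: {version!r}")
    # Pad a three-part CD level such as "9.0.5" to four parts.
    parts += ["0"] * (4 - len(parts))
    v, r, m, f = (int(p) for p in parts)

    if (v, r) == (8, 0):
        return (m, f) >= (0, 9)   # fixed in 8.0.0.9
    if (v, r) == (9, 0):
        if m == 0:                # assumption: 9.0.0.x is LTS
            return f >= 4         # fixed in 9.0.0.4
        return m >= 5             # assumption: 9.0.m is CD, fixed in 9.0.5
    return None                   # release not listed in this APAR
```

    For example, includes_it23692_fix("9.0.0.3") reports that the level
    predates the fix, while a release outside the table yields None so
    that the caller consults the documentation for that release instead.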
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT23692

  • Reported component name

    IBM MQ BASE MP

  • Reported component ID

    5724H7251

  • Reported release

    800

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2018-01-11

  • Closed date

    2018-01-31

  • Last modified date

    2018-01-31

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    IBM MQ BASE MP

  • Fixed component ID

    5724H7251

Applicable component levels


Document Information

Modified date:
31 January 2018