IBM Support

IT30677: MQ Telemetry channel stops and generates FDC with Probe ID XR014005 when multiple MQTT clients disconnect simultaneously

APAR status

  • Closed as program error.

Error description

  • A large number of MQTT clients (more than 1000), each specifying
    a last will and testament (LWT) message, are connected to the MQ
    Telemetry server. The MQTT clients are then disconnected at the
    same time. Intermittently, this results in the MQ Telemetry
    channel associated with the MQ Telemetry server stopping, and an
    FDC being generated. The FDC contains probe identifier XR014005
    and the following "Exception cause":
    
    :--------------------------------------------------------------------:
    : Exception cause:                                                   :
    :--------------------------------------------------------------------:

    java.util.ConcurrentModificationException
        at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:920)
        at java.util.ArrayList$Itr.remove(ArrayList.java:884)
        at com.ibm.mq.MQXRService.MQTTServerSession.sendWillMessages(MQTTServerSession.java:1144)
        at com.ibm.mq.MQXRService.MQTTServerContext.close(MQTTServerContext.java:222)
        at com.ibm.mq.communications.NonBlockingConnection.closeFinal(NonBlockingConnection.java:667)
        at com.ibm.mq.communications.NonBlockingConnection.sendRemainder(NonBlockingConnection.java:338)
        at com.ibm.mq.communications.NonBlockingWorker.run(NonBlockingWorker.java:402)
        at java.lang.Thread.run(Thread.java:812)
    

Local fix

Problem summary

  • ****************************************************************
    USERS AFFECTED:
    This issue affects users of the IBM MQ Telemetry server who have
    large numbers of MQTT clients disconnecting at the same time.
    
    
    Platforms affected:
    AIX, Windows, Linux on zSeries, Linux on x86-64
    
    ****************************************************************
    PROBLEM DESCRIPTION:
    When an MQTT client connects to the MQ Telemetry server, it can
    provide a Last Will and Testament (LWT) message that will be
    published to a specified topic in the event that the MQTT client
    disconnects unexpectedly.
    
    The MQ Telemetry server maintains lists of the LWT messages that
    need to be published. Periodically, internal worker threads
    within the MQ Telemetry server will process the lists, and
    publish the LWT messages.
    
    If a large number of MQTT clients that had specified LWT
    messages disconnected at the same time, multiple worker
    threads within the MQ Telemetry server would process the
    lists of LWT messages concurrently. If two or more threads
    tried to process the same list at the same time, a
    ConcurrentModificationException could occur. When this happened,
    the MQ Telemetry channel stopped and the MQ Telemetry server
    generated an FDC containing Probe Identifier XR014005 and the
    "Exception cause" shown below:
    
    :--------------------------------------------------------------------:
    : Exception cause:                                                   :
    :--------------------------------------------------------------------:

    java.util.ConcurrentModificationException
        at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:920)
        at java.util.ArrayList$Itr.remove(ArrayList.java:884)
        at com.ibm.mq.MQXRService.MQTTServerSession.sendWillMessages(MQTTServerSession.java:1144)
        at com.ibm.mq.MQXRService.MQTTServerContext.close(MQTTServerContext.java:222)
        at com.ibm.mq.communications.NonBlockingConnection.closeFinal(NonBlockingConnection.java:667)
        at com.ibm.mq.communications.NonBlockingConnection.sendRemainder(NonBlockingConnection.java:338)
        at com.ibm.mq.communications.NonBlockingWorker.run(NonBlockingWorker.java:402)
        at java.lang.Thread.run(Thread.java:812)
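
The failure mode in the stack trace (Itr.remove calling checkForComodification) can be illustrated with a minimal sketch. This is not IBM MQ code: the class and method names are hypothetical, and the second worker thread is simulated by a direct modification of the list on the same thread so that the result is deterministic. It only shows why an ArrayList iterator rejects a remove() once the list has been changed behind its back.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; not part of IBM MQ.
class LwtIteratorDemo {

    // Returns true if Iterator.remove() fails because the list was
    // structurally modified outside the iterator.
    static boolean removeFailsAfterForeignChange() {
        List<String> willMessages = new ArrayList<>();
        willMessages.add("client-1 will");
        willMessages.add("client-2 will");

        Iterator<String> it = willMessages.iterator();
        it.next(); // this "worker" is positioned on the first entry

        // Stand-in for a second worker thread touching the same list.
        willMessages.add("client-3 will");

        try {
            it.remove(); // the iterator detects the foreign modification
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME raised: " + removeFailsAfterForeignChange());
    }
}
```

With real threads the same check fires only when the interleaving happens to line up, which is consistent with the intermittent nature of the problem described above.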
    

Problem conclusion

  • To resolve this issue, the MQ Telemetry server has been updated
    to ensure that the lists of Last Will and Testament (LWT)
    messages are thread safe. This prevents the
    ConcurrentModificationException that could otherwise occur when
    multiple threads access the lists at the same time.
    
    ---------------------------------------------------------------
    The fix is targeted for delivery in the following PTFs:
    
    Version    Maintenance Level
    v9.0 LTS   9.0.0.9
    v9.1 CD    9.1.5
    v9.1 LTS   9.1.0.5
    
    The latest available maintenance can be obtained from
    'WebSphere MQ Recommended Fixes'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037
    
    If the maintenance level is not yet available, information on
    its planned availability can be found in 'WebSphere MQ
    Planned Maintenance Release Dates'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
    ---------------------------------------------------------------
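
The APAR states that the LWT lists were made thread safe but does not say how. The sketch below shows one possible approach, under that stated assumption: holding pending messages in a ConcurrentLinkedQueue so that each worker claims an entry atomically with poll(). The class name, method name, and message strings are hypothetical and are not taken from the MQ Telemetry server.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not the actual IBM fix.
class LwtSafeDrainDemo {

    // Lets workerCount threads drain one shared queue and returns how
    // many messages were "published" (here: simply counted) in total.
    static int drainWithWorkers(int messageCount, int workerCount) {
        Queue<String> pending = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < messageCount; i++) {
            pending.add("will-" + i);
        }

        AtomicInteger published = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(workerCount);
        for (int w = 0; w < workerCount; w++) {
            workers.execute(() -> {
                // poll() removes and returns the head atomically, or
                // returns null once the queue is empty.
                while (pending.poll() != null) {
                    published.incrementAndGet();
                }
            });
        }

        workers.shutdown();
        try {
            if (!workers.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return published.get();
    }

    public static void main(String[] args) {
        System.out.println("Published: " + drainWithWorkers(2000, 8));
    }
}
```

Because no iterator is shared between threads, no ConcurrentModificationException can arise, and each message is handled by exactly one worker. Guarding an ArrayList with a synchronized block would be an equally valid alternative.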
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT30677

  • Reported component name

    IBM MQ BASE M/P

  • Reported component ID

    5724H7261

  • Reported release

    900

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2019-10-23

  • Closed date

    2020-01-13

  • Last modified date

    2020-01-13

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    IBM MQ BASE M/P

  • Fixed component ID

    5724H7261

Document Information

Modified date:
13 January 2020