IBM Support

IT28897: IBM MQ Telemetry service fails with OutOfMemoryError after multiple MQTT V3.1.1 client takeovers

APAR status

  • Closed as program error.

Error description

  • The IBM MQ V9.0 Telemetry service receives a large number of
    connection requests, all specifying the same client identifier.
    After running for a while, the IBM MQ Telemetry service fails
    with a java.lang.OutOfMemoryError. A heap dump generated at the
    time of the OutOfMemoryError shows that:
    
    - The majority of the Java heap is taken up with a single
      MQTTServerSessionV311 object.
    - The MQTTServerSessionV311 object contains a number of
      nested MQTTServerSessionV311 objects, as shown in the simple
      diagram below:
    
      MQTTServerSessionV311
        |--> MQTTServerSessionV311
          |--> MQTTServerSessionV311
            |--> MQTTServerSessionV311
              |--> MQTTServerSessionV311
                ......
    

Local fix

Problem summary

  • ****************************************************************
    USERS AFFECTED:
    This issue affects users of the IBM MQ Telemetry service who
    have MQTT V3 and/or V3.1.1 client applications which connect to
    an IBM MQ Telemetry channel specifying:
    
    - The same client identifier.
    - And the cleanSession flag set to true.
    
    
    Platforms affected:
    AIX, Linux on Power, Linux on S390, Linux on x86, Linux on
    x86-64, Linux on zSeries, Windows
    
    ****************************************************************
    PROBLEM DESCRIPTION:
    The IBM MQ Telemetry service maintains state information for
    every MQTT V3 and V3.1.1 client application that is currently
    connected to it. If:
    
    - An MQTT client application has connected to the IBM MQ
    Telemetry service, specifying a client identifier.
    - And another MQTT client application connects to the IBM MQ
    Telemetry service specifying the same client identifier.
    
    then a "takeover event" occurs. In this scenario, the IBM MQ
    Telemetry service will disconnect the original client
    application, and then connect the new client application. As
    part of this processing, the IBM MQ Telemetry service:
    
    - Creates a new internal MQTTServerSessionV3 or
    MQTTServerSessionV311 object to store the state for the new
    client application.
    - Updates the new MQTTServerSessionV3 or MQTTServerSessionV311
    object with a reference to the object containing the state of
    the original client application.
    
    Here is a simple diagram that shows this for an MQTT V3.1.1
    client application that has been taken over by a new MQTT V3.1.1
    client application:
    
      MQTTServerSessionV311 object for the new client application
         |-----> MQTTServerSessionV311 object for the original
                 client application
    
    
    The reference to the MQTTServerSessionV3 or
    MQTTServerSessionV311 object in the object for the new client
    application would only be removed at the end of a "takeover
    event" if:
    
    - The new client application had connected with the
    cleanSession flag set to false.
    - And the original instance of the client application had also
    connected with the cleanSession flag set to false.
    
    This means that if:
    
    - Either the original instance of the client application had
    connected with the cleanSession flag set to true.
    - Or the new client application had connected with the
    cleanSession flag set to true.
    
    the reference to the MQTTServerSessionV3 or
    MQTTServerSessionV311 object for the original client application
    would remain in the object for the new client application at the
    end of a "takeover event".
    
    If multiple "takeover events" occurred for the same client
    identifier, then the Java heap would eventually fill up with
    MQTTServerSessionV3 or MQTTServerSessionV311 objects. Here is a
    simple diagram that shows the objects on the Java heap if 5
    takeover events occurred for MQTT V3.1.1 clients that connected
    using the same client identifier:
    
      MQTTServerSessionV311 object for client application 6
         |-----> MQTTServerSessionV311 object for client application 5
             |-----> MQTTServerSessionV311 object for client application 4
                 |-----> MQTTServerSessionV311 object for client application 3
                     |-----> MQTTServerSessionV311 object for client application 2
                         |-----> MQTTServerSessionV311 object for client application 1
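
The retention pattern described above can be illustrated with a small, self-contained Java sketch. It is not the IBM MQ Telemetry service code: the class TakeoverChainSketch, the nested Session class (a stand-in for MQTTServerSessionV311), the field name previous and the client identifier "sensor-1" are all assumptions made for illustration. The sketch only models the rule stated in the problem description, namely that the reference to the original session is removed only when both connections used cleanSession set to false.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model only; names are hypothetical, not IBM MQ internals.
class TakeoverChainSketch {

    /** Stand-in for an MQTTServerSessionV311 object. */
    static final class Session {
        final boolean cleanSession;
        Session previous; // assumed name for the reference to the taken-over session

        Session(boolean cleanSession) {
            this.cleanSession = cleanSession;
        }
    }

    // One current session per client identifier.
    private final Map<String, Session> sessions = new HashMap<>();

    /** Models a connect request using the pre-fix rule from the problem description. */
    Session connect(String clientId, boolean cleanSession) {
        Session incoming = new Session(cleanSession);
        Session original = sessions.put(clientId, incoming);
        if (original != null) {
            // Takeover event: the new session refers to the original one.
            incoming.previous = original;
            // The reference is dropped only if BOTH sessions used cleanSession=false.
            if (!incoming.cleanSession && !original.cleanSession) {
                incoming.previous = null;
            }
        }
        return incoming;
    }

    /** Counts the sessions reachable from the given one via 'previous'. */
    static int chainLength(Session session) {
        int count = 0;
        for (Session s = session; s != null; s = s.previous) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Six connections with the same client identifier = five takeover events.
        TakeoverChainSketch leaking = new TakeoverChainSketch();
        Session latest = null;
        for (int i = 1; i <= 6; i++) {
            latest = leaking.connect("sensor-1", true);
        }
        System.out.println(chainLength(latest)); // prints 6

        // With cleanSession=false on every connection the chain does not grow.
        TakeoverChainSketch stable = new TakeoverChainSketch();
        for (int i = 1; i <= 6; i++) {
            latest = stable.connect("sensor-1", false);
        }
        System.out.println(chainLength(latest)); // prints 1
    }
}
```

In this model every session in the chain stays reachable from the newest one, so none of them can be garbage collected, which matches the nested objects seen in the heap dump.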
    

Problem conclusion

  • The IBM MQ Telemetry service has been updated to ensure that the
    reference to the MQTTServerSessionV3 or MQTTServerSessionV311
    object for the original client application is always removed
    from the object for a new client application following a
    "takeover event". This prevents a build-up of
    MQTTServerSessionV3 and/or MQTTServerSessionV311 objects if
    multiple "takeover events" occur for the same client identifier.
    
    ---------------------------------------------------------------
    The fix is targeted for delivery in the following PTFs:
    
    Version    Maintenance Level
    v9.0 LTS   9.0.0.9
    v9.1 CD    9.1.4
    v9.1 LTS   9.1.0.4
    
    The latest available maintenance can be obtained from
    'WebSphere MQ Recommended Fixes'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037
    
    If the maintenance level is not yet available, information on
    its planned availability can be found in 'WebSphere MQ
    Planned Maintenance Release Dates'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
    ---------------------------------------------------------------
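
As a rough illustration of the behaviour described in the problem conclusion above, the following Java sketch always removes the reference to the original session once takeover processing ends, regardless of the cleanSession flags. It is a simplified model under stated assumptions, not the code shipped in the PTFs: the class TakeoverFixSketch, the nested Session class, the field name previous and the client identifier "sensor-1" are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model only; names are hypothetical, not IBM MQ internals.
class TakeoverFixSketch {

    /** Stand-in for an MQTTServerSessionV3 or MQTTServerSessionV311 object. */
    static final class Session {
        final boolean cleanSession;
        Session previous; // assumed name for the reference to the taken-over session

        Session(boolean cleanSession) {
            this.cleanSession = cleanSession;
        }
    }

    private final Map<String, Session> sessions = new HashMap<>();

    /** Models a connect request where the reference is always removed. */
    Session connect(String clientId, boolean cleanSession) {
        Session incoming = new Session(cleanSession);
        Session original = sessions.put(clientId, incoming);
        if (original != null) {
            incoming.previous = original;
            try {
                // Placeholder for takeover processing (disconnecting the
                // original client, handling any state that must carry over).
            } finally {
                // Cleared unconditionally, whatever the cleanSession flags were.
                incoming.previous = null;
            }
        }
        return incoming;
    }

    /** Counts the sessions reachable from the given one via 'previous'. */
    static int chainLength(Session session) {
        int count = 0;
        for (Session s = session; s != null; s = s.previous) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        TakeoverFixSketch service = new TakeoverFixSketch();
        Session latest = null;
        // Many takeovers for one client identifier, all with cleanSession=true.
        for (int i = 0; i < 1000; i++) {
            latest = service.connect("sensor-1", true);
        }
        System.out.println(chainLength(latest)); // prints 1
    }
}
```

Clearing the reference in a finally block is a design choice of this sketch: it keeps the older session eligible for garbage collection even if the takeover processing throws.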
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT28897

  • Reported component name

    IBM MQ BASE M/P

  • Reported component ID

    5724H7261

  • Reported release

    900

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2019-04-23

  • Closed date

    2019-10-11

  • Last modified date

    2019-10-11

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    IBM MQ BASE M/P

  • Fixed component ID

    5724H7261

Document Information

Modified date:
11 October 2019