IBM Support

IT37633: RDQM does not failover as expected when the Primary node is rebooted or when the node loses connection to its peers


APAR status

  • Closed as program error.

Error description

  • When a queue manager operating within an RDQM HA cluster is
    failed over, the failover operation fails if the queue manager
    file system cannot be unmounted in a timely manner. The queue
    manager then remains in a stopped state on all nodes.
    

Local fix

Problem summary

  • ****************************************************************
    USERS AFFECTED:
    All IBM MQ users using the RDQM feature.
    
    
    Platforms affected:
    Linux on x86-64
    
    ****************************************************************
    PROBLEM DESCRIPTION:
    As part of failover, each of the cluster resources is stopped on
    the primary node. The queue manager file system is unmounted
    when the file system resource for the queue manager is stopped.
    If the file system could not be unmounted, the fuser command
    was invoked to identify which process was holding the file
    system. If the fuser call did not return in a timely manner,
    the cluster agent marked the stop of the file system resource
    as failed. As a result, the queue manager failover failed.
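
    The failure mode described above can be illustrated with a
    minimal sketch. It is not the RDQM resource agent code; the
    helper name run_check, the status strings, and the timeout
    values are assumptions made for illustration only. It shows how
    a diagnostic command that does not return within a time limit
    ends up reported as a timeout rather than as a result.

```python
import subprocess
import sys


def run_check(cmd, timeout_s):
    """Run a diagnostic command with a time limit.

    Returns (status, output), where status is "ok", "failed" or "timeout".
    Hypothetical helper: it only illustrates how a slow diagnostic call
    can turn into a failure of the operation that is waiting on it.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before this is raised.
        return "timeout", ""
    status = "ok" if result.returncode == 0 else "failed"
    return status, result.stdout.strip()


if __name__ == "__main__":
    # Stand-in for a diagnostic that hangs: sleeps longer than the limit.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_check(slow, 0.5)[0])
```

    On a real node the command might be something like
    fuser -vm <mount point>, where the mount point is whatever the
    queue manager file system uses on that system.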
    

Problem conclusion

  • The RDQM logic has been modified to use an alternative mechanism
    to verify usage of the queue manager file system during
    failover. Additional diagnostics have been added to log
    information about any process that prevents the queue manager
    file system from being unmounted.
    
    ---------------------------------------------------------------
    The fix is targeted for delivery in the following PTFs:
    
    Version    Maintenance Level
    v9.2 LTS   9.2.0.7
    v9.x CD    9.2.4
    
    The latest available maintenance can be obtained from
    'WebSphere MQ Recommended Fixes'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037
    
    If the maintenance level is not yet available, information on
    its planned availability can be found in 'WebSphere MQ
    Planned Maintenance Release Dates'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
    ---------------------------------------------------------------
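
    The APAR does not say which alternative mechanism RDQM now uses
    to verify usage of the queue manager file system. Purely as an
    assumption, one way to find the processes that keep a file
    system busy on Linux without calling fuser is to read the
    symbolic links under /proc. The sketch below does that; the
    function name find_holders and its proc_root parameter are
    hypothetical and are not part of IBM MQ.

```python
import os


def find_holders(mount_point, proc_root="/proc"):
    """List (pid, path) pairs for processes using files under mount_point.

    Checks each process's current directory and open file descriptors by
    reading the symbolic links below proc_root. Illustrative only: this is
    an assumed approach, not the mechanism implemented by the fix.
    """
    mount_point = os.path.normpath(mount_point)
    prefix = mount_point + os.sep
    holders = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are process IDs
        pid_dir = os.path.join(proc_root, entry)
        links = [os.path.join(pid_dir, "cwd")]
        fd_dir = os.path.join(pid_dir, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited, or no permission to inspect it
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue
            if target == mount_point or target.startswith(prefix):
                holders.append((int(entry), target))
    return sorted(holders)


if __name__ == "__main__":
    # Hypothetical mount point; substitute the real one for the queue manager.
    for pid, path in find_holders("/var/mqm/vols/QM1"):
        print(pid, path)
```

    Inspecting other users' processes this way normally requires
    root authority, and memory-mapped files are not covered by this
    sketch.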
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT37633

  • Reported component name

    MQ BASE V9.2

  • Reported component ID

    5724H7281

  • Reported release

    920

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2021-07-14

  • Closed date

    2022-08-23

  • Last modified date

    2022-08-23

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    MQ BASE V9.2

  • Fixed component ID

    5724H7281

Applicable component levels

  • Business Unit: IBM Software w/o TPS (BU059)

  • Product: IBM MQ (SSYHRD)

  • Platform: Platform Independent (PF025)

  • Version: 920

  • Line of Business: Automation (LOB45)

Document Information

Modified date:
23 August 2022