IBM Support

IT38764: RDQM configured for both HA and DR is unable to run on any node after frequent connection interruptions


APAR status

  • Closed as program error.

Error description

  • After frequent queue manager restarts caused by network
    interruptions, the replicated data queue manager (RDQM) goes into
    a failed state and does not run on any node.
    The DRBD status of the queue manager that fails to start on the
    Primary node looks like this:
    
    qm1 role:Primary
      disk:UpToDate
      vm05.ibm.com role:Secondary
        peer-disk:UpToDate
      mqhavm06.hursley.ibm.com role:Secondary
        peer-disk:UpToDate
    
    qm1.dr role:Secondary
      disk:Diskless quorum:no
      _remote connection:Connecting
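
The symptom above can be checked mechanically. The following is a minimal sketch, not part of the APAR, that scans text in the layout shown above for resources whose local disk is Diskless. The function name diskless_resources and the SAMPLE constant are hypothetical, and the sketch assumes that each resource starts on a non-indented line and that its local disk state appears as a "disk:" token on an indented line.

```python
SAMPLE = """
qm1 role:Primary
  disk:UpToDate
  vm05.ibm.com role:Secondary
    peer-disk:UpToDate
  mqhavm06.hursley.ibm.com role:Secondary
    peer-disk:UpToDate

qm1.dr role:Secondary
  disk:Diskless quorum:no
  _remote connection:Connecting
"""


def diskless_resources(status_text):
    """Return the names of resources whose local disk state is Diskless.

    Assumes the layout shown in this APAR: a non-indented line starts a
    resource ("<name> role:<role>") and indented lines describe it.
    """
    diskless = []
    current = None
    for line in status_text.splitlines():
        if not line.strip():
            continue  # skip blank separator lines
        if not line[0].isspace():
            # Non-indented line: start of a new resource block.
            current = line.split()[0]
        elif current and "disk:Diskless" in line.split():
            # Exact token match, so "peer-disk:Diskless" is not counted.
            diskless.append(current)
    return diskless
```

Called with the status text from this APAR (SAMPLE), the function reports only the DR resource qm1.dr, which matches the state described here.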
    

Local fix

  • Suspend the primary node, then resume it:
    
    rdqmadm -s
    rdqmadm -r
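
The two commands above could be scripted as follows. This is only a sketch: the function name run_local_fix, the injectable run parameter, and the pause between the two commands are assumptions made for illustration and are not specified by the APAR. It is intended to be run on the primary node by a user who is permitted to run rdqmadm.

```python
import subprocess
import time

# The two commands from the local fix, in the order given in the APAR.
LOCAL_FIX_COMMANDS = [["rdqmadm", "-s"], ["rdqmadm", "-r"]]


def run_local_fix(run=subprocess.run, pause_seconds=10):
    """Suspend the local node, wait, then resume it.

    pause_seconds is an assumed settling delay; the APAR does not state
    how long to wait between the two commands.
    """
    for index, command in enumerate(LOCAL_FIX_COMMANDS):
        if index > 0:
            time.sleep(pause_seconds)
        # check=True raises CalledProcessError if the command fails,
        # so the node is not resumed after a failed suspend.
        run(command, check=True)
```

Calling run_local_fix() with no arguments executes the real commands. Passing a substitute for run allows a dry run that only records what would be executed.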
    

Problem summary

  • ****************************************************************
    USERS AFFECTED:
    All MQ users who have configured RDQM for high availability (HA)
    and disaster recovery (DR).
    
    
    Platforms affected:
    Linux on x86-64
    
    ****************************************************************
    PROBLEM DESCRIPTION:
    Repeated loss and regain of quorum, which results in queue
    manager resource restarts, caused the DRBD disk state of the DR
    resource to become stuck in the Diskless state. This prevented
    the queue manager from being started on the HA Primary node.
    

Problem conclusion

  • The code has been modified to recover automatically from the
    diskless state in the above situation.
    
    ---------------------------------------------------------------
    The fix is targeted for delivery in the following PTFs:
    
    Version    Maintenance Level
    v9.2 LTS   9.2.0.5
    v9.x CD    9.2.5
    
    The latest available maintenance can be obtained from
    'WebSphere MQ Recommended Fixes'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037
    
    If the maintenance level is not yet available, information on
    its planned availability can be found in 'WebSphere MQ
    Planned Maintenance Release Dates'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
    ---------------------------------------------------------------
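
The maintenance levels above can be turned into a simple check. The sketch below is illustrative only: the function name includes_fix is hypothetical, the version string is assumed to be in dotted V.R.M.F form, and treating any release later than 9.2 as containing the fix is an assumption that this APAR does not state.

```python
def includes_fix(version):
    """Return True if a dotted MQ version string is at or above the
    maintenance levels listed for IT38764 (9.2.0.5 LTS, 9.2.5 CD)."""
    parts = tuple(int(p) for p in version.split("."))
    # Pad short strings such as "9.2.5" to four fields (V, R, M, F).
    v, r, m, f = (parts + (0, 0, 0, 0))[:4]
    if (v, r) != (9, 2):
        # Assumption: releases after 9.2 carry the fix forward;
        # releases before 9.2 do not contain it.
        return (v, r) > (9, 2)
    if m == 0:
        return f >= 5  # LTS stream: 9.2.0.x, fixed in 9.2.0.5
    return m >= 5      # CD stream: 9.2.m, fixed in 9.2.5
```

For example, includes_fix("9.2.0.4") and includes_fix("9.2.4.0") are both False, while includes_fix("9.2.0.5") and includes_fix("9.2.5") are True.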
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT38764

  • Reported component name

    MQ BASE V9.2

  • Reported component ID

    5724H7281

  • Reported release

    920

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2021-10-21

  • Closed date

    2021-12-22

  • Last modified date

    2022-01-21

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    MQ BASE V9.2

  • Fixed component ID

    5724H7281

Applicable component levels


Document Information

Modified date:
22 January 2022