IBM Support

IV97415: CAA: SLOW GOSSIP TRANSMISSION ON BOOT MAY CAUSE PARTITIONED CLUSTER

APPLIES TO AIX 7200-00

A fix is available

APAR status

  • Closed as program error.

Error description

  • After rebooting, one of the cluster nodes was not able
    to join the cluster environment.
    - lscluster from each node did not show the other node as
      UP.
    - After system reboot, the syslog.caa log file showed
      a delay of over 2 minutes in getting the first
      multicast gossip packet.
      When this delay occurs, the node creates its own
      cluster, ignoring the other node which is already up.
      This leads to a split-brain / partitioned cluster
      in the CAA environment.
    

Local fix

  • n/a
    

Problem summary

  • If a node is rebooted and, due to network issues, fails to
    receive a gossip packet from the other UP nodes within twice
    the node_down_delay value, it could form a cluster by itself,
    causing a split-brain (partitioned cluster).
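
    The timing condition above can be illustrated with a small Python
    sketch. It is not CAA code: the function name, the use of seconds
    as the unit, and the sample values are assumptions chosen only to
    show the "twice node_down_delay" window.

```python
def gossip_window_missed(first_gossip_delay_s, node_down_delay_s):
    """Return True when the first gossip packet arrives later than
    twice node_down_delay (hypothetical helper, units assumed to be seconds)."""
    window_s = 2 * node_down_delay_s
    return first_gossip_delay_s > window_s


# Illustrative values only: a 30 s node_down_delay gives a 60 s window.
print(gossip_window_missed(130, 30))  # first gossip after 130 s: window missed
print(gossip_window_missed(45, 30))   # first gossip after 45 s: within window
```

    A node for which this check is true is the one at risk of forming
    its own cluster on an unfixed level.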
    

Problem conclusion

  • There is a gate through which all initial clusterwide lock
    requests must pass. The gate should consider the count of
    nodes heartbeating to the repository in addition to those
    gossiping over the network. There was a hole in the gate,
    and the fix closes it.
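
    The idea behind the gate can be sketched in Python as follows.
    This is a simplified model under stated assumptions, not the
    actual CAA implementation: the function name, the
    include_repository flag, and the node names are hypothetical,
    and the real gate applies to initial clusterwide lock requests
    whose internals are not described in this APAR.

```python
def may_form_cluster_alone(gossiping, repo_heartbeating, include_repository=True):
    """Hypothetical gate: a booting node may proceed alone only if it
    sees no other node through any channel it is told to consider."""
    visible = set(gossiping)
    if include_repository:
        # Also count nodes seen heartbeating to the repository,
        # not just those heard over network gossip.
        visible |= set(repo_heartbeating)
    return len(visible) == 0


# Gossip is delayed, but nodeA is heartbeating to the repository.
print(may_form_cluster_alone(set(), {"nodeA"}, include_repository=False))  # network only
print(may_form_cluster_alone(set(), {"nodeA"}))  # network plus repository
```

    With only network gossip considered, the model lets the node
    proceed alone. Once repository heartbeats are counted, the node
    sees that nodeA is up and does not form a separate cluster.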
    

Temporary fix

Comments

APAR Information

  • APAR number

    IV97415

  • Reported component name

    AIX V7.2

  • Reported component ID

    5765CD200

  • Reported release

    720

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Submitted date

    2017-06-22

  • Closed date

    2017-06-22

  • Last modified date

    2017-09-25

  • APAR is sysrouted FROM one or more of the following:

    IV97148

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    AIX V7.2

  • Fixed component ID

    5765CD200

Applicable component levels

  • R720 PSY U874043

       UP17/09/21 I 1000

PTF to Fileset Mapping

Document Information

Modified date:
10 September 2025