Cluster partitioning

Partitioning, also called node isolation, occurs when a network or network interface controller (NIC) failure isolates cluster nodes from each other.

When a PowerHA® SystemMirror® node stops receiving network traffic from another node, it assumes that the other node has failed. Depending on your PowerHA SystemMirror configuration, the node might begin acquiring disks from the failed node and making applications and IP labels available. If the failed node is actually still up, data corruption might occur when the disks are taken from it. If the network becomes available again, PowerHA SystemMirror stops one of the nodes to prevent further disk contention and duplicate IP addresses on the network.

The PowerHA SystemMirror heartbeat mechanism relies on the IP subsystem and the network infrastructure. Therefore, if the network or a node is congested, the IP subsystem can silently discard heartbeats, and a healthy node can appear to have failed. Attempts are made to adjust the monitoring characteristics to account for network congestion and prevent cluster partitioning.
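
The following Python sketch illustrates why discarded heartbeats can lead to partitioning. It is a simplified model and not the PowerHA SystemMirror implementation: the names `PeerMonitor`, `failure_timeout`, and `simulate` are hypothetical, and the timeout values are chosen only for illustration. The remote node stays healthy for the whole run; only heartbeat delivery varies.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PeerMonitor:
    """Toy failure detector (hypothetical, for illustration only)."""

    failure_timeout: float  # assumed: seconds of silence tolerated
    last_heard: float = 0.0

    def record_heartbeat(self, now: float) -> None:
        self.last_heard = now

    def peer_presumed_failed(self, now: float) -> bool:
        # The monitor sees only silence. It cannot tell whether the peer
        # stopped or the heartbeats were discarded on the way.
        return now - self.last_heard > self.failure_timeout


def simulate(failure_timeout: float, delivered: list[bool],
             interval: float = 1.0) -> bool:
    """Return True if the healthy peer is ever presumed failed.

    delivered[i] states whether the heartbeat sent at tick i + 1 arrives.
    """
    monitor = PeerMonitor(failure_timeout=failure_timeout)
    for tick, arrived in enumerate(delivered, start=1):
        now = tick * interval
        if arrived:
            monitor.record_heartbeat(now)
        if monitor.peer_presumed_failed(now):
            return True
    return False


# Congestion discards three consecutive heartbeats from a healthy peer.
congested = [True, True, False, False, False, True, True]

strict = simulate(failure_timeout=2.0, delivered=congested)
tolerant = simulate(failure_timeout=5.0, delivered=congested)

print("strict monitor presumes failure:", strict)
print("tolerant monitor presumes failure:", tolerant)
```

In this model, the stricter setting presumes a failure during the congested period even though the peer is still up, while the more tolerant setting does not. The trade-off is that a more tolerant setting also takes longer to detect a node that has actually failed.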