Heartbeating over TCP/IP and storage area networks

A heartbeat is a type of communication packet that is sent between nodes. Heartbeats are used to monitor the health of the nodes, networks, and network interfaces, and to prevent cluster partitioning.

In order for a PowerHA® SystemMirror® cluster to recognize and respond to failures, it must continually check the health of the cluster. Some of these checks are provided by the heartbeat function.

Each cluster node sends heartbeat messages at specific intervals to other cluster nodes, and expects to receive heartbeat messages from those nodes at specific intervals. If the expected messages stop arriving, PowerHA SystemMirror recognizes that a failure has occurred. Heartbeats can be sent over:

  • TCP/IP networks
  • Storage Area Networks
  • Cluster repository disk
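The send-and-expect behavior described above can be illustrated with a minimal sketch. This is not how CAA or PowerHA SystemMirror is implemented; the class name `HeartbeatMonitor`, the node names, and the 3-second timeout are all hypothetical, chosen only to show how silence from a peer can be turned into a failure decision.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Hypothetical model: tracks the last heartbeat seen from each peer node."""

    timeout: float  # assumed value: seconds of silence before a peer counts as failed
    last_seen: dict = field(default_factory=dict)

    def record(self, node: str, now: float) -> None:
        # Called whenever a heartbeat arrives from a peer.
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> list:
        # A peer is considered failed once its silence exceeds the timeout.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=3.0)
monitor.record("nodeA", now=0.0)
monitor.record("nodeB", now=0.0)
monitor.record("nodeA", now=2.0)      # nodeA keeps sending, nodeB goes silent
print(monitor.failed_nodes(now=4.0))  # ['nodeB']
```

Timestamps are passed in explicitly rather than read from the system clock, which keeps the sketch deterministic and easy to reason about.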

Cluster Aware AIX® (CAA) uses heartbeat communication on all available TCP/IP networks and storage area networks (SAN). If the TCP/IP networks and SAN networks fail, CAA attempts to use the repository disk as an alternative heartbeat mechanism. The heartbeat path for the backup repository disk is displayed as a dpcom interface in the output of the lscluster command. If the TCP/IP networks and SAN networks are working, the lscluster -i command displays the dpcom interface as restricted.
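The fallback order in the preceding paragraph can be expressed as a small decision function. The following is only a sketch of that rule under stated assumptions: the function name `select_heartbeat_paths` and the path labels are hypothetical and do not correspond to any CAA interface or command output.

```python
def select_heartbeat_paths(tcpip_up: bool, san_up: bool, repo_disk_up: bool) -> list:
    """Hypothetical helper: return the heartbeat paths that would be in use."""
    # All available TCP/IP and SAN paths are used while any of them works.
    primary = []
    if tcpip_up:
        primary.append("tcpip")
    if san_up:
        primary.append("san")
    if primary:
        return primary
    # Only when both TCP/IP and SAN have failed is the repository disk tried.
    return ["repository_disk"] if repo_disk_up else []


print(select_heartbeat_paths(tcpip_up=True, san_up=True, repo_disk_up=True))
print(select_heartbeat_paths(tcpip_up=False, san_up=False, repo_disk_up=True))
```

In this model the repository disk never appears alongside a working TCP/IP or SAN path, which mirrors the restricted state of the dpcom interface while the other networks are healthy.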

The heartbeat function is configured to use specific paths between nodes. This allows heartbeats to monitor the health of all PowerHA SystemMirror networks and network interfaces, as well as the cluster nodes themselves.

The heartbeat paths are set up automatically by CAA. You can optionally configure point-to-point and disk paths as part of the PowerHA SystemMirror configuration.