Quorum devices support on Pacemaker

A quorum device helps a cluster manager make cluster management decisions when the cluster manager's normal decision process does not produce a clear choice.

To select an action to take, a cluster manager counts the number of cluster domain nodes that support each of the potential actions. The cluster manager then selects the action that is supported by most of the cluster domain nodes. If the same number of cluster domain nodes supports more than one choice, then the cluster manager refers to a quorum device to make the choice.
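
The counting rule above can be sketched in a few lines of Python. This is only an illustrative model, not Pacemaker code: the function names `select_action` and `lowest_name_device` and the action labels are hypothetical, and the tie-breaker stands in for whatever quorum device the cluster uses.

```python
from collections import Counter
from typing import Callable, Iterable, Sequence


def select_action(
    node_votes: Iterable[str],
    quorum_device: Callable[[Sequence[str]], str],
) -> str:
    """Return the action backed by the most nodes; ask the device on a tie."""
    tally = Counter(node_votes)
    if not tally:
        raise ValueError("no votes cast")
    top = max(tally.values())
    # All actions that share the highest vote count, in a stable order.
    leaders = sorted(action for action, count in tally.items() if count == top)
    if len(leaders) == 1:
        return leaders[0]
    # Tie: the normal count gives no clear choice, so defer to the quorum device.
    return quorum_device(leaders)


def lowest_name_device(candidates: Sequence[str]) -> str:
    """Hypothetical stand-in tie-breaker: pick the alphabetically first action."""
    return min(candidates)
```

With two nodes that each vote for a different action, the count is tied and the result depends entirely on the tie-breaker that is passed in.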

Important: In Db2® 11.5.8 and later, Mutual Failover high availability is supported when using Pacemaker as the integrated cluster manager. In Db2 11.5.6 and later, the Pacemaker cluster manager for automated failover to HADR standby databases is packaged and installed with Db2. In Db2 11.5.5, Pacemaker is included and available for production environments. In Db2 11.5.4, Pacemaker is included as a technology preview only, for development, test, and proof-of-concept environments.
The Pacemaker cluster stack supports the following quorum devices:
  • Two-node quorum (default)
  • QDevice quorum
  • Majority quorum

Two-node quorum

The two-node quorum is the default mechanism. Because no tie-breaker mechanism exists, the two-node quorum is prone to the split-brain scenario. It is not intended for production environments.

QDevice quorum

QDevice is similar to a Network IP tiebreaker in that it requires an external resource to be accessible by all hosts in the current Pacemaker cluster. They differ in terms of reliability: QDevice is much more reliable because the quorum decision logic is more robust than a simple TCP/IP ping to the external IP address. QDevice requires that the resource is placed on a separate host, similar to a majority quorum requirement. However, setup is simplified, as the host with the QDevice does not need to be part of the Pacemaker quorum. The QDevice is the best blended quorum solution that combines reliability and simplicity.
Note: The cluster can have only one QNet server acting as the arbitrator.
Figure 1. An overview of the key components in a Pacemaker cluster that uses QDevice quorum
Consider the following information about QDevices:
  • A quorum device acts as the third-party arbitration device for the cluster. Its primary use is to allow a cluster to sustain more node failures than standard quorum rules allow.

  • As shown in Figure 1, a QDevice daemon (corosync-qdevice), which is a Corosync daemon process, runs on each node in the cluster. The QDevice daemon provides a configured number of votes to the quorum subsystem, based on a third-party arbitrator's decision. This third-party arbitrator is a separate Corosync QNet daemon (corosync-qnetd) that runs on a separate host that is not part of the cluster. The arbitrator's decision feeds into the corosync-qdevice logic that ultimately decides the surviving side in a split-brain scenario.

  • The QDevice daemon and the QNet daemon are provided in different software packages and must be installed separately. The QDevice daemon must be installed on each host in the cluster (Host1 and Host2 in Figure 1). The QNet daemon is needed only on a separate host that is not part of the cluster (Host3 in Figure 1).

  • The QNet daemon can act as the arbitrator for more than one cluster, provided that each cluster has a unique name. In Figure 1, two clusters share the same host that runs the qnetd process.

  • The arbitrator host running the QNet daemon does not need to be running the same operating system, or hardware architecture, as the Pacemaker cluster running the QDevice daemon.
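
The vote arithmetic behind these points can be illustrated with a simplified strict-majority model. The sketch assumes one vote per cluster node and one vote contributed by the QDevice. The function names are hypothetical, and the real quorum subsystem has more options than this model covers.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Strict majority: more than half of the expected votes."""
    return expected_votes // 2 + 1


def is_quorate(
    node_votes_in_partition: int,
    total_node_votes: int,
    qdevice_votes: int = 0,
    has_qdevice_vote: bool = False,
) -> bool:
    """Check whether one side of a partition holds a strict majority."""
    # The QDevice votes count toward the expected total whether or not
    # this partition is the one the arbitrator decided to back.
    expected = total_node_votes + qdevice_votes
    held = node_votes_in_partition + (qdevice_votes if has_qdevice_vote else 0)
    return held >= quorum_threshold(expected)
```

In this model, a lone node in a two-node cluster holds 1 of 2 votes and is not quorate. With a one-vote QDevice the expected total is 3, so the node that receives the arbitrator's vote holds 2 votes and stays quorate, while the other node does not.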
Figure 2. A two-node HADR configuration with QDevice configured.
Table 1. Scenarios and expected behaviors in a two-node HADR configuration with QDevice configured.
Link failures | Host1 instance state | Host1 quorum status | Host2 instance state | Host2 quorum status | Host3 state | Scenario description
1 | Up | Acq | Down | Lost | Up | The quorum device picks the node with the lowest node ID value to survive, which is Host1. Host2 loses quorum and the instance is taken down.
2 | Up | Acq | Up | Acq | Up | Host1 still has quorum and continues to work.
3 | Up | Acq | Up | Acq | Up | Host2 still has quorum and continues to work.
1 and 2 | Down | Lost | Up | Acq | Up | Host1 loses quorum and the instance is taken down.
2 and 3 | Up | Acq | Up | Acq | Down | Host1 and Host2 maintain quorum but are without the quorum service.
1 and 3 | Up | Acq | Down | Lost | Up | Host2 loses quorum and the instance is taken down.
All | Down | Lost | Down | Lost | Down | Both hosts lose quorum and the instances are taken down. The quorum service is also down.
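
The scenario descriptions in Table 1 can be reproduced with a small model. The sketch below rests on an assumption that is inferred from the rows rather than stated in the text: link 1 connects Host1 and Host2, link 2 connects Host1 and Host3, and link 3 connects Host2 and Host3. The names `evaluate`, `ARBITER_LINK`, and `PEER_LINK` are hypothetical, and "Host3 state" here means whether any cluster node can still reach the arbitrator.

```python
# Assumed link numbering, inferred from the scenario descriptions:
# 1 = Host1-Host2, 2 = Host1-Host3, 3 = Host2-Host3.
ARBITER_LINK = {1: 2, 2: 3}  # cluster node ID -> its link to Host3
PEER_LINK = 1                # link between the two cluster nodes


def evaluate(failed_links):
    """Return (host1_quorum, host2_quorum, host3_state) for the failed links."""
    failed = set(failed_links)
    # Cluster nodes that can still talk to the arbitrator on Host3.
    reaches_arbiter = [n for n in (1, 2) if ARBITER_LINK[n] not in failed]
    host3 = "Up" if reaches_arbiter else "Down"
    if PEER_LINK not in failed:
        # The nodes still see each other, so together they hold 2 of 3 votes
        # and keep quorum even without the arbitrator.
        return "Acq", "Acq", host3
    # Split: the arbitrator backs the lowest node ID that it can still reach.
    winner = min(reaches_arbiter, default=None)
    status = {n: "Acq" if n == winner else "Lost" for n in (1, 2)}
    return status[1], status[2], host3
```

In this model a host's instance state follows its quorum status: the instance stays up while quorum is acquired and is taken down when quorum is lost.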
For information on the requirements of the third arbitrator host, see Prerequisites for an integrated solution using Pacemaker.

Majority quorum

Majority quorum avoids split-brain scenarios by adding a third node to the cluster for arbitration. In a split-brain scenario, the side that successfully acquires the third node is the surviving side. The difference between QDevice quorum and majority quorum is that, with majority quorum, the third node is fully integrated into the cluster.
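
As a rough illustration of the rule that the side holding the third node survives, the sketch below picks the partition that contains a strict majority of a three-node cluster. The function name `surviving_side` and the host names are hypothetical, and the model assumes that every node carries one vote.

```python
def surviving_side(partitions, cluster_size=3):
    """Return the partition holding a strict majority of nodes, or None."""
    for side in partitions:
        if len(side) > cluster_size / 2:
            return side
    # No partition has more than half of the nodes, so no side survives.
    return None
```

For example, if host1 and host3 can still see each other while host2 is isolated, the partition that contains host1 and host3 is returned.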

SBD fencing

For Mutual Failover clustered environments, SBD fencing is configured along with QDevice. If a quorum loss occurs, the host that lost quorum is fenced (rebooted) by the watchdog that is configured in /dev/watchdog. If it is not already configured on your system, /dev/watchdog is configured as part of your Db2 installation.
Note: QDevice is required for production-level Mutual Failover environments.
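
To see whether a watchdog device node is already present on a host, a check along the following lines might be used. This is a minimal sketch and not part of Db2 or Pacemaker: the function name `watchdog_present` is hypothetical. It deliberately only inspects the file metadata and never opens the device, because opening a watchdog device can arm its timer.

```python
import os
import stat


def watchdog_present(path="/dev/watchdog"):
    """True if path exists and is a character device. Does not open it."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path or no permission to inspect it.
        return False
    return stat.S_ISCHR(mode)
```

A result of False only means that no character device was found at that path. It does not say whether SBD fencing itself is configured.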

Quorum consideration on public cloud vendors

For more information, refer to Public cloud vendors supported with Db2 Pacemaker.

Table 2. The advantages and disadvantages of each quorum type
Quorum type | Advantages | Disadvantages
Two-node | Simplest setup. No additional hardware or software configuration. | Potential split-brain scenario, leading to the dual primary phenomenon.
QDevice | More reliable than two-node quorum. No need to include the third host as part of the cluster. No need to include the full Pacemaker cluster software stack on the third host. Only one Corosync RPM is needed. | Requires a host that is accessible over TCP/IP from the primary and standby hosts.
Majority | More reliable than two-node quorum. | Requires the full Pacemaker cluster stack to be installed and configured on the third host.
Based on the advantages and disadvantages that are shown, the QDevice quorum is a capable quorum mechanism for Db2.
Note: Majority Quorum is not supported in the current release. It is being considered for a future release.