Quorum
Quorum uses a voting system to decide how many guests can fail before the cluster becomes non-operational.
A simple example
The following figure shows the basic concept of Quorum. In this example, there are three guests set up in a high availability (HA) cluster:
For example, if an operator wants to move the workload from Guest 1 to Guest 2, at least 2 out of 3 guests have to agree to the operation when it is triggered (cluster state is quorate). This also means that as soon as you lose Quorum, it is no longer possible to perform operations (cluster state is unquorate).
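The majority rule in this example can be sketched in a few lines of Python. This is only an illustration, assuming that every guest has exactly one vote and that quorate means a strict majority of all votes. The function names are hypothetical and are not part of any cluster tool.

```python
def is_quorate(votes_present: int, total_votes: int) -> bool:
    """Return True if the votes present form a strict majority (assumed rule)."""
    return votes_present > total_votes // 2


def tolerated_failures(total_votes: int) -> int:
    """Number of voters that can fail while a strict majority still remains."""
    return (total_votes - 1) // 2


# Three guests, one vote each (assumption for this sketch).
print(is_quorate(2, 3))       # 2 of 3 guests agree
print(is_quorate(1, 3))       # only 1 of 3 guests is left
print(tolerated_failures(3))  # guests that may fail in a 3-guest cluster
```

Under these assumptions, a three-guest cluster stays quorate with two guests and becomes unquorate as soon as a second guest is lost.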
A more complex example
If you use two CECs instead of one CEC, you can easily get into a split brain situation. A split brain situation means that the cluster can split into two (or more) smaller clusters which are all quorate. With two CECs, you get this problem no matter how many nodes you use. Look at the following examples with two and three nodes:
To solve this issue, a third system is introduced, which does not run any cluster resources and can only vote to make the cluster quorate. Look at the following example:
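The effect of the additional vote can be checked with simple arithmetic. The following Python sketch assumes one guest per CEC, one vote per system, and a strict-majority rule. The location names and the function are hypothetical and only serve to illustrate the idea.

```python
def survivors_quorate(votes_by_location: dict, failed: str) -> bool:
    """Return True if the locations that remain after `failed` is lost
    still hold a strict majority of all votes (assumed rule)."""
    total = sum(votes_by_location.values())
    remaining = total - votes_by_location[failed]
    return remaining > total // 2


# Assumed layouts: one voting guest per CEC, with and without a vote-only system.
without_arbitrator = {"CEC 1": 1, "CEC 2": 1}
with_arbitrator = {"CEC 1": 1, "CEC 2": 1, "arbitrator": 1}

for failed in ("CEC 1", "CEC 2"):
    print(
        failed,
        "lost:",
        "without arbitrator ->", survivors_quorate(without_arbitrator, failed),
        "| with arbitrator ->", survivors_quorate(with_arbitrator, failed),
    )
```

In this sketch, losing either CEC leaves one of two votes without the third system, which is not a majority, but two of three votes with it.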
A common problem here is that a guest outside of both CECs might not be available in every environment. The guest should also be reachable through a different network. As both CECs in this scenario are on the same site, a viable approach is the use of shared storage as a Quorum member. This is, however, currently not implemented.
To list both options:
- Separate system (arbitrator): A separate system that is reachable over two different networks.
- Shared storage (qdisk): This is not implemented yet with the current version of Pacemaker, but it was possible with older versions and is a common approach with other HA solutions. Where it is available, shared storage is usually the more convenient option.