Configuring failure detection in Cassandra
Configure failure detection in Cassandra to reduce temporary periods of unavailability caused by unexpected node failures or network errors.
When a Cassandra node fails unexpectedly or a networking error causes a particular Cassandra node to become unreachable, you might observe a temporary period of unavailability. This period of unavailability results from Cassandra attempting to contact a node that it does not yet know has failed or is unreachable.
To shorten the period of unavailability, set an appropriate value for the phi_convict_threshold property in the cassandra.yaml file. Configuring the property adjusts the sensitivity of the failure detector.
The default value for phi_convict_threshold is 8. Lower values increase the chance that an unresponsive node is marked as down, while higher values decrease the chance that a transient failure causes a node to be marked as down. According to the Cassandra documentation, a value lower than 5 or higher than 12 is not recommended.
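Before changing the value, it can help to check what the file currently contains. The following is a minimal sketch, not part of the product: show_phi_threshold is a hypothetical helper name, and the pattern assumes the setting may appear either active or commented out (in which case Cassandra uses its default).

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the product): print every line of a
# cassandra.yaml file that sets phi_convict_threshold, with its line number.
# Commented-out occurrences are included so you can tell whether the default
# is in effect.
show_phi_threshold() {
    grep -nE '^[[:space:]]*#?[[:space:]]*phi_convict_threshold:' "$1"
}
```

For example, show_phi_threshold <install_dir>/apache-cassandra/conf/cassandra.yaml prints the matching line or lines, and returns a non-zero status if the property does not appear in the file at all.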
To configure the phi_convict_threshold property for failure detection in Cassandra:
- Log in to the server where the Cassandra node is installed.
- To stop the Cassandra node, go to <install_dir>/MailboxUtilities/bin, and type ./stopGMData.sh.
- Go to the <install_dir>/apache-cassandra/conf directory.
- Open the cassandra.yaml file.
- Change the value of the phi_convict_threshold property as required.
- Restart the Cassandra node for the change to take effect. Go to <install_dir>/MailboxUtilities/bin, and type ./startGMData.sh.
- Repeat the steps on all Cassandra nodes.
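The edit in the procedure above can also be scripted. The following is a minimal sketch under stated assumptions: set_phi_threshold is a hypothetical helper name, the property is assumed to be a top-level key that is either present (possibly commented out) or absent, and the stop and start scripts are shown only as comments because <install_dir> depends on your installation.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the product): set phi_convict_threshold
# in a cassandra.yaml file.
# Usage: set_phi_threshold <path-to-cassandra.yaml> <value>
set_phi_threshold() {
    yaml_file=$1
    value=$2
    tmp_file="${yaml_file}.tmp"

    # Keep a copy of the original file so the change can be reverted.
    cp "$yaml_file" "${yaml_file}.bak" || return 1

    if grep -Eq '^[[:space:]]*#?[[:space:]]*phi_convict_threshold:' "$yaml_file"; then
        # The setting exists (active or commented out): rewrite that line.
        sed -E "s/^[[:space:]]*#?[[:space:]]*phi_convict_threshold:.*/phi_convict_threshold: ${value}/" \
            "$yaml_file" > "$tmp_file" && mv "$tmp_file" "$yaml_file"
    else
        # The setting is absent: append it as a top-level key.
        printf 'phi_convict_threshold: %s\n' "$value" >> "$yaml_file"
    fi
}

# Sequence on each node, following the steps above (paths are placeholders):
#   cd <install_dir>/MailboxUtilities/bin && ./stopGMData.sh
#   set_phi_threshold <install_dir>/apache-cassandra/conf/cassandra.yaml 10
#   cd <install_dir>/MailboxUtilities/bin && ./startGMData.sh
```

The helper writes a .bak copy next to the original file, so a value that proves too sensitive or too lenient can be rolled back before the node is restarted again.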