Cassandra cannot satisfy consistency level
The Cassandra cluster might not be able to satisfy the configured consistency level because the replication factor is incorrectly configured on one or more nodes. You can recover from this situation by diagnosing the problem with the messages.log file and verifying that the replication factor is correctly configured for all keyspaces on each Cassandra node.
Symptoms
Any action in the Global Mailbox management tool that involves mailboxes or messages results in the following message:
A system error has occurred. Please contact your system administrator.
Causes
Cassandra might not be able to satisfy the configured level of consistency for the following reasons:
- Replication configuration issue
  The replication properties are incorrectly configured for one or more Cassandra nodes.
- Cannot maintain quorum
  An issue with Cassandra prevents queries from satisfying EACH_QUORUM.
  Restriction: It might be possible to satisfy LOCAL_QUORUM on the same node. An event is raised because there is an issue with consistency in general for this node.
- Network connectivity issue
  Network problems are preventing Cassandra from communicating with enough nodes to successfully run a query.
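If you want to confirm the behavior directly, you can reproduce a query from a CQL shell with the consistency level set by hand. The following sketch is for illustration only; mailbox is one of the Global Mailbox keyspaces, but example_table is a hypothetical table name:
    cqlsh> CONSISTENCY EACH_QUORUM;
    cqlsh> -- The next read needs a quorum of replicas in every data center.
    cqlsh> -- If too few replicas respond, cqlsh reports an Unavailable error
    cqlsh> -- instead of returning rows.
    cqlsh> SELECT * FROM mailbox.example_table LIMIT 1;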
Environment
Windows and Linux®.
Diagnosing the problem
If you suspect that Cassandra is unable to satisfy the specified consistency level, you can
search the messages.log file for the appropriate error message at the time of
the failure:
- Go to the <install_directory>/usr/servers/defaultServer/logs directory.
- Open the messages.log file.
- Examine the log for events with the error ID CBXMD0040E that are followed by an event with the following message:
  W Could not execute query successfully due to lack of required Cassandra replicas. '{0}' replicas were required to successfully execute a query using a consistency level of '{1}' within keyspace '{2}', but only '{3}' replicas could be contacted.
The '{0}', '{1}', '{2}', and '{3}' variables provide the following information:
- '{0}' indicates the configured number of Cassandra replicas that are required to successfully run a query.
- '{1}' indicates the configured consistency level.
- '{2}' identifies the specific keyspace that contains the configuration information that is used by the failed query. Keyspaces contain the replication configuration information that is used by each type of query that is performed on the Global Mailbox system. Replication settings must be correctly configured for the following keyspaces:
  - scheduler
  - mailbox
  - event
  - replication
  - gatekeeper
- '{3}' identifies the number of replicas that were successfully contacted for the query.
The format of the error message is as follows:
[time stamp] [thread ID] [logging class] [logging level] [error ID]: [error message]
The following example events show the information that is logged in the messages.log file:
[mm/dd/yy hh:mm:ss:ms PDT] 00000063 com.ibm.mailbox.database.dao.cassandra.CassandraDAO E CBXMD0040E: An error has occurred while trying to connect to Cassandra.
[mm/dd/yy hh:mm:ss:ms PDT] 00000063 com.ibm.mailbox.database.dao.cassandra.CassandraDAO W Could not execute query successfully due to lack of required Cassandra replicas. 3 replicas were required to successfully execute a query using a consistency level of 'ALL' within keyspace 'UNDEFINED', but only 2 replicas could be contacted.
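If the messages.log file is large, a text search can surface these events quickly. The following is a minimal sketch that uses the standard grep command; replace <install_directory> with your actual installation path before you run it:
    # Print each CBXMD0040E event, with line numbers and the line that follows it
    grep -n -A 1 "CBXMD0040E" <install_directory>/usr/servers/defaultServer/logs/messages.log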
Resolving the problem
If any of the events in the messages.log file indicate that Cassandra cannot
satisfy the specified consistency level, verify that your Cassandra cluster is correctly configured:
- Collect information that defines the topology of your Cassandra cluster deployment.
  Tip: If you do not have records that specify your Cassandra cluster topology, you can use the nodetool program to determine the number of Cassandra nodes in your Global Mailbox system:
  - To run nodetool, JAVA_HOME must be set to the location of IBM JDK 8.
  - From the command line, run bin/nodetool status. The output of the nodetool status command is represented in the following example:
    Datacenter: datacenter1
    =======================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address      Load      Tokens  Owns (effective)  Host ID                               Rack
    UN  9.23.16.186  55.68 KB  256     49.1%             86282e2e-e4a2-4643-a077-0ca6ea32e138  rac1
    UN  9.23.16.184  41.22 KB  256     47.9%             5ca91d43-9154-4b22-b1bb-4b432d0bdf43  rac1

    Datacenter: datacenter2
    =======================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address      Load      Tokens  Owns (effective)  Host ID                               Rack
    UN  9.23.25.148  45.19 KB  256     49.2%             215a6b13-fcc2-47ce-bf4d-cd7bfa8fc52c  rac1
    UN  9.23.16.187  71.97 KB  256     53.7%             db01c093-96a8-4ee3-8a42-ede9a4ef41b5  rac1
  - To determine the total number of Cassandra nodes that are in your cluster, check the Address column. The Address column provides the IP address of each Cassandra node in the cluster.
  - Record the number of Cassandra nodes that each data center contains.
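If you prefer a quick tally over reading the table, the following shell sketch counts the nodes that nodetool status reports per data center. This is an illustrative one-liner, not part of the product; it assumes the same bin/nodetool path as above and keys only off the Datacenter and node status lines:
    bin/nodetool status | awk '
      /^Datacenter:/ { dc = $2 }       # remember the current data center name
      /^[UD][NLJM] / { count[dc]++ }   # count each node status line (UN, DN, and so on)
      END { for (d in count) print d, count[d] }'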
- Verify that the replication factor value is correctly configured for each Cassandra node. You
can use a Cassandra Query Language (CQL) shell to identify the replication configuration for each
keyspace for each node.
  - From the command line, type:
    cqlsh> SELECT * FROM system_schema.keyspaces;
    The following example shows the output of the CQL shell query:
    keyspace_name  durable_writes  strategy_class                                         strategy_options
    scheduler      True            org.apache.cassandra.locator.NetworkTopologyStrategy  {"datacenter1":"2","datacenter2":"2"}
    mailbox        True            org.apache.cassandra.locator.NetworkTopologyStrategy  {"datacenter1":"2","datacenter2":"2"}
    event          True            org.apache.cassandra.locator.NetworkTopologyStrategy  {"datacenter1":"2","datacenter2":"2"}
    replication    True            org.apache.cassandra.locator.NetworkTopologyStrategy  {"datacenter1":"2","datacenter2":"2"}
    system         True            org.apache.cassandra.locator.LocalStrategy            {}
    system_traces  True            org.apache.cassandra.locator.SimpleStrategy           {"replication_factor":"2"}
    gatekeeper     True            org.apache.cassandra.locator.NetworkTopologyStrategy  {"datacenter1":"2","datacenter2":"2"}
  - Ensure that the replication factor is correctly configured for the following keyspaces:
    - scheduler
    - mailbox
    - event
    - replication
    - gatekeeper
    For example, in a correctly configured deployment the values might be as follows:
    - For data center 1, the replication factor is 2
    - For data center 2, the replication factor is 3
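Depending on your Cassandra version, the keyspace metadata might appear as the strategy_class and strategy_options columns shown in the previous example, or as a single replication map in newer releases. Under the newer layout, a query like the following sketch narrows the check to the Global Mailbox keyspaces:
    cqlsh> SELECT keyspace_name, replication
       ... FROM system_schema.keyspaces
       ... WHERE keyspace_name IN ('scheduler', 'mailbox', 'event', 'replication', 'gatekeeper');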
- Optional: If the replication factor value is incorrect for a keyspace, you can update the
replication factor configuration:
  - From the command line, type ./stopGM.sh to stop all nodes that are running the Global Mailbox application.
  - Run the ALTER KEYSPACE command to update the replication factor configuration for a keyspace. Provide the correct replication factor value for the data center, or data centers, that are not correctly configured; if the keyspace is replicated to more than one data center, include every data center in the statement (see the sketch after this procedure). The following example shows the syntax that is required to update the configuration of datacenter1 with the ALTER KEYSPACE command:
    cqlsh> ALTER KEYSPACE scheduler WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 2};
    Important: Run the ALTER KEYSPACE command for each keyspace that is incorrectly configured.
  - Run the CQL shell on each Cassandra node in your Global Mailbox system to update the keyspaces that are incorrectly configured. The keyspaces on each Cassandra node must be configured with the correct replication factor.
  - Run the nodetool repair command on one online Cassandra node after all Cassandra nodes are correctly configured. To run nodetool, JAVA_HOME must be set to the location of IBM JDK 8.
  - Type ./startGM.sh to start all nodes that are running the Global Mailbox application.
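Note: the ALTER KEYSPACE statement replaces the entire replication map rather than merging it with the existing settings, so when a keyspace is replicated to several data centers, include every data center in the statement. A minimal sketch, using the data center names and replication factors from the earlier examples:
    cqlsh> ALTER KEYSPACE scheduler
       ... WITH replication = {'class': 'NetworkTopologyStrategy',
       ...                     'datacenter1': 2,
       ...                     'datacenter2': 3};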
- If your Cassandra cluster is correctly configured, check the status of the network for your Global Mailbox system.