Loss of Cassandra quorum causing a data center failure

Too many Cassandra nodes are down for the data center to reach quorum, resulting in a failed data center.

Symptoms

Fewer than two Cassandra nodes are operating in a data center.
Important: Each data center requires at least two Cassandra nodes to be operational.

Files cannot be uploaded or processed in the failed data center.
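
The two-node requirement above matches the usual Cassandra quorum rule, where a quorum is a majority of the replicas: floor(replication factor / 2) + 1. The following is a minimal sketch of that arithmetic, not product code. It assumes a replication factor of 3 in each data center, which this topic does not state, and the function names are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Return the number of replicas that must respond to form a quorum."""
    return replication_factor // 2 + 1


def data_center_has_quorum(live_nodes: int, replication_factor: int = 3) -> bool:
    """Return True if enough nodes are up to satisfy a quorum.

    Assumption: replication_factor defaults to 3 per data center, and every
    live node holds a replica of the data being queried.
    """
    return live_nodes >= quorum(replication_factor)


if __name__ == "__main__":
    for live in range(4):
        print(live, "live node(s): quorum met =", data_center_has_quorum(live))
```

Under that assumption, a data center with one live node cannot satisfy a quorum, while a data center with two live nodes can.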

Causes

Cassandra node failures can result from incidents such as network outages and hardware failures.

Environment

Windows, AIX, or Linux.

Diagnosing the problem

  1. Check IBM® Control Center for a red line connecting a Global Mailbox server to the Cassandra service within the Data center view.
  2. Check the Server component - Cassandra within IBM Control Center for a status of Down.
  3. Check for the following message in messages.log on the Global Mailbox Admin and in MEGLogging.log and globalmailbox.log on Sterling B2B Integrator:

    Caused by: com.datastax.driver.core.exceptions.UnavailableException: Not enough replicas available for query at consistency EACH_QUORUM.
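
The log check in step 3 can be scripted. The following is a minimal sketch that searches a set of log files for the message text shown above. The function name and the example paths are hypothetical, because the location of messages.log, MEGLogging.log, and globalmailbox.log depends on your installation.

```python
from pathlib import Path

# Message text taken from the exception shown in step 3.
NEEDLE = "Not enough replicas available for query at consistency EACH_QUORUM"


def find_quorum_errors(log_paths):
    """Return (path, line number, line) for each log line containing NEEDLE."""
    hits = []
    for path in map(Path, log_paths):
        if not path.is_file():
            continue  # Skip logs that do not exist on this host.
        with path.open(encoding="utf-8", errors="replace") as log:
            for lineno, line in enumerate(log, start=1):
                if NEEDLE in line:
                    hits.append((str(path), lineno, line.rstrip()))
    return hits


if __name__ == "__main__":
    # Hypothetical paths: replace with the log locations of your installation.
    logs = ["messages.log", "MEGLogging.log", "globalmailbox.log"]
    for path, lineno, line in find_quorum_errors(logs):
        print(f"{path}:{lineno}: {line}")
```

An empty result means only that the message was not found in the files that were searched. Still check the status in IBM Control Center as described in steps 1 and 2.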

Resolving the problem

No files can be processed in the failed data center, so traffic must be rerouted to the surviving data center until the problem is corrected. Refer to the following topics for information about how to recover from a Cassandra failure with minimal impact on your business operations.