Repairing Cassandra nodes

Routine maintenance is required to obtain optimal performance from Cassandra. Some maintenance is handled by a scheduled job, while other tasks must be completed manually.

Before you begin

  • The nodetool repair command must be run regularly to maintain Cassandra nodes. For more information, see Repairing nodes in the Cassandra documentation.
  • Locate nodetool, a binary that is bundled with Cassandra in the <install_dir>/apache-cassandra/bin directory.
  • Before running the repair, you can optionally throttle compaction throughput. This reduces the performance impact and speeds up the repair process. For more information, see Throttling compaction throughput.
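
As a minimal sketch of the optional throttling step, the following shell functions wrap the nodetool getcompactionthroughput and setcompactionthroughput subcommands. The NODETOOL variable and the function names are assumptions made for this illustration; see Throttling compaction throughput for the values recommended for your system.

```shell
#!/bin/sh
# Sketch: view and adjust compaction throughput around a repair.
# NODETOOL is an assumed variable; point it at the nodetool binary of your installation.
NODETOOL="${NODETOOL:-./nodetool}"

# Show the current setting so that it can be restored after the repair.
show_compaction_throughput() {
    "$NODETOOL" getcompactionthroughput
}

# Set the compaction throughput cap in MB/s. A value of 0 disables throttling.
set_compaction_throughput() {
    "$NODETOOL" setcompactionthroughput "$1"
}
```

For example, run show_compaction_throughput to note the current value, then set_compaction_throughput 16 before the repair. The value 16 is only a placeholder.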

About this task

The nodetool repair command can be used to run an incremental repair or a full repair. To help prevent cluster issues, use the -local option to restrict the repair to the local data center. In a typical schedule, the scheduled job runs an incremental repair daily and you run a full repair manually once a week. The right schedule can vary depending on your data and performance requirements.

You can also choose between a sequential repair and a parallel repair (the default).

Sequential repair
Sequential repair repairs nodes one after the other. This takes longer but uses fewer system resources at any one time.
Parallel repair
Parallel repair repairs all nodes at the same time. This shortens the repair time, but parallel repair uses many more system resources.

Avoid running a repair when the system is under heavy load. The repair reduces Global Mailbox throughput. Parallel repair reduces throughput more than sequential repair.
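
One way to avoid starting a repair under heavy load is to check the system load first. The following is a minimal sketch that assumes Linux (it reads /proc/loadavg) and assumes that the 1-minute load average is a reasonable indicator of load on your server. The load_below function name and the threshold are hypothetical and must be tuned for your environment.

```shell
#!/bin/sh
# Sketch: succeed only when the load is below a threshold.
# $1 = threshold (hypothetical value, tune for your server)
# $2 = load to compare (optional; defaults to the current 1-minute load average on Linux)
load_below() {
    threshold="$1"
    load="${2:-$(cut -d ' ' -f 1 /proc/loadavg)}"
    # awk handles the floating-point comparison; exit status 0 means "below threshold".
    awk -v l="$load" -v t="$threshold" 'BEGIN { exit !((l + 0) < (t + 0)) }'
}
```

For example, load_below 4.0 && ./nodetool repair -local starts the repair only when the current load average is below 4.0.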

To run the nodetool repair command:

Procedure

  1. Log in to the server where a Cassandra node is installed.
  2. Go to the <install_dir>/apache-cassandra/bin directory.
  3. Type ./nodetool repair -local to run an incremental, parallel repair. Add -full to run a full repair. Add -seq to run a sequential repair. Examples:
    ./nodetool repair -seq -local
    Runs an incremental, sequential repair.
    ./nodetool repair -full -local
    Runs a full, parallel repair.
    ./nodetool repair -full -local -seq
    Runs a full, sequential repair.
    Tip: To run a repair routinely on Linux, you can schedule the repair operation as a cron job. For information about creating cron jobs, see the documentation for your operating system.
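
The flag combinations above can be sketched as a small shell function that always adds -local and maps the words full and seq to the matching options. The run_repair function name, the NODETOOL variable, and the schedule in the comment are assumptions made for this illustration, not part of the product.

```shell
#!/bin/sh
# Sketch: run nodetool repair with the options described in the procedure.
# NODETOOL is an assumed variable; adjust it to <install_dir>/apache-cassandra/bin/nodetool.
NODETOOL="${NODETOOL:-./nodetool}"

# Accepted words: "full" (full repair) and "seq" (sequential repair).
# With no words, the result is an incremental, parallel repair.
run_repair() {
    flags=""
    for mode in "$@"; do
        case "$mode" in
            full) flags="$flags -full" ;;
            seq)  flags="$flags -seq" ;;
            *)    echo "unknown mode: $mode" >&2; return 1 ;;
        esac
    done
    # $flags is left unquoted on purpose so that each flag becomes its own argument.
    # The repair is always restricted to the local data center.
    "$NODETOOL" repair $flags -local
}

# Example crontab entry for a daily incremental repair at 02:00 (the time is a placeholder):
#   0 2 * * * cd <install_dir>/apache-cassandra/bin && ./nodetool repair -local
```

For example, run_repair full seq runs a full, sequential repair, and run_repair with no arguments runs an incremental, parallel repair.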