Transaction data in Cassandra must be compacted, or transaction speed decreases. You must
routinely perform major compactions in Cassandra to maintain Global Mailbox performance.
Before you begin
- Locate nodetool, a binary bundled with Cassandra.
- Verify that JAVA_HOME is set to the location of IBM JDK 8.
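The two prerequisites above can be checked from a shell before you start. The following is a minimal sketch, not part of the product: `check_prereqs` is a hypothetical helper name, and the `NODETOOL` variable is an assumption that lets you point at a nodetool binary that is not on your PATH. The script only confirms that a Java binary exists under JAVA_HOME and prints its version; confirming that it is IBM JDK 8 is left to you.

```shell
# Hypothetical pre-flight helper; the name and the NODETOOL variable are
# illustrative assumptions, not part of Cassandra or Global Mailbox.
check_prereqs() {
    # nodetool must be resolvable, either on PATH or through NODETOOL.
    if ! command -v "${NODETOOL:-nodetool}" >/dev/null 2>&1; then
        echo "nodetool not found; set NODETOOL to its full path" >&2
        return 1
    fi
    # JAVA_HOME must be set and contain an executable bin/java.
    if [ -z "${JAVA_HOME:-}" ] || [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "JAVA_HOME is not set to a JDK location" >&2
        return 1
    fi
    # Print the Java version so you can confirm it is the expected JDK.
    "$JAVA_HOME/bin/java" -version 2>&1
}
```

Call `check_prereqs` after defining it and inspect the printed version string before you continue with the procedure.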
About this task
When possible, rely on minor compactions instead of major compactions to address
performance concerns. Minor compactions are triggered automatically when you perform a
flush. Schedule compactions often enough that operations stay optimized, but not so often
that compactions overlap. Base the schedule for your Cassandra instances on your
business requirements and transaction characteristics.
To perform a major compaction:
Procedure
-
Monitor the average transaction time to gauge how often to perform major compactions.
-
Run nodetool with the following command:
nodetool --host <hostname> compact
If you omit --host, nodetool connects to the local Cassandra instance.
-
Run this command against each Cassandra node individually.
Important: Only one compaction can be performed at a time. Attempting to
run multiple compactions simultaneously results in compaction failures, which in turn cause
compactions to take longer.
Tip: Compacting frequently to keep the number of tombstones low, or at zero, does
not result in the fastest compaction time. A major compaction consolidates all existing SSTables
into a single SSTable. During compaction, disk space usage and disk I/O spike temporarily
because the old and new SSTables coexist, and the disk I/O of a major compaction can be
considerable.
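The procedure above runs the same command against each node, one node at a time. A minimal sketch of that loop follows. `compact_nodes` is a hypothetical helper name and the `NODETOOL` variable is an illustrative assumption. The sketch relies on `nodetool compact` returning only after the compaction on that node finishes, so the loop never starts a second compaction while one is still running; verify that behavior in your environment before relying on it.

```shell
# Hypothetical helper: run a major compaction on each host in turn.
# NODETOOL defaults to "nodetool" and can be overridden with a full path.
NODETOOL="${NODETOOL:-nodetool}"

compact_nodes() {
    for host in "$@"; do
        echo "Starting major compaction on $host"
        # Assumption: this call blocks until the node's compaction is done,
        # which keeps the compactions sequential.
        if ! "$NODETOOL" --host "$host" compact; then
            # Stop at the first failure instead of moving on to other nodes.
            echo "Compaction failed on $host; stopping" >&2
            return 1
        fi
    done
}
```

For example, `compact_nodes cass1.example.com cass2.example.com` compacts two nodes in order; the hostnames are placeholders for your own Cassandra nodes.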