Configuring the tombstone purge interval

In some environments, the number of tombstones created over a 10-day period may cause performance degradation. You can adjust the time that elapses before a record in Cassandra is eligible for tombstone purge.

About this task

Global Mailbox uses Apache Cassandra as its database. Cassandra is a distributed, peer-oriented (that is, no master) database that uses timestamped tombstones to mark deleted records. By default, tombstones live in the database for 10 days before they are eligible for purging by the Cassandra compaction tool.

In some environments, the number of tombstones created over a 10-day period is large enough to degrade performance. To reduce the impact of excess tombstones on Global Mailbox, you can shorten the tombstone purge interval, also known as the garbage collection interval.

The gc_grace_seconds parameter defines the minimum time, in seconds, that must elapse before a record is eligible for tombstone purge. You must configure the parameter separately for each database table in a Cassandra schema.
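Because gc_grace_seconds is expressed in seconds while the intervals in this topic are given in days, it can help to convert the value before you edit any table. The following Python sketch is only an illustration; the function name days_to_gc_grace_seconds is hypothetical and is not part of Global Mailbox or Cassandra.

```python
from datetime import timedelta


def days_to_gc_grace_seconds(days: float) -> int:
    """Convert an interval in days to whole seconds for gc_grace_seconds."""
    return int(timedelta(days=days).total_seconds())


print(days_to_gc_grace_seconds(10))  # 864000, the 10-day default
print(days_to_gc_grace_seconds(5))   # 432000
```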

To configure the tombstone purge interval (gc_grace_seconds):

Procedure

  1. Log in to the server where a Cassandra node is installed.
  2. Go to the <install_dir>/apache-cassandra/bin directory.
  3. Type ./cqlsh <host name/IP address>.
    Replace <host name/IP address> with the host name or IP address of the server where the Cassandra node is installed.
  4. At the cqlsh prompt, type DESC TABLES; to list the tables.
  5. For each table listed, run the command ALTER TABLE <keyspace_name>.<table_name> WITH gc_grace_seconds = <newTimeInSec>;
    Replace <keyspace_name> with the keyspace in which <table_name> is located, and <newTimeInSec> with the required integer value, in seconds, for the parameter.
    Tip: The garbage collection grace period is specified in seconds. To view the current value of gc_grace_seconds for a table, type DESC TABLE <keyspace_name>.<table_name>;
    Important: The default value for gc_grace_seconds is 10 days (864000 seconds). A shorter grace period comes with risk. If there is a data center or network outage, tombstones that are created and garbage collected in one data center during the outage might not be replicated to the other data center, and the deleted records might reappear after service is re-established. Unless unusual circumstances in your environment require it, do not set gc_grace_seconds to less than 5 days (432000 seconds) on any table.
    The gc_grace_seconds setting also affects the expiration of the hints that are generated for hinted handoff. It is therefore dangerous to reduce gc_grace_seconds below the duration of the hinted handoff window, which is 3 hours by default.
    Tombstone purging during compaction is safe only after a repair has propagated the tombstones to all replicas. A repair must therefore be completed before the compaction, and it requires system resources that can affect the operational performance of Global Mailbox.
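
Because the parameter must be set on every table, typing each statement by hand is error-prone. The following Python sketch generates the ALTER TABLE statements for a list of tables and rejects values below the default 3-hour hinted handoff window. It is an illustration under stated assumptions, not a supported tool: the function name, the keyspace name example_keyspace, and the table names are all hypothetical, and the generated statements must still be reviewed and run in cqlsh.

```python
# Default hinted handoff window (3 hours), as stated in the Important note.
HINT_WINDOW_SECONDS = 3 * 60 * 60


def build_alter_statements(keyspace, tables, gc_grace_seconds):
    """Return one ALTER TABLE statement per table (hypothetical helper)."""
    if gc_grace_seconds < HINT_WINDOW_SECONDS:
        raise ValueError(
            f"gc_grace_seconds={gc_grace_seconds} is below the hinted "
            f"handoff window of {HINT_WINDOW_SECONDS} seconds"
        )
    return [
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH gc_grace_seconds = {gc_grace_seconds};"
        for table in tables
    ]


# Hypothetical keyspace and table names; substitute the output of DESC TABLES.
five_days = 5 * 24 * 60 * 60
for statement in build_alter_statements(
    "example_keyspace", ["table_a", "table_b"], five_days
):
    print(statement)
```

The guard checks only the hinted handoff window. The 5-day recommendation is not enforced because some environments might have a reason to go lower.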