IBM Support

Troubleshooting TWCS sstables not being removed

Troubleshooting


Problem

When you use TimeWindowCompactionStrategy (TWCS) with DSE or Cassandra and the data is written with a Time To Live (TTL), you expect that there will not be any sstables older than the TTL period.
However, you may sometimes find that these old sstables are not removed.

This is generally caused by some data having been written without a TTL, which prevents the sstable from being dropped.

How do you find which sstables are causing the problem, and what data in the sstables is to blame?

Cause

TWCS is ideal to use when data is written in a time-series fashion and there is a TTL. When all of the data in an sstable has reached its TTL, the sstable can just be removed and does not need to be compacted. This is far more efficient as the sstable does not need to be read and merged with another sstable.

By default, sstables in TWCS tables are checked every 10 minutes to see whether they contain only fully expired data. This interval is controlled by the parameter 'expired_sstable_check_frequency_seconds'.

For background on how TWCS works, see the following links:
    https://thelastpickle.com/blog/2016/12/08/TWCS-part1.html
    https://docs.datastax.com/en/planning/oss/schema-tuning.html#time-window-compaction-strategy

TWCS writes data into time window buckets. The bucket size defaults to one day. At the end of each time window, all the sstables in that window bucket are compacted together into a single sstable. Within each bucket, compaction works in the same way as SizeTieredCompactionStrategy (STCS).

If you write all the data with a TTL of 30 days and have a one day bucket size, you should then have 30 sstables plus the sstables in the active bucket window.

So in this example if you start to see sstables older than 30 days, then the oldest sstable likely contains some data that is preventing the sstable from being removed. This will also prevent all the more recent sstables from being removed.
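The arithmetic behind this example can be sketched as follows. This is only a rough estimate: the function name and parameters are hypothetical and not part of any Cassandra or DSE tool. It assumes every write uses the same TTL and that each closed window has already been compacted into a single sstable. It also ignores any extra delay before removal, such as gc_grace_seconds or the interval between expiry checks.

```python
import math

DAY = 24 * 60 * 60  # seconds in one day


def expected_closed_windows(ttl_seconds: int, window_seconds: int) -> int:
    """Rough number of closed TWCS windows that can still hold unexpired data.

    Hypothetical helper for illustration only. Assumes a uniform TTL and one
    sstable per closed window.
    """
    return math.ceil(ttl_seconds / window_seconds)


# 30 day TTL with a one day window, as in the example above.
print(expected_closed_windows(30 * DAY, 1 * DAY))  # 30
```

If the number of sstables on disk is well above this estimate plus the sstables in the active window, that is a hint that something is blocking removal.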

Diagnosing The Problem

In order to find which sstable is causing the problem there are some commands that you can run.

These are sstableexpiredblockers, sstablemetadata and sstabledump.

 

The first one to run is the sstableexpiredblockers command.

This is run on the table and will list the sstables that are blocking others from being removed.

For example (formatted to make it more readable):

$ sstableexpiredblockers twcs_demo metrics_without_default_ttl

[TrieIndexSSTableReader(path='/var/lib/cassandra/data/twcs_demo/metrics_without_default_ttl-86d0d210543811f1b59be5ff2537d160/bb-8-bti-Data.db') (minTS = 1779281247404613, maxTS = 1779281247420479, maxLDT = 2147483647)], 
  blocks 1 expired sstables from getting dropped: 
    [TrieIndexSSTableReader(path='/var/lib/cassandra/data/twcs_demo/metrics_without_default_ttl-86d0d210543811f1b59be5ff2537d160/bb-9-bti-Data.db') (minTS = 1779281259304386, maxTS = 1779281299549392, maxLDT = 1779281599)],

[TrieIndexSSTableReader(path='/var/lib/cassandra/data/twcs_demo/metrics_without_default_ttl-86d0d210543811f1b59be5ff2537d160/bb-6-bti-Data.db') (minTS = 1779273941502975, maxTS = 1779274066714574, maxLDT = 2147483647)], 
  blocks 2 expired sstables from getting dropped: 
    [TrieIndexSSTableReader(path='/var/lib/cassandra/data/twcs_demo/metrics_without_default_ttl-86d0d210543811f1b59be5ff2537d160/bb-9-bti-Data.db') (minTS = 1779281259304386, maxTS = 1779281299549392, maxLDT = 1779281599)], 
    [TrieIndexSSTableReader(path='/var/lib/cassandra/data/twcs_demo/metrics_without_default_ttl-86d0d210543811f1b59be5ff2537d160/bb-7-bti-Data.db') (minTS = 1779281061316262, maxTS = 1779281101577999, maxLDT = 1779281401)],

 

So bb-8-bti-Data.db is preventing bb-9-bti-Data.db, and bb-6-bti-Data.db is preventing bb-7-bti-Data.db and bb-9-bti-Data.db from being removed.
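When a table has many sstables, reading this output by eye becomes tedious. The following is a minimal sketch of how the output could be turned into a blocker-to-blocked map. It assumes the output has the same shape as the example above; the function name and the shortened sample paths are hypothetical, and the real output format may differ between versions.

```python
import re
from typing import Dict, List

# Matches either an sstable path or the "blocks N expired sstables" marker.
TOKEN = re.compile(r"path='([^']+)'|blocks (\d+) expired sstables")


def parse_blockers(output: str) -> Dict[str, List[str]]:
    """Map each blocking sstable path to the list of sstables it blocks."""
    blockers: Dict[str, List[str]] = {}
    current = None   # path of the blocker currently being read
    remaining = 0    # blocked paths still expected for that blocker
    for match in TOKEN.finditer(output):
        path, count = match.groups()
        if count is not None:
            # "blocks N expired sstables": the next N paths are blocked ones.
            remaining = int(count)
        elif remaining > 0 and current is not None:
            blockers[current].append(path)
            remaining -= 1
        else:
            # Any other path starts a new blocker entry.
            current = path
            blockers.setdefault(current, [])
    return blockers


# Shortened, hypothetical paths; the structure follows the example output above.
SAMPLE = """
[TrieIndexSSTableReader(path='/data/bb-8-bti-Data.db') (minTS = 1, maxTS = 2, maxLDT = 2147483647)],
  blocks 1 expired sstables from getting dropped:
    [TrieIndexSSTableReader(path='/data/bb-9-bti-Data.db') (minTS = 3, maxTS = 4, maxLDT = 5)],
[TrieIndexSSTableReader(path='/data/bb-6-bti-Data.db') (minTS = 0, maxTS = 1, maxLDT = 2147483647)],
  blocks 2 expired sstables from getting dropped:
    [TrieIndexSSTableReader(path='/data/bb-9-bti-Data.db') (minTS = 3, maxTS = 4, maxLDT = 5)],
    [TrieIndexSSTableReader(path='/data/bb-7-bti-Data.db') (minTS = 2, maxTS = 3, maxLDT = 4)],
"""

for blocker, blocked in parse_blockers(SAMPLE).items():
    print(blocker, "->", blocked)
```

The keys of the resulting map are the sstables to inspect next with sstablemetadata.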

 

You can then run sstablemetadata on bb-6-bti-Data.db and bb-8-bti-Data.db to see what these contain.

For example:

$ sstablemetadata /var/lib/cassandra/data/twcs_demo/metrics_without_default_ttl-86d0d210543811f1b59be5ff2537d160/bb-6-bti-Data.db
SSTable: /var/lib/cassandra/data/twcs_demo/metrics_without_default_ttl-86d0d210543811f1b59be5ff2537d160/bb-6-bti
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.01
Minimum timestamp: 1779273941502975 (05/20/2026 10:45:41)
Maximum timestamp: 1779274066714574 (05/20/2026 10:47:46)
SSTable min local deletion time: 1779273941 (05/20/2026 10:45:41)
SSTable max local deletion time: 2147483647 (no tombstones)
Compressor: org.apache.cassandra.io.compress.LZ4Compressor
Compression ratio: 0.47571238742064076
TTL min: 0
TTL max: 300 (5 minutes)
First token: 1644045339670275309 (sensor_002)
Last token: 1810975871054100166 (sensor_003)
covered clusterings: [2026-05-20 10:47Z, 2026-05-20 10:45Z]
Estimated droppable tombstones: 1.5403225806451613
SSTable Level: 0
Repaired at: 0
Pending repair: --
Replay positions covered: {CommitLogPosition(segmentId=1759241510069, position=10629012)=CommitLogPosition(segmentId=1759241510069, position=11458913)}
totalColumnsSet: 240
totalRows: 240
Estimated tombstone drop times: 
   Drop Time                        | Count  (%)  Histogram 
   1779273960 (05/20/2026 10:46:00) |    31 (  8) OOOOOOOOOOO.
   1779274020 (05/20/2026 10:47:00) |    82 ( 21) OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
   1779274080 (05/20/2026 10:48:00) |    78 ( 20) OOOOOOOOOOOOOOOOOOOOOOOOOOOO.
   1779274260 (05/20/2026 10:51:00) |    31 (  8) OOOOOOOOOOO.
   1779274320 (05/20/2026 10:52:00) |    82 ( 21) OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
   1779274380 (05/20/2026 10:53:00) |    78 ( 20) OOOOOOOOOOOOOOOOOOOOOOOOOOOO.
   Percentiles
   50th      1996099046 (04/02/2033 23:57:26)
   75th      1996099046 (04/02/2033 23:57:26)
   95th      1996099046 (04/02/2033 23:57:26)
   98th      1996099046 (04/02/2033 23:57:26)
   99th      1996099046 (04/02/2033 23:57:26)
   Min       1663415873 (09/17/2022 11:57:53)
   Max       1996099046 (04/02/2033 23:57:26)
Partition Size: 
   Size (bytes)  | Count  (%)  Histogram 
   3973 (3.9 kB) |     2 (100) OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
   Percentiles
   50th      3973 (3.9 kB)
   75th      3973 (3.9 kB)
   95th      3973 (3.9 kB)
   98th      3973 (3.9 kB)
   99th      3973 (3.9 kB)
   Min       3312 (3.2 kB)
   Max       3973 (3.9 kB)
Column Count: 
   Columns | Count  (%)  Histogram 
   124     |     2 (100) OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
   Percentiles
   50th      124
   75th      124
   95th      124
   98th      124
   99th      124
   Min       104
   Max       124
Estimated cardinality: 2
EncodingStats minTTL: 0
EncodingStats minLocalDeletionTime: 1779274241 (05/20/2026 10:50:41)
EncodingStats minTimestamp: 1779273941502975 (05/20/2026 10:45:41)
KeyType: org.apache.cassandra.db.marshal.UTF8Type
ClusteringTypes: [org.apache.cassandra.db.marshal.ReversedType(org.apache.cassandra.db.marshal.TimestampType)]
StaticColumns: 

RegularColumns: value:org.apache.cassandra.db.marshal.DoubleType

 

The main items to look at in the output are:

SSTable: /var/lib/cassandra/data/twcs_demo/metrics_without_default_ttl-86d0d210543811f1b59be5ff2537d160/bb-6-bti
Minimum timestamp: 1779273941502975 (05/20/2026 10:45:41)
Maximum timestamp: 1779274066714574 (05/20/2026 10:47:46)
SSTable min local deletion time: 1779273941 (05/20/2026 10:45:41)
SSTable max local deletion time: 2147483647 (no tombstones)
TTL min: 0
TTL max: 300 (5 minutes)
First token: 1644045339670275309 (sensor_002)
Last token: 1810975871054100166 (sensor_003)
covered clusterings: [2026-05-20 10:47Z, 2026-05-20 10:45Z]
Estimated droppable tombstones: 1.5403225806451613
Estimated cardinality: 2
EncodingStats minTTL: 0
EncodingStats minLocalDeletionTime: 1779274241 (05/20/2026 10:50:41)
EncodingStats minTimestamp: 1779273941502975 (05/20/2026 10:45:41)

 

This compares to the output of one of the sstables that is being blocked from being removed:

SSTable: /var/lib/cassandra/data/twcs_demo/metrics_without_default_ttl-86d0d210543811f1b59be5ff2537d160/bb-7-bti
Minimum timestamp: 1779281061316262 (05/20/2026 12:44:21)
Maximum timestamp: 1779281101577999 (05/20/2026 12:45:01)
SSTable min local deletion time: 1779281361 (05/20/2026 12:49:21)
SSTable max local deletion time: 1779281401 (05/20/2026 12:50:01)
TTL min: 300 (5 minutes)
TTL max: 300 (5 minutes)
First token: 1644045339670275309 (sensor_002)
Last token: 1810975871054100166 (sensor_003)
covered clusterings: [2026-05-20 12:45Z, 2026-05-20 12:44Z]
Estimated droppable tombstones: 2.0
Estimated cardinality: 2
EncodingStats minTTL: 300 (5 minutes)
EncodingStats minLocalDeletionTime: 1779281361 (05/20/2026 12:49:21)
EncodingStats minTimestamp: 1779281061316262 (05/20/2026 12:44:21)

 

Both sstables contain the same first and last token.

 

bb-7-bti has both "TTL min" and "TTL max" set to 300, so all of its data was written with a TTL. The "Estimated droppable tombstones" and "Estimated cardinality" values are the same, so all records are expired and the sstable should be eligible for removal.
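The deletion times reported for bb-7-bti can be cross-checked by adding the TTL to the write timestamps. The sketch below does this with the values from the output above. It assumes the write timestamps are wall-clock microseconds since the epoch and that the times shown are UTC, which appears to hold for this sample; the helper names are hypothetical.

```python
from datetime import datetime, timezone


def expiry_from_write(write_ts_micros: int, ttl_seconds: int) -> int:
    """Approximate local deletion time (epoch seconds) for data written with a TTL."""
    return write_ts_micros // 1_000_000 + ttl_seconds


def as_utc(epoch_seconds: int) -> str:
    """Format epoch seconds as a UTC date and time string."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime("%m/%d/%Y %H:%M:%S")


# Minimum and maximum timestamps and the TTL copied from the bb-7-bti output.
min_ldt = expiry_from_write(1779281061316262, 300)
max_ldt = expiry_from_write(1779281101577999, 300)
print(min_ldt, as_utc(min_ldt))  # 1779281361 05/20/2026 12:49:21
print(max_ldt, as_utc(max_ldt))  # 1779281401 05/20/2026 12:50:01
```

Both results match the "SSTable min local deletion time" and "SSTable max local deletion time" lines, which is what you would expect when every row carries the same 300 second TTL.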

In contrast, bb-6-bti has "TTL min" set to 0 and "TTL max" set to 300, so some rows do not have a TTL (TTL is 0). The "Estimated droppable tombstones" and "Estimated cardinality" values are different, so there are some rows that cannot be dropped. The "SSTable max local deletion time: 2147483647 (no tombstones)" line also shows that there are rows without a TTL.
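These checks can be scripted when several sstables need to be inspected. The following is a minimal sketch that reads the text printed by sstablemetadata and flags an sstable that appears to contain data without a TTL. The function name is hypothetical, the field labels are taken from the sample output above and may differ in other versions, and 2147483647 is treated as "never expires" only because that is how it appears in this sample.

```python
import re

# Value shown in the sample output for data that has no deletion time.
NO_DELETION_TIME = 2147483647

# Labels as they appear at the start of a line in the sample output.
LABELS = {
    "ttl_min": "TTL min",
    "ttl_max": "TTL max",
    "max_ldt": "SSTable max local deletion time",
}


def summarize_metadata(text: str) -> dict:
    """Extract the TTL fields and flag sstables holding data without a TTL."""
    fields = {}
    for key, label in LABELS.items():
        match = re.search(rf"^{re.escape(label)}:\s*(\d+)", text, flags=re.MULTILINE)
        fields[key] = int(match.group(1)) if match else None
    fields["has_data_without_ttl"] = (
        fields["ttl_min"] == 0 or fields["max_ldt"] == NO_DELETION_TIME
    )
    return fields


# Relevant lines from the two outputs above.
BB6 = """SSTable max local deletion time: 2147483647 (no tombstones)
TTL min: 0
TTL max: 300 (5 minutes)
EncodingStats minTTL: 0
"""
BB7 = """SSTable max local deletion time: 1779281401 (05/20/2026 12:50:01)
TTL min: 300 (5 minutes)
TTL max: 300 (5 minutes)
EncodingStats minTTL: 300 (5 minutes)
"""

print(summarize_metadata(BB6)["has_data_without_ttl"])  # True
print(summarize_metadata(BB7)["has_data_without_ttl"])  # False
```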

 

sstabledump can then be run to see which data is causing the issue. It dumps the whole contents of the sstable, so the output is likely to be very large. Because the output contains your data, do not send it to IBM DataStax support unless it contains only test data.

For example:

$ sstabledump /var/lib/cassandra/data/twcs_demo/metrics_without_default_ttl-86d0d210543811f1b59be5ff2537d160/bb-6-bti-Data.db
[
  {
    "partition" : {
      "key" : [ "sensor_002" ],
      "position" : 0
    },
    "rows" : [
...
      {
        "type" : "row",
        "position" : 256,
        "clustering" : [ "2026-05-20 10:47:47.668Z" ],
        "liveness_info" : { "tstamp" : "2026-05-20T10:47:46.700742Z", "ttl" : 300, "expires_at" : "2026-05-20T10:52:46Z", "expired" : true },
        "cells" : [
          { "name" : "value", "deletion_info" : { "local_delete_time" : "2026-05-20T10:47:46Z" }
          }
        ]
      },
...
      {
        "type" : "row",
        "position" : 314,
        "clustering" : [ "2026-05-20 10:47:45.607Z" ],
        "liveness_info" : { "tstamp" : "2026-05-20T10:47:36.650198Z" },
        "cells" : [
          { "name" : "value", "value" : 27.23682555838757 }
        ]
      },
    ]
  }
]

 

The row at position 256 has a "ttl" value, an "expires_at" value and "expired" set to true in its "liveness_info". The row at position 314 has none of these, so it is live data that will not be removed because it was written without a TTL.

The row without the TTL therefore has a partition key of "sensor_002" and a clustering column value of "2026-05-20 10:47:45.607Z".
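Because the dump is usually large, it can help to filter it rather than read it by eye. The following is a minimal sketch that lists rows whose "liveness_info" has no "ttl" entry. It assumes the dump is complete, valid JSON saved from sstabledump, and it only checks row-level "liveness_info" as in the sample above; TTLs recorded on individual cells are not inspected. The function name is hypothetical, and loading the whole dump into memory may not be practical for very large sstables.

```python
import json
from typing import List, Tuple


def rows_without_ttl(dump_json: str) -> List[Tuple[list, list]]:
    """Return (partition key, clustering) pairs for rows with no row-level TTL."""
    found = []
    for partition in json.loads(dump_json):
        key = partition["partition"]["key"]
        for row in partition.get("rows", []):
            if row.get("type") != "row":
                continue  # skip range tombstone markers and similar entries
            if "ttl" not in row.get("liveness_info", {}):
                found.append((key, row.get("clustering")))
    return found


# Cut-down sample in the same shape as the dump above.
SAMPLE = """
[
  {
    "partition" : { "key" : [ "sensor_002" ], "position" : 0 },
    "rows" : [
      {
        "type" : "row",
        "position" : 256,
        "clustering" : [ "2026-05-20 10:47:47.668Z" ],
        "liveness_info" : { "tstamp" : "2026-05-20T10:47:46.700742Z", "ttl" : 300,
                            "expires_at" : "2026-05-20T10:52:46Z", "expired" : true },
        "cells" : [ { "name" : "value",
                      "deletion_info" : { "local_delete_time" : "2026-05-20T10:47:46Z" } } ]
      },
      {
        "type" : "row",
        "position" : 314,
        "clustering" : [ "2026-05-20 10:47:45.607Z" ],
        "liveness_info" : { "tstamp" : "2026-05-20T10:47:36.650198Z" },
        "cells" : [ { "name" : "value", "value" : 27.23682555838757 } ]
      }
    ]
  }
]
"""

for key, clustering in rows_without_ttl(SAMPLE):
    print(key, clustering)
```

With the sample above this prints the partition key and clustering value of the row at position 314 only.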

 

There is a way to change the default behaviour with the parameter "unsafe_aggressive_sstable_expiration".

As the name suggests this should only be used if you know what issues it may cause.

Turning on this flag can cause correctness issues, such as the re-appearing of deleted data. See discussions in CASSANDRA-13418:
    https://issues.apache.org/jira/browse/CASSANDRA-13418

This setting disables the checks that normally stop fully expired sstables from being removed. When an sstable contains only expired data, DSE/Cassandra will simply remove it.

Note: This does not remove the sstables that are blocking the others; it only removes the sstables that are fully expired.

This can be added to the compaction settings:

AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '1', 'compaction_window_unit': 'DAYS', 'max_threshold': '32', 'min_threshold': '4', 'unsafe_aggressive_sstable_expiration': 'true'}

 

This parameter takes effect only if the node is started with the following -D parameter in the cassandra-env.sh or jvm*.options file:

-Dcassandra.allow_unsafe_aggressive_sstable_expiration=true

 

Document Location

Worldwide


Document Information

Modified date:
27 May 2026

UID

ibm17274163