IBM Support

How to performance tune data streaming activities like repair and bootstrap

Troubleshooting


Problem

Summary

You can tune performance during streaming processes such as repair and bootstrap. Throttle these processes if your nodes become overloaded, or unthrottle them to allow repair or bootstrap to complete more quickly.


Symptoms

If, while repair or bootstrap is running, top shows that your nodes are under heavy load, iostat shows heavy I/O on your disks, and network utilization is high, you may want to throttle the repair or bootstrap process.


Conversely, you may find that while repair or bootstrap is running, top shows minimal load on the nodes, iostat shows little I/O on your disks, and the network has plenty of available bandwidth. Under these circumstances, you may want to unthrottle the process so that repair or bootstrap completes more quickly.
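The three checks described above can be gathered in one pass. The following is a minimal sketch, assuming a Linux node with the procps version of top and the sysstat package (iostat and sar) installed. The function name snapshot_node_load and the 5-second, 3-sample intervals are illustrative choices, not values from this document.

```shell
# Hypothetical helper: print a one-off snapshot of CPU, disk, and network load.
snapshot_node_load() {
    # Batch mode (-b), one iteration (-n 1); keep only the summary header.
    top -b -n 1 | head -n 5
    # Extended per-device disk statistics: 3 reports, 5 seconds apart.
    iostat -x 5 3
    # Per-interface network throughput: 3 reports, 5 seconds apart.
    sar -n DEV 5 3
}
```

Running snapshot_node_load on each node while repair or bootstrap is in progress gives a comparable view of load across the cluster before you decide whether to throttle or unthrottle.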


Cause

The repair process performs validation compaction and streams data from other nodes in the cluster. The bootstrap process only streams data from other nodes in the cluster. The cassandra.yaml file contains the following parameters, which limit the rate of compaction and the outbound stream throughput on the network:
 

compaction_throughput_mb_per_sec (Default 16)
stream_throughput_outbound_megabits_per_sec (Default 200)

 

The default values may be too high or too low for your environment.
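To see what a node is configured with, you can search its cassandra.yaml for the two parameters. The sketch below is illustrative only: the function names are hypothetical, and the file path is passed in by the caller because its location depends on how DSE was installed. It also shows a unit conversion, since one parameter is expressed in megabytes per second and the other in megabits per second.

```shell
# Hypothetical helper: list the two throttle settings found in a cassandra.yaml file.
#   $1 = path to cassandra.yaml (varies by installation)
show_yaml_throttles() {
    # Match the settings whether they are active or commented out.
    grep -E '^[#[:space:]]*(compaction_throughput_mb_per_sec|stream_throughput_outbound_megabits_per_sec):' "$1"
}

# Convert megabits per second to whole megabytes per second (8 bits per byte).
megabits_to_megabytes() {
    echo $(( $1 / 8 ))
}
```

For example, megabits_to_megabytes 200 prints 25, so the default stream limit of 200 megabits per second corresponds to 25 megabytes per second of outbound traffic.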


Solution

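Changing the compaction_throughput_mb_per_sec and stream_throughput_outbound_megabits_per_sec parameters in cassandra.yaml requires a restart of DSE for the change to be picked up. However, you can adjust these parameters on the fly using the following nodetool commands. Values set this way take effect immediately but are not written to cassandra.yaml, so the configured values apply again after the next restart.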

nodetool setcompactionthroughput
nodetool setstreamthroughput
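
Each command takes the new limit as its argument: megabytes per second for compaction and megabits per second for streaming. The following is a minimal sketch of throttling both, assuming nodetool is on the PATH and connects to the local node. The function name set_throughput and the example values are illustrative, not recommendations.

```shell
# Hypothetical helper: apply new limits on the local node without a restart.
#   $1 = compaction limit in megabytes per second
#   $2 = outbound stream limit in megabits per second
set_throughput() {
    nodetool setcompactionthroughput "$1" &&
    nodetool setstreamthroughput "$2"
}
```

For example, set_throughput 8 100 halves both default limits on the node where it is run.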

 

Setting both of these parameters to 0 unthrottles compaction and data streaming on the network, but be careful not to overload your nodes. Set the stream_throughput_outbound_megabits_per_sec parameter to the same value on all your nodes because, as the name states, it limits only the outbound traffic from each node.
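
Because the stream limit should match across the cluster, it can help to apply it in a loop. The sketch below assumes that remote JMX access is enabled so that nodetool -h can reach each node; if it is not, run the command locally on every node instead. The function name and the host names in the usage example are hypothetical.

```shell
# Hypothetical helper: set the same outbound stream limit on several nodes.
#   $1 = limit in megabits per second (0 removes the limit)
#   remaining arguments = host names or IP addresses of the nodes
set_stream_throughput_on_nodes() {
    limit=$1
    shift
    for host in "$@"; do
        # Report a failure but continue with the remaining nodes.
        nodetool -h "$host" setstreamthroughput "$limit" ||
            echo "failed to set stream throughput on $host" >&2
    done
}
```

For example, set_stream_throughput_on_nodes 0 node1 node2 node3 would unthrottle outbound streaming on three nodes.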


You can obtain the current setting for these values using these nodetool commands:

nodetool getcompactionthroughput
nodetool getstreamthroughput
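
Neither command takes an argument. A minimal sketch that reports both limits together, assuming nodetool is on the PATH and connects to the local node; the function name show_throughput is hypothetical.

```shell
# Hypothetical helper: print the limits currently in effect on the local node.
show_throughput() {
    nodetool getcompactionthroughput
    nodetool getstreamthroughput
}
```

Recording this output before you change anything makes it easier to restore the earlier values afterwards.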


To determine the right values for these parameters, make small adjustments and monitor the effect on your nodes before making further changes. Once the repair or bootstrap process is complete, you may want to revert these parameters to their default values.
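
The adjust-and-monitor approach can be sketched as two small helpers. This is an illustration under stated assumptions, not a prescribed procedure: the function names and step sizes are hypothetical, nodetool is assumed to be on the PATH, and the restore helper uses the defaults listed in this document (16 and 200). If your cassandra.yaml sets different values, restore those instead.

```shell
# Hypothetical helper: move the stream limit by a small step.
#   $1 = current limit in megabits per second
#   $2 = signed step, for example 25 or -25
# The result is clamped to at least 1, because 0 would remove the limit entirely.
step_stream_throughput() {
    next=$(( $1 + $2 ))
    if [ "$next" -lt 1 ]; then
        next=1
    fi
    nodetool setstreamthroughput "$next"
}

# Hypothetical helper: restore the default limits once repair or bootstrap is complete.
restore_default_throughput() {
    nodetool setcompactionthroughput 16 &&
    nodetool setstreamthroughput 200
}
```

For example, step_stream_throughput 200 -25 lowers the stream limit to 175 megabits per second. Check the load on your nodes after each step before applying another.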
 

Last Reviewed Date: 2023/12/29

Document Location

Worldwide


Historical Number

ka0Ui0000000R0DIAU

Document Information

Modified date:
30 January 2026

UID

ibm17258546