Troubleshooting
Problem
Symptom
A stream failure error can occur when a node is being decommissioned, when a node is being replaced with a new node, or, in some cases, after bootstrapping. The error log looks like this:
INFO [StreamReceiveTask:5341] 2020-07-31 05:56:48,620 StreamResultFuture.java:180 - [Stream #44008ac0-d234-11ea-b48c-e94aabceab9f] Session with /10.192.170.115 is complete
WARN [StreamReceiveTask:5341] 2020-07-31 05:56:48,627 StreamResultFuture.java:207 - [Stream #44008ac0-d234-11ea-b48c-e94aabceab9f] Stream failed
ERROR [main] 2020-07-31 05:56:48,628 CassandraDaemon.java:583 - Exception encountered during startup
java.lang.RuntimeException: Error during boostrap: Stream failed
The Apache Cassandra logs also contain broken pipe errors:
ERROR [STREAM-OUT-/10.192.148.41] 2020-07-30 09:31:56,582 StreamSession.java:515 - [Stream #44008ac0-d234-11ea-b48c-e94aabceab9f] Streaming error occurred
java.io.IOException: Broken pipe
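To see which stream sessions are affected, the log entries above can be grouped by stream ID. The following is a minimal sketch, not an official tool: the function name `find_stream_failures`, the marker strings, and the `SAMPLE` list are illustrative assumptions based only on the log excerpts shown in this document, and real log formats may differ between versions.

```python
import re
from collections import defaultdict

# Matches the "[Stream #<uuid>]" tag seen in the log excerpts above.
STREAM_ID = re.compile(r"\[Stream #([0-9a-f-]+)\]")
# Substrings treated as failure indicators (an assumption, taken from the excerpts).
MARKERS = ("Stream failed", "Streaming error occurred", "Broken pipe")

def find_stream_failures(lines):
    """Group failure-related log lines by stream session ID."""
    failures = defaultdict(list)
    last_id = None
    for line in lines:
        match = STREAM_ID.search(line)
        if match:
            last_id = match.group(1)
        if any(marker in line for marker in MARKERS):
            # Exception lines carry no stream ID, so they are attributed
            # to the most recent ID seen (None if no ID was seen yet).
            key = match.group(1) if match else last_id
            failures[key].append(line.strip())
    return dict(failures)

SAMPLE = [
    "WARN [StreamReceiveTask:5341] 2020-07-31 05:56:48,627 StreamResultFuture.java:207 - "
    "[Stream #44008ac0-d234-11ea-b48c-e94aabceab9f] Stream failed",
    "ERROR [STREAM-OUT-/10.192.148.41] 2020-07-30 09:31:56,582 StreamSession.java:515 - "
    "[Stream #44008ac0-d234-11ea-b48c-e94aabceab9f] Streaming error occurred",
    "java.io.IOException: Broken pipe",
]

for stream_id, hits in find_stream_failures(SAMPLE).items():
    print(stream_id, len(hits))
```

To scan a real log, pass an open file object instead of `SAMPLE`; the log file path depends on the installation.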
Analysis
Stream failure can occur for a variety of reasons:
- Network failures
- Overloaded or under-provisioned nodes
- Running repairs
- Long GC pauses
- SSTable corruption
The broken pipe exception suggests that the streaming failure is due to node connectivity problems, which places this error in the network failures category.
A pipe connects two processes as a stream. One process holds the read end of the pipe, and the other holds the write end. Data written to the pipe is stored in a buffer until the other process retrieves it. If either end disconnects while data is being read or written, the pipe is broken, and the streaming failure surfaces as a Broken pipe exception.
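The behavior described above can be reproduced on a small scale. This is an illustrative sketch only, assuming a POSIX system and a local operating-system pipe; the function name `write_after_reader_closed` is hypothetical. Cassandra streams over network connections rather than local pipes, but a write to a connection whose peer has gone away typically fails with the same underlying operating-system error.

```python
import os

def write_after_reader_closed():
    """Return the name of the exception raised when writing to a pipe with no reader."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the reading side goes away, like a peer dropping off
    try:
        # With no reader left, the write fails instead of being buffered.
        os.write(write_fd, b"chunk of streamed data")
    except BrokenPipeError as exc:
        return type(exc).__name__
    finally:
        os.close(write_fd)
    return None

print(write_after_reader_closed())  # prints "BrokenPipeError" on POSIX systems
```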
Network outages, including outages caused by traffic congestion, can cause Broken pipe errors. In most C*/DSE use cases, broken pipes occur when an outage happens while a node is being replaced, bootstrapped, or rebuilt.
The best course of action is to identify these outages first, take steps to prevent them from recurring, and then retry the intended node operation.
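One simple way to look for connectivity problems between nodes is to repeatedly open a TCP connection to the peer's inter-node port. The sketch below is an assumption-laden example rather than a supported diagnostic: the function name `check_port` and its parameters are hypothetical, and port 7000 is assumed because it is the default `storage_port` in `cassandra.yaml`; clusters that use a different port or encrypted inter-node traffic need a different value.

```python
import socket
import time

def check_port(host, port=7000, attempts=3, timeout=2.0, delay=1.0):
    """Return how many of `attempts` TCP connections to host:port succeeded."""
    succeeded = 0
    for attempt in range(attempts):
        try:
            # Open and immediately close a connection; only reachability is tested.
            with socket.create_connection((host, port), timeout=timeout):
                succeeded += 1
        except OSError:
            # Covers refused connections, timeouts, and unreachable hosts.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return succeeded

# Example (address taken from the log excerpt in this document):
# ok = check_port("10.192.148.41", attempts=5)
# print(f"{ok}/5 connection attempts succeeded")
```

A result lower than the number of attempts hints at intermittent connectivity, although a successful TCP connection alone does not prove that large streaming transfers will complete.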
Solution
If the network disconnect is temporary or is caused by congested traffic, the intended operation, such as decommissioning or replacing a node, can be retried immediately or during off-peak hours. If retries still fail while the network is up and running, perform a rolling restart of the cluster and retry.
If an outage occurred while a node was bootstrapping and that earlier outage is causing the current broken pipe connectivity issue, re-bootstrap the node to ensure that the nodes are fully connected.
Document Location
Worldwide
Historical Number
ka0Ui0000000M5NIAU
Document Information
Modified date:
30 January 2026
UID
ibm17258639