IBM Support

Kafka volume corruption on Gluster storage - recreating the volume

Troubleshooting


Problem

After a storage outage, the Kafka pod crashes repeatedly, which causes other pods to fail and makes the Turbonomic UI unavailable.

Symptom

The Kafka logs show the following error messages:
 
ERROR Error while writing to checkpoint file /home/kafka/data/replication-offset-checkpoint

java.io.FileNotFoundException: /home/kafka/data/replication-offset-checkpoint.tmp (Stale file handle)
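
To check whether a saved copy of the Kafka log contains this symptom, a small filter along the following lines may help. This is only a sketch: the function name `find_symptom_lines` and the pattern list are illustrative assumptions, not part of any Turbonomic or Kafka tooling, and the patterns are taken directly from the two messages quoted above.

```python
import re

# Patterns based on the two messages quoted in the Symptom section.
# "Stale file handle" is matched as a substring, so it also matches
# the "Stale file handler" spelling.
SYMPTOM_PATTERNS = [
    re.compile(r"Error while writing to checkpoint file"),
    re.compile(r"Stale file handle"),
]


def find_symptom_lines(log_text):
    """Return the log lines that match any of the symptom patterns."""
    return [
        line
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in SYMPTOM_PATTERNS)
    ]
```

The function takes the log contents as a string, so it can be applied to text read from a file that holds the Kafka pod output. An empty result suggests that the crash has a different cause than the one described here.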

Document Location

Worldwide

Line of Business: Automation Platform
Business Unit: IBM Software
Product: IBM Turbonomic Application Resource Management
Category: System
Case Number: TS013073419
Platform: Platform Independent
Version: All Versions


Document Information

Modified date:
12 June 2023

UID

ibm17002727