BMYSS1005E
Heartbeats are missing from the site, which could indicate network, resource, or configuration issues.
Severity
Error
User action
All nodes are missing heartbeats and cannot be reached over TCP. Any of the
following actions might resolve the problem:
- Check network connectivity between the sites.
- Run the following command to check the status of the IBM Storage Scale pods on the other site:
    oc get pods -n ibm-spectrum-scale-operator
  If the pods are down, check whether the entire site has lost power.
- Check the OpenShift® Container Platform cluster to see whether worker nodes
  are down due to hardware failure or power loss:
    oc get nodes
  Check the operator logs to see why pods do not start. For guidance on how to look at the operator logs, see Debugging the IBM Spectrum Scale operator.
- If pods still do not start, check the OpenShift Container Platform dashboard to see whether nodes have run out of resources. If pods are in tuning state, you do not need to check resources.
- Check whether there are physical network problems, for example a broken switch or Submariner issues. If there are physical network problems, contact your networking team.
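The pod and node checks above can be narrowed to the entries that need attention. The following is a minimal sketch, not part of the product: the helper names not_running_pods and not_ready_nodes are hypothetical, and the sketch assumes the default table output of oc get, where STATUS is the third column for pods and the second column for nodes.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with IBM Storage Scale or OpenShift).
# Both read the default table output of `oc get` on standard input.

# Print the names of pods whose STATUS is neither Running nor Completed.
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
not_running_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1 }'
}

# Print the names of nodes whose STATUS does not begin with "Ready"
# (for example NotReady). Assumes the default columns:
# NAME STATUS ROLES AGE VERSION.
not_ready_nodes() {
  awk 'NR > 1 && $2 !~ /^Ready/ { print $1 }'
}

# Usage, run against the affected cluster:
#   oc get pods -n ibm-spectrum-scale-operator | not_running_pods
#   oc get nodes | not_ready_nodes
```

Empty output from both helpers suggests that pods and nodes are up, in which case the remaining candidates are the network checks.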
For reference, the equivalent IBM Storage Scale error code is site_missing_heartbeats.