BMYSS1005E

Heartbeats are missing from the site, which could indicate network, resource, or configuration issues.

Severity

Error

User action

All nodes at the site are missing heartbeats and cannot be reached over TCP. Any of the following actions might resolve the problem:
  • Check network connectivity between the sites.
  • Run the following command to check the status of the IBM Storage Scale pods on the other site:
    oc get pods -n ibm-spectrum-scale-operator
    If the pods are down, check whether the entire site has lost power.
  • Check the OpenShift® Container Platform cluster to see whether worker nodes are down because of a hardware failure or a power outage:
    oc get nodes
    Check the operator logs to see why the pods do not start. For guidance on viewing the operator logs, see Debugging the IBM Spectrum Scale operator.
  • If the pods still do not start, check the OpenShift Container Platform dashboard to see whether the nodes have run out of resources. If the pods are in tuning state, you do not have to check their resources.
  • Check whether there are physical network problems, for example, a broken switch or Submariner issues. If you find physical network problems, contact your networking team.
If none of the above steps resolves the issue, collect the storage logs and contact IBM Support.
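
The pod and node checks above can be narrowed to only the entries that need attention. The following is a minimal sketch, not part of the product: the helper names pods_not_running and nodes_not_ready are hypothetical, and the filters assume the default column layout of oc get pods (NAME READY STATUS RESTARTS AGE) and oc get nodes (NAME STATUS ROLES AGE VERSION) when run with --no-headers.

```shell
#!/bin/sh
# Hypothetical helper: print the names of pods whose STATUS column is
# neither Running nor Completed.
# Assumes `oc get pods --no-headers` output on stdin:
#   NAME READY STATUS RESTARTS AGE
pods_not_running() {
    awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Hypothetical helper: print the names of nodes whose STATUS column does
# not start with "Ready" (so "Ready,SchedulingDisabled" is not reported).
# Assumes `oc get nodes --no-headers` output on stdin:
#   NAME STATUS ROLES AGE VERSION
nodes_not_ready() {
    awk '$2 !~ /^Ready/ { print $1 }'
}
```

With the helpers loaded in a shell that is logged in to the cluster, the output of the two commands from the list can be piped through them, for example oc get pods -n ibm-spectrum-scale-operator --no-headers | pods_not_running and oc get nodes --no-headers | nodes_not_ready. Empty output means that every pod or node passed the filter.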

For reference, the equivalent IBM Storage Scale error code is site_missing_heartbeats.