Issues related to IBM Fusion HCI System node drains

Use these common troubleshooting tips and tricks when you work with IBM Fusion HCI System.

Issues related to draining an IBM Fusion HCI System node

Pod Disruption Budget (PDB) limits the number of pods of a replicated application that are down simultaneously from voluntary disruptions. Scale uses PDBs to ensure that storage does not get corrupted and that a minimum number of nodes remains available. You might experience delays or failures in node drain or node restart during any of the following operations:
Note: You must never reboot a node before it is drained successfully. Always perform node power operations from the IBM Fusion HCI System user interface because it ensures a successful node drain.
  • Configuration updates
    • Red Hat® OpenShift® Machine Config Operator (MCO)
    • IBM Fusion component specification changes
  • Upgrade
    • OpenShift Container Platform upgrades
    • IBM Fusion software upgrades
    • Firmware upgrades
  • User Initiated
    • Maintenance operation
    • kdump enabling/disabling
    • Operations that result in node restart
    • Node drain
Cause
Scale sets the PDB configuration to control node drains so that the Scale cluster always remains healthy; only an allowed number of nodes can be drained at any time. For more details, see Scale behavior during node restarts.
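For example, to see the PDBs in the cluster and how many voluntary disruptions each one currently allows, you can list them across all namespaces with a standard oc command:
  oc get pdb -A
The ALLOWED DISRUPTIONS column shows whether an eviction is currently permitted for the pods that a PDB protects.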
Resolution
When a node drain hangs due to any of the previously listed operations, complete the following steps:
  1. Run the following command to identify the pods that cause the issue.
    oc describe cmt <target node name> -n ibm-spectrum-fusion-ns
    The command lists the pods that are pending eviction.
  2. Go through the events of the node that has the issue (see the example commands after this procedure for checking node events and the remaining pods):
    Scenario where drains are prevented by an application
    If drains are prevented by an application, consult the owner of the application to proceed with the drain, and then manually drain the node. For more information about the issue and how to troubleshoot it, see Identifying applications preventing cluster maintenance.
    Scenario where drains are prevented by a VM
    Check Identifying applications preventing cluster maintenance to determine the node that is waiting for a reboot, and check whether live migration is set up properly to allow the VM to migrate to a different node. For more details about setting up live migration, see Virtual machine live migration.
    Scenarios where the issue is due to node maintenance
    • Maintenance mode on a node can take from 4 to 30 minutes to succeed. If the maintenance mode operation takes more than 30 minutes, Fusion continues to retry draining the pods with a warning event BMYCO0012 and eventually times out. If you want to stop the retries, delete the Computemaintenance CR instance.
      Run the following command to delete the Computemaintenance CR instance.
      oc delete cmt <instance-nodename> -n ibm-spectrum-fusion-ns
    • If it takes a long time and you want to continue the operation, check the pods that are pending by using the oc get pods command. If the pod name belongs to IBM Storage Scale, see Identifying applications preventing cluster maintenance. If the issue persists, collect the Scale system health and Scale logs, and contact IBM Support.
    If you want to understand Scale behavior during node reboots, see the Scale behavior during node restarts section.
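The following example commands can help with the checks that are described in the previous steps. They are standard oc commands; the node name is a placeholder for the node that is being drained.
  oc get cmt -n ibm-spectrum-fusion-ns
  oc describe node <target node name>
  oc get pods -A -o wide --field-selector spec.nodeName=<target node name>
The first command lists the Computemaintenance CR instances, the second command shows the recent events for the node, and the third command lists the pods that are still running on the node, which can help you identify the application that blocks the drain.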

Scale behavior during node restarts

A node restart can happen for various reasons, such as an MCO rollout, a user-initiated operation, or a firmware or software upgrade. Scale has a maximum tolerance for the number of nodes that can be unavailable or in a not-ready state, which is based on the erasure code that is configured for the storage cluster. To stay within this tolerance, Scale uses the Pod Disruption Budget (PDB) feature of OpenShift.
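For example, to check how many nodes are currently unavailable or not ready, a standard node status query is usually sufficient:
  oc get nodes
Nodes that report a status other than Ready count against this tolerance.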

Scale CNSA implements the PDB with maxUnavailable=0, which means that it does not allow a Scale pod to go down without its knowledge. Even with a maxUnavailable=0 PDB, the design still allows cluster updates. In this configuration, MCO is prevented from taking down the node and draining the CNSA core pod directly. Instead, the Scale CNSA pod detects the operation, drains the applications, and then exits on its own, which frees up the node for the operation to continue.
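If you want to confirm the maxUnavailable setting on the Scale PDB, you can print it directly. The ibm-spectrum-scale namespace is the typical namespace for Scale CNSA; adjust it if your deployment uses a different one.
  oc get pdb -n ibm-spectrum-scale -o custom-columns=NAME:.metadata.name,MAX-UNAVAILABLE:.spec.maxUnavailable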

If any application refuses to unmount, or if the Scale core pod itself determines that the cluster would go into a bad or hung state if it went down, the operation pauses until the condition is resolved.

For more information about identifying applications that prevent cluster maintenance, see Identifying applications preventing cluster maintenance.