Cleaning up localhost node from the IBM Storage Fusion HCI System cluster

Follow the instructions to clean up the localhost node from the IBM Storage Fusion HCI System cluster.

About this task

The localhost node must not be added to the OpenShift® cluster, as it creates an issue at a later stage. A couple of OpenShift objects are created when the localhost node is added to the IBM Storage Fusion HCI System cluster.


Follow the steps to clean up the localhost node so that the node addition can be retried after you resolve the DHCP or DNS misconfiguration issues.
  1. Run the following command to edit the ComputeProvisionWorker CR of the compute node that reported the error message "This node might contain an invalid hostname (localhost)".
    # oc -n ibm-spectrum-fusion-ns edit cpw provisionworker-compute-1-ru5
    In the CR, set the location field to an empty string so that the ComputeProvisionWorker controller does not interfere while you clean up the objects of the compute node.
    # oc -n ibm-spectrum-fusion-ns edit cpw provisionworker-compute-1-ru5
      location: ""
  2. Delete the respective machine object of the compute node.

    You need to identify the machine object first and then mark it with an annotation. Then, scale down the machineset object to delete the machine object.

    1. Run the following command to get the machine object name from the BMH object.
      # oc -n openshift-machine-api get bmh,machine
      Example output:
      # oc -n openshift-machine-api get bmh,machine
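      NAME                                    STATE                    CONSUMER                           ONLINE   ERROR   AGE
      baremetalhost.metal3.io/compute-1-ru5   provisioned              isf-rackae6-42ps4-worker-0-r6fmw   true             22h
      baremetalhost.metal3.io/compute-1-ru6   provisioned              isf-rackae6-42ps4-worker-0-t922n   true             22h
      baremetalhost.metal3.io/compute-1-ru7   provisioned              isf-rackae6-42ps4-worker-0-g842m   true             22h
      baremetalhost.metal3.io/control-1-ru2   externally provisioned   isf-rackae6-42ps4-master-0         true             44h
      baremetalhost.metal3.io/control-1-ru3   externally provisioned   isf-rackae6-42ps4-master-1         true             44h
      baremetalhost.metal3.io/control-1-ru4   externally provisioned   isf-rackae6-42ps4-master-2         true             44h

      NAME                                                            PHASE     TYPE   REGION   ZONE   AGE
      machine.machine.openshift.io/isf-rackae6-42ps4-master-0         Running                          44h
      machine.machine.openshift.io/isf-rackae6-42ps4-master-1         Running                          44h
      machine.machine.openshift.io/isf-rackae6-42ps4-master-2         Running                          44h
      machine.machine.openshift.io/isf-rackae6-42ps4-worker-0-g842m   Running                          22h
      machine.machine.openshift.io/isf-rackae6-42ps4-worker-0-r6fmw   Running                          22h
      machine.machine.openshift.io/isf-rackae6-42ps4-worker-0-t922n   Running                          22h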
      The BMH object compute-1-ru5 maps to the machine object isf-rackae6-42ps4-worker-0-r6fmw (see the CONSUMER column).
    2. Mark the machine object for deletion with the special annotation delete-me. The annotation overrides the machine deletion policy rule so that this machine is the one removed when the machineset is scaled down.
      # oc -n openshift-machine-api edit machine isf-rackae6-42ps4-worker-0-r6fmw
      kind: Machine
    3. Scale down the machineset object by one replica to initiate the deletion of the machine object.
      Note: After the machineset scale down is performed, the machine objects corresponding to compute-1-ru5 are cleaned up, and the status of the BMH object corresponding to compute-1-ru5 changes to deprovisioning.
      # oc -n openshift-machine-api get machineset
      NAME                         DESIRED   CURRENT   READY   AVAILABLE   AGE
      isf-rackae6-cltp4-worker-0   3         3         3       3           8h
      # oc -n openshift-machine-api get machineset -oyaml | grep replicas
       replicas: 3
      # oc -n openshift-machine-api scale --replicas=<old replica value - 1> machineset <machine set name>
      machineset.machine.openshift.io/<machine set name> scaled
    4. After the machineset scale down completes successfully, the status of the BMH object corresponding to compute-1-ru5 changes to ready.
      Important: This activity might take a few minutes. Wait for the change to be reflected on the BMH object before you proceed with the further steps.
  3. Delete the node object of the compute node (localhost) from the OpenShift Container Platform cluster, if it is present.
    # oc get nodes
    NAME                                            STATUS   ROLES           AGE   VERSION
    <node name>                                     Ready    worker          21h   v1.23.17+16bcd69
    <node name>                                     Ready    worker          21h   v1.23.17+16bcd69
    <node name>                                     Ready    master,worker   43h   v1.23.17+16bcd69
    <node name>                                     Ready    master,worker   43h   v1.23.17+16bcd69
    <node name>                                     Ready    master,worker   43h   v1.23.17+16bcd69
    localhost                                       Ready    worker          20h   v1.23.17+16bcd69
    # oc delete node localhost 
  4. Delete the BMH object of the compute node.
    # oc -n openshift-machine-api get bmh
    NAME            STATE                    CONSUMER                           ONLINE   ERROR   AGE
    compute-1-ru5   available                                                   false            24h
    compute-1-ru6   provisioned              isf-rackae6-42ps4-worker-0-t922n   true             24h
    compute-1-ru7   provisioned              isf-rackae6-42ps4-worker-0-g842m   true             24h
    control-1-ru2   externally provisioned   isf-rackae6-42ps4-master-0         true             46h
    control-1-ru3   externally provisioned   isf-rackae6-42ps4-master-1         true             46h
    control-1-ru4   externally provisioned   isf-rackae6-42ps4-master-2         true             46h
    # oc -n openshift-machine-api delete bmh compute-1-ru5
    baremetalhost.metal3.io "compute-1-ru5" deleted
  5. Delete all pending CertificateSigningRequests corresponding to the localhost node.
    # for i in $(oc get csr --no-headers | grep -i system:node:localhost | grep -i pending | awk '{ print $1 }'); do oc delete csr "$i"; done
  6. Fix the DNS or DHCP issue so that the corresponding compute node gets its correct hostname instead of localhost.
  7. Delete the ComputeProvisionWorker object of the compute node.
    # oc -n ibm-spectrum-fusion-ns get cpw 
    NAME                            AGE
    provisionworker-compute-1-ru5   24h
    provisionworker-compute-1-ru6   24h
    provisionworker-compute-1-ru7   24h
    # oc -n ibm-spectrum-fusion-ns delete cpw provisionworker-compute-1-ru5
    "provisionworker-compute-1-ru5" deleted
  8. If the issue occurs during installation at the time of OpenShift configuration, the node conversion resumes automatically. If the issue occurs during the node upsize, then retry the node addition.
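
The waiting and CSR cleanup parts of the steps above can be scripted. The following is a minimal sketch, not part of the product: the function names pending_localhost_csrs and wait_for_value are hypothetical helpers, and the jsonpath field .status.provisioning.state used in the usage comment is an assumption about the BMH object that you should verify on your cluster. The expected state value (ready or available) should match what `oc get bmh` shows in your environment.

```shell
#!/bin/sh
# Sketch only: hypothetical helper functions, not part of oc or IBM Storage Fusion.

# Print the names of pending CSRs that belong to the localhost node.
# Reads the output of `oc get csr --no-headers` on stdin and applies the
# same case-insensitive filters as the grep/awk pipeline in step 5.
pending_localhost_csrs() {
    awk 'tolower($0) ~ /system:node:localhost/ && tolower($0) ~ /pending/ { print $1 }'
}

# Poll a command until it prints the expected value or the attempts run out.
# Usage: wait_for_value <expected> <attempts> <sleep seconds> <command...>
# Returns 0 when the value is seen, 1 otherwise.
wait_for_value() {
    expected=$1
    attempts=$2
    delay=$3
    shift 3
    i=0
    while [ "$i" -lt "$attempts" ]; do
        actual=$("$@" 2>/dev/null)
        if [ "$actual" = "$expected" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example usage (commented out; requires a logged-in oc session):
#
# Wait up to 10 minutes for the BMH object to leave deprovisioning
# (assumed field path; adjust the expected state to your cluster):
#   wait_for_value ready 60 10 \
#       oc -n openshift-machine-api get bmh compute-1-ru5 \
#       -o jsonpath='{.status.provisioning.state}'
#
# Delete the pending localhost CSRs:
#   oc get csr --no-headers | pending_localhost_csrs | while read -r csr; do
#       oc delete csr "$csr"
#   done
```

Because pending_localhost_csrs reads from standard input, you can check the filter against saved `oc get csr` output before you delete anything.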