Replacing a failed node on VMware installer-provisioned infrastructure

Use this procedure to replace a failed node on VMware installer-provisioned infrastructure.

Before you begin

  • Ensure that the replacement nodes are configured with similar infrastructure, resources, and disks to the node that you replace.
  • You must be logged into the OpenShift Container Platform cluster.

Procedure

  1. Log in to the OpenShift Web Console, and click Compute > Nodes.
  2. Identify the faulty node that you need to replace.
    Take note of its Machine Name.
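    If you prefer the command line, the Machine name can also be read from the node's machine annotation. This is a sketch that assumes the standard machine.openshift.io/machine annotation is set on the node:
    oc get node <node_name> -o jsonpath='{.metadata.annotations.machine\.openshift\.io/machine}'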
  3. Get the labels on the node that you need to replace, where <node_name> specifies the name of the node that you need to replace.
    oc get nodes --show-labels | grep <node_name>
  4. Identify the monitor pod (mon) (if any), and OSDs that are running in the node that you need to replace.
    oc get pods -n openshift-storage -o wide | grep -i <node_name>
  5. Scale down the deployments of the pods identified in the previous step. In the following commands, rook-ceph-mon-c and rook-ceph-osd-0 are example names; substitute the deployment names that correspond to your output.
    oc scale deployment rook-ceph-mon-c --replicas=0 -n openshift-storage
    oc scale deployment rook-ceph-osd-0 --replicas=0 -n openshift-storage
    oc scale deployment --selector=app=rook-ceph-crashcollector,node_name=<node_name> --replicas=0 -n openshift-storage
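    To confirm the scale-down, you can re-run the pod listing from the previous step and verify that the mon and OSD pods on the node have terminated (a sketch):
    oc get pods -n openshift-storage -o wide | grep -i <node_name>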
  6. Mark the node as unschedulable.
    oc adm cordon <node_name>
  7. Remove the pods which are in Terminating state.
    oc get pods -A -o wide | grep -i <node_name> | awk '{if ($4 == "Terminating") system ("oc -n " $1 " delete pods " $2 " --grace-period=0 --force")}'
  8. Drain the node.
    oc adm drain <node_name> --force --delete-emptydir-data=true --ignore-daemonsets
  9. Click Compute > Machines and search for the required machine.
  10. For the required machine, click Action menu > Delete Machine.
  11. Click Delete to confirm that the machine is deleted.
    A new machine is automatically created.
  12. Wait for the new machine to start and transition into Running state.
    Important: This activity might take 5 to 10 minutes or more.
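    As a sketch of an equivalent CLI check, assuming the cluster keeps its Machine resources in the standard openshift-machine-api namespace, where <new_machine_name> is a placeholder for the newly created machine:
    oc get machine <new_machine_name> -n openshift-machine-api -o wide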
  13. Go to Compute > Nodes and confirm that the new node is in a Ready state.
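    Alternatively, a sketch of a CLI wait, where <new_node_name> is the name of the new node:
    oc wait --for=condition=Ready node/<new_node_name> --timeout=15m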
  14. Physically add a new device to the node.
  15. Apply the Fusion Data Foundation label to the new node using one of the following steps:
    From the user interface
    1. Go to Action Menu > Edit Labels.
    2. Add cluster.ocs.openshift.io/openshift-storage, and click Save.
    From the command-line interface
    Apply the Fusion Data Foundation label to the new node, where <new_node_name> specifies the name of the new node:
    oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""
  16. Identify the namespace where the OpenShift Local Storage Operator is installed, and assign it to the local_storage_project variable.
    local_storage_project=$(oc get csv --all-namespaces | awk '{print $1}' | grep local)
    For example:
    local_storage_project=$(oc get csv --all-namespaces | awk '{print $1}' | grep local)
    echo $local_storage_project
    Example output:
    openshift-local-storage
  17. Add a new worker node to the localVolumeDiscovery and localVolumeSet.
    1. Update the localVolumeDiscovery definition to include the new node, and remove the failed node.
      oc edit -n $local_storage_project localvolumediscovery auto-discover-devices
      Example output, where server3.example.com is removed, and newnode.example.com is the new node:
      [...]
         nodeSelector:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                  - server1.example.com
                  - server2.example.com
                  #- server3.example.com
                  - newnode.example.com
      [...]
      Remember: Save before exiting the editor.
    2. Determine the localVolumeSet to edit.
      oc get -n $local_storage_project localvolumeset
      Example output:
      NAME          AGE
      localblock   25h
    3. Update the localVolumeSet definition to include the new node, and remove the failed node.
      oc edit -n $local_storage_project localvolumeset localblock
      Example output, where server3.example.com is removed, and newnode.example.com is the new node:
      [...]
         nodeSelector:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                  - server1.example.com
                  - server2.example.com
                  #- server3.example.com
                  - newnode.example.com
      [...]
      Remember: Save before exiting the editor.
  18. Verify that the new localblock Persistent Volume (PV) is available.
    oc get pv | grep localblock | grep Available
    local-pv-551d950   512Gi   RWO    Delete   Available   localblock   26s
  19. Navigate to the openshift-storage project.
    oc project openshift-storage
  20. Remove the failed OSD from the cluster.
    Multiple failed OSDs can be specified, if required.
    oc process -n openshift-storage ocs-osd-removal \
    -p FAILED_OSD_IDS=<failed_osd_id> FORCE_OSD_REMOVAL=false | oc create -f -
    <failed_osd_id> is the integer in the pod name immediately after the rook-ceph-osd prefix.

    You can add comma separated OSD IDs in the command to remove more than one OSD, for example, FAILED_OSD_IDS=0,1,2.

    The FORCE_OSD_REMOVAL value must be changed to true in clusters that only have three OSDs, or clusters with insufficient space to restore all three replicas of the data after the OSD is removed.
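    For example, a sketch that removes OSDs 0, 1, and 2 and forces the removal, based on the parameters described above:
    oc process -n openshift-storage ocs-osd-removal \
    -p FAILED_OSD_IDS=0,1,2 FORCE_OSD_REMOVAL=true | oc create -f -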

  21. Verify that the OSD was removed successfully by checking the status of the ocs-osd-removal-job pod.
    A status of Completed confirms that the OSD removal job succeeded.
    oc get pod -l job-name=ocs-osd-removal-job -n openshift-storage
  22. Ensure that the OSD removal is completed.
    oc logs -l job-name=ocs-osd-removal-job -n openshift-storage --tail=-1 | egrep -i 'completed removal'
    Example output:
    2022-05-10 06:50:04.501511 I | cephosd: completed removal of OSD 0
    Important:

    If the ocs-osd-removal-job fails, and the pod is not in the expected Completed state, check the pod logs for further debugging:

    For example:

    oc logs -l job-name=ocs-osd-removal-job -n openshift-storage --tail=-1
  23. Identify the Persistent Volume (PV) associated with the Persistent Volume Claim (PVC).
    oc get pv -L kubernetes.io/hostname | grep localblock | grep Released
    Example output:
    local-pv-d6bf175b  1490Gi  RWO  Delete  Released  openshift-storage/ocs-deviceset-0-data-0-6c5pw  localblock  2d22h  compute-1
    If there is a PV in Released state, delete it:
    oc delete pv <persistent_volume>
    For example:
    oc delete pv local-pv-d6bf175b
    Example output:
    persistentvolume "local-pv-d9c5cbd6" deleted
  24. Identify the crashcollector pod deployment.
    oc get deployment --selector=app=rook-ceph-crashcollector,node_name=<failed_node_name> -n openshift-storage
    If there is an existing crashcollector pod deployment, delete it.
    oc delete deployment --selector=app=rook-ceph-crashcollector,node_name=<failed_node_name> -n openshift-storage
  25. Delete the ocs-osd-removal-job.
    oc delete -n openshift-storage job ocs-osd-removal-job
    Example output:
    job.batch "ocs-osd-removal-job" deleted

What to do next

  1. Run the following command and verify that the new node is present in the output:
    oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= |cut -d' ' -f1
  2. Click Workloads > Pods and confirm that at least the following pods on the new node are in a Running state; a CLI check is sketched after this list:
    • csi-cephfsplugin-*
    • csi-rbdplugin-*
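    A sketch of an equivalent CLI check, where <new_node_name> is the name of the new node:
    oc get pods -n openshift-storage -o wide | grep <new_node_name> | egrep 'csi-cephfsplugin|csi-rbdplugin'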
  3. Verify that all the other required Fusion Data Foundation pods are in Running state.
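    As a quick sketch, any pod in the openshift-storage namespace that is not Running or Completed needs attention:
    oc get pods -n openshift-storage --no-headers | egrep -v 'Running|Completed'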
  4. Ensure that the new incremental mon is created, and is in the Running state:
    oc get pod -n openshift-storage | grep mon

    Example output:

    rook-ceph-mon-a-cd575c89b-b6k66         2/2     Running     0          38m
    rook-ceph-mon-b-6776bc469b-tzzt8        2/2     Running     0          38m
    rook-ceph-mon-d-5ff5d488b5-7v8xh        2/2     Running     0          4m8s

    OSD and monitor pods might take several minutes to get to the Running state.

  5. Verify that the new Object Storage Device (OSD) pods are running on the replacement node:
    oc get pods -o wide -n openshift-storage | egrep -i <new_node_name> | egrep osd
  6. If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.

    For each of the new nodes identified in the previous step, do the following:

    1. Create a debug pod and open a chroot environment for the one or more selected hosts:
      oc debug node/<node_name>
      chroot /host
    2. Display the list of available block devices by using the lsblk command:
      lsblk

      Check for the crypt keyword beside the ocs-deviceset names.
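      As a quick filter, a sketch assuming the host's lsblk supports the -o option, you can list only the encrypted (crypt) entries:
      lsblk -o NAME,TYPE | grep crypt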

  7. If the verification steps fail, contact IBM Support.