Replacing a failed node on IBM Z user-provisioned infrastructure

Use this procedure if you need to replace a failed node and configure its replacement.

Before you begin

  • IBM recommends that replacement nodes be configured with infrastructure, resources, and disks similar to the node being replaced.
  • You must be logged in to the Red Hat OpenShift Container Platform (RHOCP) cluster.

Procedure

  1. Identify the node to be replaced and get its labels. Make a note of the rack label.
    oc get nodes --show-labels | grep <node_name>
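    Optionally, if the full --show-labels output is hard to scan, you can print just that node's labels as a map; this is a convenience check rather than part of the original procedure, and the exact rack label key depends on your topology configuration:

    oc get node <node_name> -o jsonpath='{.metadata.labels}'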
  2. Identify the mon (if any) and object storage device (OSD) pods that are running in the node to be replaced.
    oc get pods -n openshift-storage -o wide | grep -i <node_name>
  3. Scale down the deployments of the pods identified in the previous step.

    For example:

    oc scale deployment rook-ceph-mon-c --replicas=0 -n openshift-storage
    oc scale deployment rook-ceph-osd-0 --replicas=0 -n openshift-storage
    oc scale deployment --selector=app=rook-ceph-crashcollector,node_name=<node_name> --replicas=0 -n openshift-storage
  4. Mark the node as unschedulable.
    oc adm cordon <node_name>
  5. Remove the pods that are in Terminating state.
    oc get pods -A -o wide | grep -i <node_name> | awk '{if ($4 == "Terminating") system ("oc -n " $1 " delete pods " $2 " --grace-period=0 --force")}'
  6. Drain the node.
    oc adm drain <node_name> --force --delete-emptydir-data=true --ignore-daemonsets
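    Optionally, you can confirm that only DaemonSet-managed pods remain on the node after the drain. This quick check is not part of the original steps:

    oc get pods -A -o wide | grep -i <node_name>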
  7. Delete the node.
    oc delete node <node_name>
  8. Get a new IBM Z infrastructure machine with the required infrastructure.
  9. Create a new OpenShift Container Platform node using the new IBM Z infrastructure machine.
  10. Check for certificate signing requests (CSRs) related to Fusion Data Foundation that are in Pending state:
    oc get csr
  11. Approve all required Fusion Data Foundation CSRs for the new node:
    oc adm certificate approve <Certificate_Name>
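    If many CSRs are pending, you can optionally approve every CSR that does not yet have a status in one pass. This one-liner is a convenience, not part of the original procedure:

    oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve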
  12. Click Compute > Nodes in the OpenShift Web Console and confirm that the new node is in Ready state.
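    Alternatively, from the command line you can wait for the node to report Ready; this is an optional check using the new node name as a placeholder:

    oc wait --for=condition=Ready node/<new_node_name> --timeout=300s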
  13. Apply the Fusion Data Foundation label to the new node using one of the following methods:
    From the user interface
    1. For the new node, click Action Menu > Edit Labels

    2. Add cluster.ocs.openshift.io/openshift-storage and click Save.

    From the command line interface
    • Execute the following command to apply the Fusion Data Foundation label to the new node:

    oc label node <new_node_name> cluster.ocs.openshift.io/openshift-storage=""
  14. Add a new worker node to localVolumeDiscovery and localVolumeSet.
    1. Update the localVolumeDiscovery definition to include the new node and remove the failed node.
       oc edit -n local-storage-project localvolumediscovery auto-discover-devices
      [...]
         nodeSelector:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                  - server1.example.com
                  - server2.example.com
                  #- server3.example.com
                  - newnode.example.com
      [...]

      Remember to save before exiting the editor.

      In the above example, server3.example.com was removed and newnode.example.com is the new node.

    2. Determine which localVolumeSet to edit.
      Replace local-storage-project in the following commands with the name of your local storage project. The default project name is openshift-local-storage in Fusion Data Foundation 4.6 and later. Previous versions use local-storage by default.
       oc get -n local-storage-project localvolumeset
      NAME          AGE
      localblock   25h
    3. Update the localVolumeSet definition to include the new node and remove the failed node.
       oc edit -n local-storage-project localvolumeset localblock
      [...]
         nodeSelector:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                  - server1.example.com
                  - server2.example.com
                  #- server3.example.com
                  - newnode.example.com
      [...]

      Remember to save before exiting the editor.

      In the above example, server3.example.com was removed and newnode.example.com is the new node.
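    If you prefer a non-interactive update over oc edit, a JSON patch can append the new node to the values list shown above. This is a sketch that assumes the nodeSelector structure in the example; adjust the array indexes (and remove the failed node entry the same way) to match your resource, and apply the same approach to the localVolumeDiscovery definition:

    # Append the new node to the first matchExpressions entry of the localVolumeSet
    oc patch -n local-storage-project localvolumeset localblock --type json \
      -p '[{"op": "add", "path": "/spec/nodeSelector/nodeSelectorTerms/0/matchExpressions/0/values/-", "value": "newnode.example.com"}]'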

  15. Verify that the new localblock PV is available.
     oc get pv | grep localblock
    NAME               CAPACITY  ACCESS MODES  RECLAIM POLICY  STATUS     CLAIM                                       STORAGECLASS  AGE
    local-pv-3e8964d3  931Gi     RWO           Delete          Bound      openshift-storage/ocs-deviceset-2-0-79j94   localblock    25h
    local-pv-414755e0  931Gi     RWO           Delete          Bound      openshift-storage/ocs-deviceset-1-0-959rp   localblock    25h
    local-pv-b481410   931Gi     RWO           Delete          Available                                              localblock    3m24s
    local-pv-d9c5cbd6  931Gi     RWO           Delete          Bound      openshift-storage/ocs-deviceset-0-0-nvs68   localblock    25h
  16. Change to the openshift-storage project.
    oc project openshift-storage
  17. Remove the failed OSD from the cluster. You can specify multiple failed OSDs if required.
    oc process -n openshift-storage ocs-osd-removal \
    -p FAILED_OSD_IDS=failed-osd-id1,failed-osd-id2 | oc create -f -
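    For example, to remove OSD 0 (the OSD whose deployment was scaled down in step 3):

    oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=0 | oc create -f -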
  18. Verify that the OSD was removed successfully by checking the status of the ocs-osd-removal pod.

    A status of Completed confirms that the OSD removal job succeeded.

    oc get pod -l job-name=ocs-osd-removal-job -n openshift-storage
  19. Ensure that the OSD removal is completed.
    oc logs -l job-name=ocs-osd-removal-job -n openshift-storage --tail=-1 | egrep -i 'completed removal'

    Example output:

    2022-05-10 06:50:04.501511 I | cephosd: completed removal of OSD 0
    Important:

    If ocs-osd-removal fails and the pod is not in the expected Completed state, check the pod logs for further debugging.

    For example:

    oc logs -l job-name=ocs-osd-removal-job -n openshift-storage --tail=-1
  20. Delete the PV associated with the failed node.
    1. Identify the PV associated with the PVC.
       oc get pv -L kubernetes.io/hostname | grep localblock | grep Released
      local-pv-d6bf175b  1490Gi  RWO  Delete  Released  openshift-storage/ocs-deviceset-0-data-0-6c5pw  localblock  2d22h  compute-1
    2. Delete the PV.
      oc delete pv <persistent-volume>
      For example:
       oc delete pv local-pv-d6bf175b
      persistentvolume "local-pv-d9c5cbd6" deleted
  21. Delete the crashcollector pod deployment.
    oc delete deployment --selector=app=rook-ceph-crashcollector,node_name=failed-node-name -n openshift-storage
  22. Delete the ocs-osd-removal job.
    oc delete job ocs-osd-removal-job

    Example output:

    job.batch "ocs-osd-removal-job" deleted

What to do next

  1. Execute the following command and verify that the new node is present in the output:

    oc get nodes --show-labels | grep cluster.ocs.openshift.io/openshift-storage= | cut -d' ' -f1
  2. Click Workloads > Pods and confirm that at least the following pods on the new node are in Running state:

    • csi-cephfsplugin-*

    • csi-rbdplugin-*

  3. Verify that all other required Fusion Data Foundation pods are in Running state.

    Make sure that the new incremental mon is created and is in the Running state.

    oc get pod -n openshift-storage | grep mon

    Example output:

    rook-ceph-mon-c-64556f7659-c2ngc                                  1/1     Running     0          6h14m
    rook-ceph-mon-d-7c8b74dc4d-tt6hd                                  1/1     Running     0          4h24m
    rook-ceph-mon-e-57fb8c657-wg5f2                                   1/1     Running     0          162m

    OSD and Mon might take several minutes to get to the Running state.

  4. Verify that new OSD pods are running on the replacement node.

    oc get pods -o wide -n openshift-storage | egrep -i new-node-name | egrep osd
  5. Optional: If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.

    For each of the new nodes identified in the previous step, do the following:

    1. Create a debug pod and open a chroot environment for the selected host(s).

      oc debug node/<node_name>
      chroot /host
    2. Run lsblk and check for the crypt keyword beside the ocs-deviceset names.

      lsblk
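      To narrow the output, you can optionally filter for encrypted device-mapper entries, for example:

      lsblk --output NAME,TYPE,MOUNTPOINT | grep crypt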
  6. If verification steps fail, contact Red Hat Support.