Replacing operational or failed storage devices on clusters backed by local storage devices

Use this information to replace operational or failed storage devices on clusters backed by local storage devices.

Before you begin

  • It is recommended that you configure replacement devices with infrastructure and resources similar to the device being replaced.
  • Ensure that the data is resilient:
    • In the OpenShift Web Console, click Storage > Data Foundation.
    • Click the Storage Systems tab, and then click ocs-storagecluster-storagesystem.
    • In the Status card of the Block and File dashboard, under the Overview tab, verify that Data Resiliency has a green tick mark.
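    You can also check data resiliency from the command line if the rook-ceph tools pod is enabled on your cluster (a quick sketch; the tools deployment is optional, and the app=rook-ceph-tools label is an assumption to verify on your cluster):
    oc -n openshift-storage rsh $(oc -n openshift-storage get pod -l app=rook-ceph-tools -o name) ceph status
    Look for HEALTH_OK and for all placement groups reported as active+clean before you proceed.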

About this task

You can replace an object storage device (OSD) in Fusion Data Foundation deployed using local storage devices on the following infrastructures:

  • Bare metal
  • VMware
  • Red Hat Virtualization
Note: One or more underlying storage devices may need to be replaced.

Procedure

  1. Remove the underlying storage device from the relevant worker node.
  2. Verify that the relevant OSD pod has moved to the CrashLoopBackOff state.
    Identify the OSD that needs to be replaced and the OpenShift Container Platform node that has the OSD scheduled on it.
    oc get -n openshift-storage pods -l app=rook-ceph-osd -o wide
    Example output:
    rook-ceph-osd-0-6d77d6c7c6-m8xj6    0/1    CrashLoopBackOff    0    24h   10.129.0.16   compute-2   <none>           <none>
    rook-ceph-osd-1-85d99fb95f-2svc7    1/1    Running             0    24h   10.128.2.24   compute-0   <none>           <none>
    rook-ceph-osd-2-6c66cdb977-jp542    1/1    Running             0    24h   10.130.0.18   compute-1   <none>           <none>
    In this example, rook-ceph-osd-0-6d77d6c7c6-m8xj6 needs to be replaced and compute-2 is the OpenShift Container Platform node on which the OSD is scheduled.
  3. Scale down the OSD deployment for the OSD to be replaced.
    Each time you want to replace an OSD, update the osd_id_to_remove parameter with the OSD ID and repeat this step.
    $ osd_id_to_remove=0
    oc scale -n openshift-storage deployment rook-ceph-osd-${osd_id_to_remove} --replicas=0
    where osd_id_to_remove is the integer in the pod name immediately after the rook-ceph-osd prefix. In this example, the deployment name is rook-ceph-osd-0.
    Example output:
    deployment.extensions/rook-ceph-osd-0 scaled
  4. Verify that the rook-ceph-osd pod is terminated.
    oc get -n openshift-storage pods -l ceph-osd-id=${osd_id_to_remove}
    Example output:
    No resources found.
    Important: If the rook-ceph-osd pod is stuck in the Terminating state, use the force option to delete the pod.
    oc delete pod rook-ceph-osd-0-6d77d6c7c6-m8xj6 --force --grace-period=0
    Example output:
    warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
      pod "rook-ceph-osd-0-6d77d6c7c6-m8xj6" force deleted
  5. Remove the old OSD from the cluster so that you can add a new OSD.
    1. Delete any old ocs-osd-removal jobs.
      oc delete -n openshift-storage job ocs-osd-removal-job
      Example output:
      job.batch "ocs-osd-removal-job" deleted
    2. Navigate to the openshift-storage project.
      oc project openshift-storage
    3. Remove the old OSD from the cluster.
      oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=${osd_id_to_remove} -p FORCE_OSD_REMOVAL=false | oc create -n openshift-storage -f -

      The FORCE_OSD_REMOVAL value must be changed to true in clusters that only have three OSDs, or clusters with insufficient space to restore all three replicas of the data after the OSD is removed.

      Warning: This step results in the OSD being completely removed from the cluster. Ensure that the correct value of osd_id_to_remove is provided.
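      For example, on a cluster with only three OSDs, the same template invocation with the force flag set looks like this:
      oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=${osd_id_to_remove} -p FORCE_OSD_REMOVAL=true | oc create -n openshift-storage -f -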
  6. Verify that the OSD was removed successfully by checking the status of the ocs-osd-removal-job pod.
    A status of Completed confirms that the OSD removal job succeeded.
    oc get pod -l job-name=ocs-osd-removal-job -n openshift-storage
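    Example output (the pod name suffix and age are illustrative):
    NAME                        READY   STATUS      RESTARTS   AGE
    ocs-osd-removal-job-xxxxx   0/1     Completed   0          2m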
  7. Ensure that the OSD removal is complete.
    oc logs -l job-name=ocs-osd-removal-job -n openshift-storage --tail=-1 | egrep -i 'completed removal'
    Example output:
    2022-05-10 06:50:04.501511 I | cephosd: completed removal of OSD 0
    Important: If the ocs-osd-removal-job pod fails and the pod is not in the expected Completed state, check the pod logs for further debugging.
    For example:
    oc logs -l job-name=ocs-osd-removal-job -n openshift-storage --tail=-1
  8. If encryption was enabled at the time of install, remove the dm-crypt managed device-mapper mapping from the OSD devices that were removed from the respective Fusion Data Foundation nodes.
    1. Get the PVC name(s) of the replaced OSD(s) from the logs of ocs-osd-removal-job pod.
      oc logs -l job-name=ocs-osd-removal-job -n openshift-storage --tail=-1 | egrep -i 'pvc|deviceset'
      Example output:
      2021-05-12 14:31:34.666000 I | cephosd: removing the OSD PVC "ocs-deviceset-xxxx-xxx-xxx-xxx"
    2. For each of the previously identified nodes, do the following:
      1. Create a debug pod and chroot to the host on the storage node, where <node name> is the name of the node.
        oc debug node/<node name>
        $ chroot /host
      2. Find the relevant device name based on the PVC names identified in the previous step, where <pvc name> is the name of the PVC.
        dmsetup ls | grep <pvc name>
        Example output:
        ocs-deviceset-xxx-xxx-xxx-xxx-block-dmcrypt (253:0)
      3. Remove the mapped device, where <ocs-deviceset-name> is the name of the relevant device identified in the previous step.
        $ cryptsetup luksClose --debug --verbose <ocs-deviceset-name>
        Important: If the above command gets stuck due to insufficient privileges, run the following commands:
        1. Press CTRL+Z to exit the above command.
        2. Find the PID of the process which was stuck.
          $ ps -ef | grep crypt
        3. Terminate the process using the kill command.
          kill -9 <PID>

          where <PID> is the process ID.

        4. Verify that the device name is removed.
          $ dmsetup ls
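          The removed <ocs-deviceset-name> entry should no longer appear in the output. To check for it directly, rerun the earlier filter:
          $ dmsetup ls | grep <pvc name>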
  9. Find the persistent volume (PV) that needs to be deleted.
    oc get pv -L kubernetes.io/hostname | grep localblock | grep Released
    Example output:
    local-pv-d6bf175b   1490Gi   RWO   Delete     Released     openshift-storage/ocs-deviceset-0-data-0-6c5pw      localblock      2d22h       compute-1
  10. Delete the PV.
    oc delete pv <pv_name>
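    For example, using the PV name from the output in the previous step:
    oc delete pv local-pv-d6bf175b
    Example output:
    persistentvolume "local-pv-d6bf175b" deleted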
  11. Physically add a new device to the node.
  12. Track the provisioning of PVs for the devices that match the deviceInclusionSpec. It can take a few minutes to provision the PVs.
    oc -n openshift-local-storage describe localvolumeset localblock
    Example output:
    [...]
    Status:
      Conditions:
        Last Transition Time:          2020-11-17T05:03:32Z
        Message:                       DiskMaker: Available, LocalProvisioner: Available
        Status:                        True
        Type:                          DaemonSetsAvailable
        Last Transition Time:          2020-11-17T05:03:34Z
        Message:                       Operator reconciled successfully.
        Status:                        True
        Type:                          Available
      Observed Generation:             1
      Total Provisioned Device Count: 4
    Events:
      Type    Reason                Age                    From                               Message
      ----    ------                ----                   ----                               -------
      Normal  DiscoveredNewDevice   2m30s (x4 over 2m30s)  localvolumeset-symlink-controller  node.example.com - found possible matching disk, waiting 1m to claim
      Normal  FoundMatchingDisk     89s (x4 over 89s)      localvolumeset-symlink-controller  node.example.com - symlinking matching disk

    Once the PV is provisioned, a new OSD pod is automatically created for the PV.
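    To watch for the newly provisioned PV directly, you can reuse the PV listing from earlier in this procedure, this time looking for an Available volume:
    oc get pv -L kubernetes.io/hostname | grep localblock | grep Available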

  13. Delete the ocs-osd-removal job(s).
    oc delete -n openshift-storage job ocs-osd-removal-job
    Example output:
    job.batch "ocs-osd-removal-job" deleted
    Note: When using an external key management system (KMS) with data encryption, the old OSD encryption key can be removed from the Vault server as it is now an orphan key.
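    A minimal sketch of removing the orphan key with the Vault CLI, assuming a KV secrets engine and that the key is stored under the replaced PVC's name (verify both the backend path and the key naming against your Vault configuration):
    vault kv delete <backend_path>/<old-pvc-name>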

What to do next

  1. Verify that there is a new OSD running.
    oc get -n openshift-storage pods -l app=rook-ceph-osd
    Example output:
    rook-ceph-osd-0-5f7f4747d4-snshw   1/1     Running     0          4m47s
    rook-ceph-osd-1-85d99fb95f-2svc7   1/1     Running     0          1d20h
    rook-ceph-osd-2-6c66cdb977-jp542   1/1     Running     0          1d20h
    Important: If the new OSD does not show as Running after a few minutes, restart the rook-ceph-operator pod to force a reconciliation.
    oc delete pod -n openshift-storage -l app=rook-ceph-operator
    Example output:
    pod "rook-ceph-operator-6f74fb5bff-2d982" deleted
  2. Verify that a new PVC is created.
    oc get -n openshift-storage pvc | grep localblock
    Example output:
    ocs-deviceset-0-0-c2mqb   Bound    local-pv-b481410         1490Gi     RWO            localblock                    5m
    ocs-deviceset-1-0-959rp   Bound    local-pv-414755e0        1490Gi     RWO            localblock                    1d20h
    ocs-deviceset-2-0-79j94   Bound    local-pv-3e8964d3        1490Gi     RWO            localblock                    1d20h
  3. If cluster-wide encryption is enabled on the cluster, verify that the new OSD devices are encrypted.
    1. Identify the nodes where the new OSD pods are running.
      oc get -n openshift-storage -o=custom-columns=NODE:.spec.nodeName pod/<OSD-pod-name>
      where <OSD-pod-name> is the name of the OSD pod.
      For example:
      oc get -n openshift-storage -o=custom-columns=NODE:.spec.nodeName pod/rook-ceph-osd-0-544db49d7f-qrgqm
      Example output:
      NODE
      compute-1
    2. For each of the nodes identified in the previous step, do the following:
      1. Create a debug pod and open a chroot environment for the selected host(s).
        oc debug node/<node name>
        where <node name> is the name of the node.
        $ chroot /host
      2. Check for the crypt keyword beside the ocs-deviceset name(s).
        $ lsblk
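        Example output (device names and sizes are illustrative; the crypt entry under TYPE indicates an encrypted device):
        sdb                                              8:16   0  1.5T  0 disk
        └─ocs-deviceset-xxx-xxx-xxx-xxx-block-dmcrypt  253:0   0  1.5T  0 crypt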
  4. Log in to OpenShift Web Console and view the storage dashboard.
Note: A full data recovery can take a significant amount of time, depending on the volume of data being recovered.