Red Hat OpenShift Data Foundation Object Storage Device (OSD) failure
For any kind of failed storage device on a cluster backed by local storage devices, you must replace the Red Hat® OpenShift® Data Foundation Object Storage Device (OSD).
If you encounter this issue, contact IBM support.
- Before you begin
- Red Hat recommends that replacement OSD devices are configured with similar infrastructure and resources to the device being replaced.
You can replace an OSD in Red Hat OpenShift Data Foundation deployed using local storage devices on the following infrastructures:
- Bare Metal
- VMware with local deployment
- SystemZ
- IBM Power Systems
- Procedure
- Complete the following steps to identify and replace a failed Red Hat OpenShift Data Foundation OSD:
- Set the Red Hat OpenShift Data Foundation cluster to maintenance:
oc label odfclusters.odf.isf.ibm.com -n ibm-spectrum-fusion-ns odfcluster "odf.isf.ibm.com/maintenanceMode=true"
Example output:
[root@fu40 ~]# oc label odfclusters.odf.isf.ibm.com -n ibm-spectrum-fusion-ns odfcluster "odf.isf.ibm.com/maintenanceMode=true"
odfcluster.odf.isf.ibm.com/odfcluster labeled
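Optionally, you can confirm that the maintenance label was applied before you continue. A minimal check, using the same resource and namespace as in the previous command:
```
# Confirm that the maintenanceMode label is present on the odfcluster resource
oc get odfclusters.odf.isf.ibm.com -n ibm-spectrum-fusion-ns odfcluster --show-labels
```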
- Identify the failed OSD: check whether the OSD failed by using any of the following methods:
- Log in to the Red Hat OpenShift Container Platform web console and go to your storage system details page.
- In the Storage cluster tab, check the Status section for any warnings.
- If the warnings indicate that an OSD is down or degraded, contact IBM support to replace the failed Red Hat OpenShift Data Foundation OSD for your storage node in an internal-attached environment. Example warning message:
1 osds down
1 host (1 osds) down
Degraded data redundancy: 333/999 objects degraded (33.333%), 81 pgs degraded
- Log in to the IBM Storage Fusion user interface.
- Go to the Data foundation page and check for warnings in the Health section for the storage cluster.
Alternatively, you can use the oc command to identify the failed OSD:
oc get -n openshift-storage pods -l app=rook-ceph-osd -o wide
Sample output:
[root@fu40 ~]# oc get -n openshift-storage pods -l app=rook-ceph-osd -o wide
NAME                               READY   STATUS             RESTARTS      AGE   IP             NODE   NOMINATED NODE   READINESS GATES
rook-ceph-osd-0-6c99fc999b-2s9mr   1/2     CrashLoopBackOff   5 (17s ago)   17m   10.128.4.216   fu49   <none>           <none>
rook-ceph-osd-1-764f9cff48-6gkg9   2/2     Running            0             16m   10.131.2.18    fu47   <none>           <none>
rook-ceph-osd-2-5d9d5984dc-8gkrz   2/2     Running            0             16m   10.129.2.53    fu48   <none>           <none>
In this example, rook-ceph-osd-0-6c99fc999b-2s9mr needs to be replaced and fu49 is the Red Hat OpenShift Container Platform node on which the OSD is scheduled. The failed OSD ID is 0.
You can also view the OSD details by running ceph osd df in the Ceph tools; the failed OSD ID is the same as in the previous step. A sketch of this check follows.
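The following sketch shows how to run ceph osd df from the Ceph tools pod; it assumes the rook-ceph-tools pod is enabled in the openshift-storage namespace (the pod label can differ in your deployment):
```
# Locate the Ceph tools pod (assumes the app=rook-ceph-tools label)
TOOLS_POD=$(oc get pods -n openshift-storage -l app=rook-ceph-tools -o name)

# Show per-OSD utilization; the failed OSD is typically reported as down
oc rsh -n openshift-storage ${TOOLS_POD} ceph osd df
```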
- Scale down the OSD deployment
- Scale down the OSD deployment replicas to 0
- Use the OSD ID from the previous step. In this example, the failing pod is rook-ceph-osd-0-6c99fc999b-2s9mr and the OSD ID is 0.
osd_id_to_remove=<replace-it-with-osd-id>
oc scale -n openshift-storage deployment rook-ceph-osd-${osd_id_to_remove} --replicas=0
Example output:
[root@fu40 ~]# osd_id_to_remove=0
[root@fu40 ~]# oc scale -n openshift-storage deployment rook-ceph-osd-${osd_id_to_remove} --replicas=0
deployment.apps/rook-ceph-osd-0 scaled
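To double-check that the scale-down took effect, you can inspect the deployment; a quick check that reuses the same variable as above:
```
# READY should report 0/0 for the scaled-down OSD deployment
oc get deployment -n openshift-storage rook-ceph-osd-${osd_id_to_remove}
```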
- Wait for the rook-ceph-osd pod to terminate
- Run the oc command to check that the rook-ceph-osd pod is terminating.
oc get -n openshift-storage pods -l ceph-osd-id=${osd_id_to_remove}
Example output:
[root@fu40 ~]# oc get -n openshift-storage pods -l ceph-osd-id=${osd_id_to_remove}
NAME                               READY   STATUS        RESTARTS   AGE
rook-ceph-osd-0-6c99fc999b-2s9mr   0/2     Terminating   6          20m
Note: If the rook-ceph-osd pod stays in the Terminating state for a long time, use the force option to delete the pod.
oc delete -n openshift-storage pod rook-ceph-osd-0-6c99fc999b-2s9mr --grace-period=0 --force
Example output:
[root@fu40 ~]# oc delete -n openshift-storage pod rook-ceph-osd-0-6c99fc999b-2s9mr --grace-period=0 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "rook-ceph-osd-0-6c99fc999b-2s9mr" force deleted
Verify that the rook-ceph-osd pod is terminated.
[root@fu40 ~]# oc get -n openshift-storage pods -l ceph-osd-id=${osd_id_to_remove}
No resources found in openshift-storage namespace.
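Instead of repeatedly running oc get, you can also wait for the pod to be deleted; a sketch, assuming a two-minute timeout is acceptable in your environment:
```
# Block until the OSD pod with the matching label is gone (or the timeout expires)
oc wait -n openshift-storage --for=delete pod -l ceph-osd-id=${osd_id_to_remove} --timeout=120s
```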
- Remove the old OSD from the cluster.
- Delete any old ocs-osd-removal jobs
- Run the oc command to delete any old ocs-osd-removal job.
oc delete -n openshift-storage job ocs-osd-removal-job
- Remove the old OSD from the cluster
- Ensure that you set the correct osd_id_to_remove.
The FORCE_OSD_REMOVAL value must be changed to true in clusters that have only three OSDs, or in clusters with insufficient space to restore all three replicas of the data after the OSD is removed.
- More than three OSDs
oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=${osd_id_to_remove} FORCE_OSD_REMOVAL=false | oc create -n openshift-storage -f -
- Only three OSDs or insufficient space (force delete)
oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=${osd_id_to_remove} FORCE_OSD_REMOVAL=true |oc create -n openshift-storage -f -
Example output:
[root@fu40 ~]# echo $osd_id_to_remove
0
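If you are unsure which parameters the ocs-osd-removal template accepts, you can list them; a quick check, assuming the template is installed in the openshift-storage namespace:
```
# List the template parameters (for example, FAILED_OSD_IDS and FORCE_OSD_REMOVAL)
oc process -n openshift-storage ocs-osd-removal --parameters
```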
- Verify that the OSD is removed
- Wait for the ocs-osd-removal-job pod to complete.
[root@fu40 ~]# oc get pod -l job-name=ocs-osd-removal-job -n openshift-storage
NAME                        READY   STATUS      RESTARTS   AGE
ocs-osd-removal-job-s4vhc   0/1     Completed   0          24s
Confirm the removal in the job logs.
[root@fu40 ~]# oc logs -l job-name=ocs-osd-removal-job -n openshift-storage --tail=-1 | egrep -i 'completed removal'
2022-11-25 16:08:49.858109 I | cephosd: completed removal of OSD 0
The PVC will go to Pending, and the PV will be Released.
openshift-storage   ocs-deviceset-ibm-spectrum-fusion-local-0-data-3nsk8j   Pending   ibm-spectrum-fusion-local   7m16s
local-pv-a2879220   600Gi   RWO   Delete   Released   openshift-storage/ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b   ibm-spectrum-fusion-local   41m
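The listings above can be gathered with commands similar to the following sketch (filtering on the Pending and Released states is optional):
```
# PVCs in the openshift-storage namespace; the device set PVC shows Pending
oc get pvc -n openshift-storage | grep Pending

# Cluster-scoped PVs; the old local PV shows Released
oc get pv | grep Released
```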
To locate the worker node, use the oc command to describe the PV. In this example, the PV host name is fu49 (kubernetes.io/hostname=fu49).
[root@fu40 ~]# oc describe pv local-pv-a2879220
Name:            local-pv-a2879220
Labels:          kubernetes.io/hostname=fu49
                 storage.openshift.com/owner-kind=LocalVolumeSet
                 storage.openshift.com/owner-name=ibm-spectrum-fusion-local
                 storage.openshift.com/owner-namespace=openshift-local-storage
Annotations:     pv.kubernetes.io/bound-by-controller: yes
                 pv.kubernetes.io/provisioned-by: local-volume-provisioner-fu49-96f64c0f-e5ed-4bb1-b4ff-cad610562f58
                 storage.openshift.com/device-id: scsi-36000c2913ba6a22c66120c73cb1edae6
                 storage.openshift.com/device-name: sdb
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    ibm-spectrum-fusion-local
Status:          Released
Claim:           openshift-storage/ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Block
Capacity:        600Gi
Node Affinity:
  Required Terms:
    Term 0:  kubernetes.io/hostname in [fu49]
Message:
Source:
    Type:  LocalVolume (a persistent volume backed by local storage on a node)
    Path:  /mnt/local-storage/ibm-spectrum-fusion-local/scsi-36000c2913ba6a22c66120c73cb1edae6
Events:
  Type     Reason              Age                  From     Message
  ----     ------              ----                 ----     -------
  Warning  VolumeFailedDelete  6m2s (x26 over 12m)  deleter  Error cleaning PV "local-pv-a2879220": failed to get volume mode of path "/mnt/local-storage/ibm-spectrum-fusion-local/scsi-36000c2913ba6a22c66120c73cb1edae6": Directory check for "/mnt/local-storage/ibm-spectrum-fusion-local/scsi-36000c2913ba6a22c66120c73cb1edae6" failed: open /mnt/local-storage/ibm-spectrum-fusion-local/scsi-36000c2913ba6a22c66120c73cb1edae6: no such file or directory
Note: If the ocs-osd-removal-job fails and the pod is not in the expected completed state, check the pod logs for further debugging.
- Remove encryption-related configuration
- If encryption was enabled during installation, remove the dm-crypt managed device-mapper mapping from the OSD devices that were removed from the respective Red Hat OpenShift Data Foundation nodes.
- For each of the previously identified nodes, do the following:
oc debug node/<node name>
chroot /host
dmsetup ls | grep <pvc name>
- Remove the mapped device.
cryptsetup luksClose --debug --verbose ocs-deviceset-xxx-xxx-xxx-xxx-block-dmcrypt
Example output:
[root@fu40 ~]# oc debug nodes/fu49
Starting pod/fu49-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# dmsetup ls
ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b-block-dmcrypt (253:0)
sh-4.4# cryptsetup luksClose --debug --verbose ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b-block-dmcrypt
# cryptsetup 2.3.3 processing "cryptsetup luksClose --debug --verbose ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b-block-dmcrypt"
# Running command close.
# Locking memory.
# Installing SIGINT/SIGTERM handler.
# Unblocking interruption on signal.
# Allocating crypt device context by device ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b-block-dmcrypt.
# Initialising device-mapper backend library.
# dm version [ opencount flush ] [16384] (*1)
# dm versions [ opencount flush ] [16384] (*1)
# Detected dm-ioctl version 4.43.0.
# Detected dm-crypt version 1.21.0.
# Device-mapper backend running with UDEV support enabled.
# dm status ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b-block-dmcrypt [ opencount noflush ] [16384] (*1)
# Releasing device-mapper backend.
# Allocating context for crypt device (none).
# Initialising device-mapper backend library.
Underlying device for crypt device ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b-block-dmcrypt disappeared.
# dm versions [ opencount flush ] [16384] (*1)
# dm table ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b-block-dmcrypt [ opencount flush securedata ] [16384] (*1)
# dm versions [ opencount flush ] [16384] (*1)
# dm deps ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b-block-dmcrypt [ opencount flush ] [16384] (*1)
# LUKS device header not available.
# Deactivating volume ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b-block-dmcrypt.
# dm versions [ opencount flush ] [16384] (*1)
# dm status ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b-block-dmcrypt [ opencount noflush ] [16384] (*1)
# dm versions [ opencount flush ] [16384] (*1)
# dm table ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b-block-dmcrypt [ opencount flush securedata ] [16384] (*1)
# dm versions [ opencount flush ] [16384] (*1)
# dm deps ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b-block-dmcrypt [ opencount flush ] [16384] (*1)
# dm versions [ opencount flush ] [16384] (*1)
# dm table ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b-block-dmcrypt [ opencount flush securedata ] [16384] (*1)
# dm versions [ opencount flush ] [16384] (*1)
# Udev cookie 0xd4d9390 (semid 0) created
# Udev cookie 0xd4d9390 (semid 0) incremented to 1
# Udev cookie 0xd4d9390 (semid 0) incremented to 2
# Udev cookie 0xd4d9390 (semid 0) assigned to REMOVE task(2) with flags DISABLE_LIBRARY_FALLBACK (0x20)
# dm remove ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b-block-dmcrypt [ opencount flush retryremove ] [16384] (*1)
# Udev cookie 0xd4d9390 (semid 0) decremented to 1
# Udev cookie 0xd4d9390 (semid 0) waiting for zero
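If you prefer not to open an interactive debug shell, the same cleanup can be driven non-interactively; a sketch, with <node-name> and <ocs-deviceset-name> as placeholders for the values identified earlier:
```
# List device-mapper mappings on the node
oc debug node/<node-name> -- chroot /host dmsetup ls

# Close the dm-crypt mapping for the removed OSD device
oc debug node/<node-name> -- chroot /host cryptsetup luksClose --debug --verbose <ocs-deviceset-name>-block-dmcrypt
```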
- Find the persistent volume (PV) that needs to be deleted
- Run the oc command to find the failed PV.
oc get pv -l kubernetes.io/hostname=<failed-osds-worker-node-name>
Example output:
[root@fu40 ~]# oc get pv -l kubernetes.io/hostname=fu49
NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                                                                      STORAGECLASS                REASON   AGE
local-pv-a2879220   600Gi      RWO            Delete           Released   openshift-storage/ocs-deviceset-ibm-spectrum-fusion-local-0-data-1m227b   ibm-spectrum-fusion-local            55m
- Delete the released persistent volume (PV)
- Run the oc command to delete the released PV.
oc delete pv <pv_name>
Example output:
[root@fu40 ~]# oc delete pv local-pv-a2879220
persistentvolume "local-pv-a2879220" deleted
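To confirm that the PV is gone, you can repeat the earlier lookup; it should no longer list the released PV:
```
# The deleted PV should not be listed for the failed worker node anymore
oc get pv -l kubernetes.io/hostname=<failed-osds-worker-node-name>
```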
- Add a new OSD into the node.
Physically add a new device to the node.
- Track the provisioning of persistent volumes (PVs) for the devices that match the deviceInclusionSpec.
- It can take a few minutes to provision the PVs. When the PV is identified, it is added to the cluster automatically.
- Check the localvolumeset spec.
oc -n openshift-local-storage describe localvolumeset ibm-spectrum-fusion-local
Example output:
...
Spec:
  Device Inclusion Spec:
    Device Types:
      disk
      part
    Max Size:  601Gi
    Min Size:  599Gi
  Node Selector:
    Node Selector Terms:
      Match Expressions:
        Key:       cluster.ocs.openshift.io/openshift-storage
        Operator:  In
        Values:
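To track when the new device is picked up, you can watch for a new PV owned by the local volume set; a sketch that relies on the storage.openshift.com/owner-name label visible in the earlier oc describe pv output:
```
# Watch for the new local PV to appear (press Ctrl+C to stop watching)
oc get pv -l storage.openshift.com/owner-name=ibm-spectrum-fusion-local -w
```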
- Delete the ocs-osd-removal-job
- Run the oc command to delete the ocs-osd-removal-job.
oc delete -n openshift-storage job ocs-osd-removal-job
Example output:
[root@fu40 ~]# oc delete -n openshift-storage job ocs-osd-removal-job
job.batch "ocs-osd-removal-job" deleted
- Verify that there is a new OSD running
- Verify new OSD pod is running
- Run the oc command to check that the new OSD pod is running.
oc get -n openshift-storage pods -l app=rook-ceph-osd
Example output:
[root@fu40 ~]# oc get -n openshift-storage pods -l app=rook-ceph-osd
NAME                               READY   STATUS    RESTARTS   AGE
rook-ceph-osd-0-7f99b8ccd5-ssj5w   2/2     Running   0          7m31s   <<-- This pod
rook-ceph-osd-1-764f9cff48-6gkg9   2/2     Running   0          64m
rook-ceph-osd-2-5d9d5984dc-8gkrz   2/2     Running   0          64m
Tip: If the new OSD does not show as Running after a few minutes, restart the rook-ceph-operator pod to force a reconciliation.
oc delete pod -n openshift-storage -l app=rook-ceph-operator
- Verify that the new PVC is created
- Run the oc command to check whether the new PVC is created and bound.
oc get pvc -n openshift-storage
Example output:
[root@fu40 ~]# oc get pvc -n openshift-storage
NAME                                                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
db-noobaa-db-pg-0                                       Bound    pvc-783036b5-ec40-41a7-91e5-9e179fd24cc3   50Gi       RWO            ocs-storagecluster-ceph-rbd   65m
ocs-deviceset-ibm-spectrum-fusion-local-0-data-04vwvq   Bound    local-pv-b45b1d67                          600Gi      RWO            ibm-spectrum-fusion-local     66m
ocs-deviceset-ibm-spectrum-fusion-local-0-data-24nj5t   Bound    local-pv-c3de9110                          600Gi      RWO            ibm-spectrum-fusion-local     66m
ocs-deviceset-ibm-spectrum-fusion-local-0-data-3nsk8j   Bound    local-pv-1c9f3b11                          600Gi      RWO            ibm-spectrum-fusion-local     34m   <<-- This one
[root@fu40 ~]#
[root@fu40 ~]# oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                                      STORAGECLASS                  REASON   AGE
local-pv-1c9f3b11                          600Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-ibm-spectrum-fusion-local-0-data-3nsk8j   ibm-spectrum-fusion-local              10m   <<-- This one
local-pv-b45b1d67                          600Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-ibm-spectrum-fusion-local-0-data-04vwvq   ibm-spectrum-fusion-local              68m
local-pv-c3de9110                          600Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-ibm-spectrum-fusion-local-0-data-24nj5t   ibm-spectrum-fusion-local              68m
pvc-783036b5-ec40-41a7-91e5-9e179fd24cc3   50Gi       RWO            Delete           Bound    openshift-storage/db-noobaa-db-pg-0                                        ocs-storagecluster-ceph-rbd            65m
- Verify the OSD Encryption settings
- If cluster-wide encryption is enabled, ensure that the crypt keyword appears next to the ocs-deviceset name.
oc debug node/<new-node-name> -- chroot /host lsblk -f
oc debug node/<new-node-name> -- chroot /host dmsetup ls
Example output:
[root@fu40 ~]# oc debug node/fu49 -- chroot /host lsblk -f
Starting pod/fu49-debug ...
To use host binaries, run `chroot /host`
NAME     FSTYPE        LABEL                                             UUID                                   MOUNTPOINT
loop1    crypto_LUKS   pvc_name=ocs-deviceset-ibm-spectrum-fusion-loca   6a8244eb-55d6-48cc-8e68-33436e512bc6
loop2    crypto_LUKS   pvc_name=ocs-deviceset-ibm-spectrum-fusion-loca   fa228ec1-0b1d-43ad-8707-9ecd38bfb1f8
sda
|-sda1
|-sda2   vfat          EFI-SYSTEM                                        A084-4057
|-sda3   ext4          boot                                              7d757098-d548-4b7b-8c9a-3dd4f34ceca1   /boot
`-sda4   xfs           root                                              1cd39805-6936-458d-ae8c-39313bb71c95   /sysroot
sdc      crypto_LUKS   pvc_name=ocs-deviceset-ibm-spectrum-fusion-loca   fa228ec1-0b1d-43ad-8707-9ecd38bfb1f8
`-ocs-deviceset-ibm-spectrum-fusion-local-0-data-3nsk8j-block-dmcrypt
sr0
Removing debug pod ...
[root@fu40 ~]# oc debug node/fu49 -- chroot /host dmsetup ls
Starting pod/fu49-debug ...
To use host binaries, run `chroot /host`
ocs-deviceset-ibm-spectrum-fusion-local-0-data-3nsk8j-block-dmcrypt (253:0)
Removing debug pod ...
Note: If verification steps fail, then contact Red Hat support.
- Exit maintenance mode
- Run the oc command to exit maintenance mode after all steps are completed.
oc label odfclusters.odf.isf.ibm.com -n ibm-spectrum-fusion-ns odfcluster "odf.isf.ibm.com/maintenanceMode-"
Example output:
[root@fu40 ~]# oc label odfclusters.odf.isf.ibm.com -n ibm-spectrum-fusion-ns odfcluster "odf.isf.ibm.com/maintenanceMode-"
odfcluster.odf.isf.ibm.com/odfcluster unlabeled
- Go to the Data foundation page in the IBM Storage Fusion user interface and check the health of the storage cluster in the Health section.
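As a command-line alternative, you can also check the Ceph cluster health; a sketch, assuming the CephCluster resource in the openshift-storage namespace reports a health column:
```
# HEALTH should return to HEALTH_OK after recovery and backfill complete
oc get cephcluster -n openshift-storage
```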