Replacing the OSDs
When disks fail, you can replace the physical storage device and reuse the same OSD ID to avoid having to reconfigure the CRUSH map.
When an OSD is replaced, it is marked with the destroyed flag. This flag determines which OSD IDs can be reused in the next OSD deployment. If you use an OSD specification for deployment, each newly added disk is assigned the OSD ID of the disk it replaces.
Prerequisites
-
A running IBM Storage Ceph cluster.
-
Hosts are added to the cluster.
-
Monitor, Manager, and OSD daemons are deployed on the storage cluster.
Procedure
-
Log in to the cephadm shell:
Example
[root@host01 ~]# cephadm shell
-
Check the device and the node from which the OSD has to be replaced:
Example
[ceph: root@host01 /]# ceph osd tree
-
Set the OSD service specification to the unmanaged state:
Note: Set the unmanaged parameter to true in the OSD service specification before removal. If you skip this step, cephadm may automatically redeploy OSDs, causing conflicts between orchestrator-driven deployment and OSD removal.
Example
service_type: osd
service_id: <service_id>
unmanaged: true
-
Apply the updated specification:
ceph orch apply -i <osds.yaml>
-
Run the ceph orch ls command to confirm that the OSD service is set to the unmanaged state.
-
Replace the OSD:
Important: If the storage cluster reports HEALTH_WARN or other errors, check and try to fix them before replacing the OSD to avoid data loss.
Syntax
ceph orch osd rm OSD_ID --replace
Example
[ceph: root@host01 /]# ceph orch osd rm 0 --replace
-
Check the status of the OSD replacement:
Example
[ceph: root@host01 /]# ceph orch osd rm status
-
After you replace the disk, set the OSD service specification back to the managed state so that cephadm can resume orchestration:
  -
  Update the OSD service specification:
  unmanaged: false
  -
  Apply the updated specification. This step allows cephadm to redeploy the OSD on the new device while reusing the existing OSD ID.
  ceph orch apply -i <osds.yaml>
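The procedure edits the unmanaged parameter of the OSD service specification twice: once to true before the removal and once back to false afterward. The following Python sketch shows one way to make that edit on the specification text before passing the file to ceph orch apply -i. It is an illustration only, not part of cephadm. The function name set_unmanaged and the sample specification are hypothetical, and the sketch assumes a single-document YAML specification whose top-level keys are not indented.

```python
import re

# Hypothetical sample specification, for illustration only.
SAMPLE_SPEC = """\
service_type: osd
service_id: example_osd_spec
placement:
  host_pattern: '*'
spec:
  data_devices:
    all: true
"""


def set_unmanaged(spec_text: str, unmanaged: bool) -> str:
    """Return spec_text with the top-level 'unmanaged' key set to the given value.

    Assumes a single YAML document with unindented top-level keys.
    """
    value = "true" if unmanaged else "false"
    lines = spec_text.splitlines()
    top_level_key = re.compile(r"^unmanaged\s*:")
    for i, line in enumerate(lines):
        if top_level_key.match(line):
            # The key already exists: overwrite its value in place.
            lines[i] = f"unmanaged: {value}"
            break
    else:
        # The key is absent: append it as a new top-level key.
        lines.append(f"unmanaged: {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Before the removal: stop cephadm from redeploying OSDs.
    print(set_unmanaged(SAMPLE_SPEC, True))
    # After the disk is replaced: hand control back to cephadm.
    print(set_unmanaged(set_unmanaged(SAMPLE_SPEC, True), False))
```

Write the returned text to the specification file (osds.yaml in the examples above) and apply it with ceph orch apply -i. For specifications that contain several YAML documents or nested unmanaged keys, a real YAML parser is a safer choice than this line-based edit.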
Verification
-
Verify the details of the devices and the nodes from which the Ceph OSDs are replaced:
Example
[ceph: root@host01 /]# ceph osd tree
An OSD with the same ID as the one you replaced is now running on the same host.
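The verification can also be scripted. The following Python sketch checks whether a given OSD ID is up and attached to the expected host, based on the JSON form of the OSD tree. It assumes that ceph osd tree --format json returns an object with a nodes list in which host entries list their OSD IDs under children and OSD entries carry a status field. The helper names and the sample data are hypothetical and do not come from a real cluster.

```python
import json
import subprocess

# Hand-written sample in the assumed shape of `ceph osd tree --format json`.
SAMPLE_TREE = {
    "nodes": [
        {"id": -1, "name": "default", "type": "root", "children": [-3, -5]},
        {"id": -3, "name": "host01", "type": "host", "children": [0]},
        {"id": -5, "name": "host02", "type": "host", "children": [1]},
        {"id": 0, "name": "osd.0", "type": "osd", "status": "up"},
        {"id": 1, "name": "osd.1", "type": "osd", "status": "up"},
    ]
}


def fetch_tree() -> dict:
    """Run the ceph CLI and parse its JSON output (requires a running cluster)."""
    result = subprocess.run(
        ["ceph", "osd", "tree", "--format", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def host_of_osd(tree: dict, osd_id: int):
    """Return (host_name, status) for osd_id, or None if the OSD is not in the tree."""
    nodes = tree.get("nodes", [])
    osd = next(
        (n for n in nodes if n.get("type") == "osd" and n.get("id") == osd_id),
        None,
    )
    if osd is None:
        return None
    # The host is the node of type "host" that lists this OSD ID as a child.
    host = next(
        (n["name"] for n in nodes
         if n.get("type") == "host" and osd_id in n.get("children", [])),
        None,
    )
    return host, osd.get("status")


def replacement_ok(tree: dict, osd_id: int, expected_host: str) -> bool:
    """True if osd_id is up and located on expected_host."""
    return host_of_osd(tree, osd_id) == (expected_host, "up")


if __name__ == "__main__":
    # Uses the sample data; call fetch_tree() instead inside the cephadm shell.
    print(replacement_ok(SAMPLE_TREE, 0, "host01"))
```

Inside the cephadm shell, pass the result of fetch_tree() instead of SAMPLE_TREE, with the OSD ID and host name used in the replacement.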