How To
Summary
This document covers the steps to migrate OSDs and Monitors from VSAN to Passthrough devices.
Environment
Steps
Make sure your cluster is healthy. To migrate to passthrough devices, you must add NVMe/PCIe devices to your worker node VMs.
Note: Perform the following on one node at a time:
- Identify the node
- Mark the node as unschedulable
$ oc adm cordon <node_name>
- Drain the node
$ oc adm drain <node_name> --force --delete-emptydir-data --ignore-daemonsets
NOTE: This activity might take at least 5 - 10 minutes. Ceph warnings during this period are temporary and resolve automatically once the node is labeled and returns to service.
- Add an NVMe controller, a PCIe device, and an HDD (for the Monitors) as shown below
- Repeat these steps for all nodes that host or may in the future host OSD and Monitor pods.
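The per-node sequence above can be sketched as small helpers with a dry-run mode for review. This is an illustrative sketch, not part of the original procedure; the helper names and the DRY_RUN flag are ours, and the node must be uncordoned after the devices are added and the VM is back up.

```shell
# Hypothetical helper: run (or, with DRY_RUN=1, just print) the cordon/drain
# sequence for one node before adding the passthrough devices.
drain_node() {
  local node=$1 run=eval
  [ "${DRY_RUN:-0}" = "1" ] && run=echo
  $run "oc adm cordon $node"
  $run "oc adm drain $node --force --delete-emptydir-data --ignore-daemonsets"
}

# Hypothetical helper: make the node schedulable again once it is back in service.
uncordon_node() {
  local node=$1 run=eval
  [ "${DRY_RUN:-0}" = "1" ] && run=echo
  $run "oc adm uncordon $node"
}
```

Running with `DRY_RUN=1` first lets you review the exact commands before they touch the cluster.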
Preparing the drives
The NVMe devices must not be used for any other purpose. The drives should appear as in Figure 1: Attached and Not Consumed.
Figure 1: The NVMe device is Attached and Not Consumed. Select the available NVMe device and note its multipath path. In this example it shows:
Path Selection Policy: Fixed (VMware) - Preferred Path: vmhba2:C0:T0:L0
Ensure the SSH service is available. In the host configuration screen, go to System → Services, find the SSH service in the list, and ensure it is Running.
Figure 2: The SSH service must be running
Connect to the vSphere host via SSH as the root user, using the password you set during installation. Once connected, execute:
# lspci | grep NVMe
0000:af:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND] 1.6TB 2.5" U.2 (P4600) [vmhba1]
0000:b0:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND] 1.6TB 2.5" U.2 (P4600) [vmhba2]
0000:b1:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND] 1.6TB 2.5" U.2 (P4600) [vmhba3]
0000:b2:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND] 1.6TB 2.5" U.2 (P4600) [vmhba4]
0000:d8:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND] 1.6TB 2.5" U.2 (P4600) [vmhba5]
0000:d9:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND] 1.6TB 2.5" U.2 (P4600) [vmhba6]
Identify the device you noted earlier (vmhba2 in our case) and note its PCI location (the first field on the line; in this case 0000:b0:00.0).
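This lookup can also be done mechanically on the ESXi shell. The following is a sketch (the function name is ours): it reads lspci output on stdin and prints the PCI address of the line whose bracketed adapter name matches the argument.

```shell
# Hypothetical helper: print the PCI address (first field) of the lspci line
# whose bracketed adapter name matches $1 (e.g. vmhba2).
pci_for_hba() {
  grep "\[$1\]" | awk '{print $1}'
}

# Usage on the ESXi host:
#   lspci | pci_for_hba vmhba2
```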
Go back to the vCenter UI and, still in the host configuration, scroll to “Hardware” → “PCI Devices”. In that view, click “Configure Passthrough” and select the disk you noted earlier by its PCI path.
Afterwards, the device is listed as Available (pending) for passthrough, and you are prompted to restart the hypervisor. Reboot the hypervisor.
After the hypervisor has been rebooted, the NVMe device should be available, as shown in Figure 5.
Now that the NVMe device is prepared, we must attach it to the VM. For this, the VM must be powered down. Once the VM is off, open the VM settings and add these items:
- Add an NVMe controller - This is optional, but it speeds up storage requests in the VM
- Add a PCIe device - The VM must be scheduled on the host where your PCI device is present
- Add an HDD device - The default 50 GB capacity is acceptable, but 100 GB is suggested. This will be used for the Ceph Monitor filesystem
Expand the new PCIe/NVMe device entry as in Figure 6 and click the “Reserve all memory” button. Close the settings and power on the VM.
You can verify that the NVMe device has been successfully added by running lsblk on the VM.
Step 2: Verify that the devices have been added to the worker nodes.
Execute lsblk on all worker nodes to verify the device paths for the new NVMe devices and HDDs:
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 60G 0 disk
|-sda1 8:1 0 384M 0 part /boot
|-sda2 8:2 0 127M 0 part /boot/efi
|-sda3 8:3 0 1M 0 part
`-sda4 8:4 0 59.5G 0 part
`-coreos-luks-root-nocrypt 253:0 0 59.5G 0 dm /sysroot
sdb 8:16 0 50G 0 disk
nvme0n1 259:0 0 1.5T 0 disk
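As a sketch, this check can be scripted. The following helper (our naming; it assumes the new HDD shows up as sdb, as in the listing above) reads lsblk output on stdin and succeeds only when both new devices are present.

```shell
# Hypothetical check: succeed only when lsblk output (stdin) shows both an
# nvme disk and the new HDD (sdb in this example).
has_expected_devices() {
  local out; out=$(cat)
  printf '%s\n' "$out" | grep -q '^nvme' &&
  printf '%s\n' "$out" | grep -q '^sdb '
}

# Usage, e.g. through a debug pod on each worker:
#   oc debug node/<node_name> -- chroot /host lsblk | has_expected_devices && echo OK
```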
To clean the NVMe drives, run sgdisk --zap-all /dev/nvmeXnX followed by wipefs -a /dev/nvmeXnX
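Because these commands are destructive, it can help to print the exact command list per device and review it before running anything. A sketch (the helper name is ours):

```shell
# Hypothetical helper: print the wipe commands for one device so they can be
# reviewed before being run (sgdisk --zap-all destroys the partition table).
wipe_cmds() {
  local dev=$1
  printf 'sgdisk --zap-all %s\n' "$dev"
  printf 'wipefs -a %s\n' "$dev"
}

# Review first, then execute on the node:
#   wipe_cmds /dev/nvme0n1          # review only
#   wipe_cmds /dev/nvme0n1 | sh     # run (destructive!)
```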
Step 4: Create a new project (oc new-project local-storage) and deploy the Local Storage Operator (LSO) in that namespace via the UI
Step 5: Create local-block and local-fs storage classes using the devices from step 2
% cat <<EOF | oc create -n local-storage -f -
apiVersion: local.storage.openshift.io/v1
kind: LocalVolume
metadata:
  name: local-block
  namespace: local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: cluster.ocs.openshift.io/openshift-storage
        operator: Exists
  storageClassDevices:
  - storageClassName: local-block
    volumeMode: Block
    devicePaths:
    - /dev/nvme0n1
EOF
% cat <<EOF | oc create -n local-storage -f -
apiVersion: local.storage.openshift.io/v1
kind: LocalVolume
metadata:
  name: local-fs
  namespace: local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: cluster.ocs.openshift.io/openshift-storage
        operator: Exists
  storageClassDevices:
  - storageClassName: local-fs
    fsType: xfs
    volumeMode: Filesystem
    devicePaths:
    - /dev/sdb
EOF
The oc get pv command should show the newly created persistent volumes using the “local-fs” and “local-block” storage classes:
% oc get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS VOLUMEATTRIBUTESCLASS REASON AGE
local-pv-13cf69d7 50Gi RWO Delete Available local-fs <unset> 91s
local-pv-4f6a02e6 1490Gi RWO Delete Available local-block <unset> 2m8s
local-pv-955b562b 50Gi RWO Delete Available local-fs <unset> 91s
local-pv-a63109ce 50Gi RWO Delete Available local-fs <unset> 91s
local-pv-dd0166a3 1490Gi RWO Delete Available local-block <unset> 2m9s
local-pv-e3f128be 1490Gi RWO Delete Available local-block
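To confirm the counts mechanically before proceeding, the oc get pv output can be filtered. This is a sketch (our naming) and it assumes the PVs are unbound, so the CLAIM column is empty and STORAGECLASS lands in field 6:

```shell
# Hypothetical helper: count Available PVs of a given storage class in
# `oc get pv` output read from stdin. For unbound PVs the CLAIM column is
# empty, so STATUS is field 5 and STORAGECLASS is field 6.
count_available() {
  awk -v sc="$1" '$5 == "Available" && $6 == sc { n++ } END { print n + 0 }'
}

# Usage, with three worker nodes you would expect 3 of each:
#   oc get pv | count_available local-block
#   oc get pv | count_available local-fs
```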
Scale down the rook-ceph-operator deployment and edit the StorageCluster definition to add three OSDs backed by local-block storage
# oc scale deployment rook-ceph-operator --replicas=0 -n openshift-storage
Edit the storagecluster definition to create a new entry under storageDeviceSets:
# oc edit storagecluster -n openshift-storage
- count: 1
  dataPVCTemplate:
    metadata: {}
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: "1"
      storageClassName: local-block
      volumeMode: Block
    status: {}
  name: ocs-deviceset-local
  placement: {}
  portable: false
  preparePlacement: {}
  replica: 3
  resources: {}
# oc scale deployment rook-ceph-operator --replicas=1 -n openshift-storage
Note: If the OSD prepare jobs fail to create new OSDs with the local-block storage class: scale down the rook-ceph-operator deployment, delete the PVCs that use the local-block storage class, delete all jobs from the openshift-storage namespace, and wipe the devices with sgdisk --zap-all. Scaling the rook-ceph-operator deployment back up should then resolve the issue.
Step 7: Remove the OSDs that use the default storage class, in this example thin-csi-odf
Once the new OSDs built on local-block storage are up and in and rebalancing is complete, remove the old thin-csi-odf OSDs one by one. Once the old OSDs are removed, remove the thin-csi-odf related content from the StorageCluster CR. Note that after removing one OSD it is CRUCIAL to wait for rebalancing (backfill/recovery) to complete before removing the next old OSD. Do not remove anything while the cluster shows any PGs backfilling, recovering, incomplete, undersized, or down, or while any of the new OSDs are not up and in.
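The wait-before-next-removal rule can be turned into a simple guard. The sketch below (our naming; the pattern list is deliberately broad, so it may refuse removal in borderline cases, which is the safe direction) reads Ceph status text on stdin and fails if any of the forbidden PG states appear:

```shell
# Hypothetical guard: fail if the Ceph status text on stdin mentions any PG
# state that forbids removing the next OSD (backfill/recovery still running,
# or PGs incomplete, undersized, degraded, or down).
pgs_clean() {
  ! grep -Eq 'backfill|recover|incomplete|undersized|degraded|down'
}

# Usage via the rook-ceph tools pod (the deployment name is an assumption):
#   oc rsh -n openshift-storage deploy/rook-ceph-tools ceph -s | pgs_clean \
#     && echo "safe to remove the next OSD"
```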
Once all thin-csi-odf OSDs are removed, modify the StorageCluster CR to remove the thin-csi-odf entries.
After the removal of thin-csi-odf, the StorageCluster should look like the following:
# oc edit storagecluster -n openshift-storage
…
  storageDeviceSets:
  - config: {}
    count: 1
    dataPVCTemplate:
      metadata: {}
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: "1"
        storageClassName: local-block
        volumeMode: Block
      status: {}
    name: ocs-deviceset-local
    placement: {}
    portable: true
    preparePlacement: {}
    replica: 3
    resources: {}
status:
Step 9: Change the Monitor storage from the thin-csi-odf storage class to local-fs
Follow the procedure in this document: https://access.redhat.com/solutions/6409071
Step 10: If old OSDs are still present in the output of ceph osd df, remove them manually using the procedure documented here:
Document Location
Worldwide
Document Information
Modified date:
16 July 2025
UID
ibm17174113