Troubleshooting
Problem
- Pools are utilizing a crush rule that references a device class that is not currently utilized in the cluster; this results in PGs being in an unknown, misplaced, and/or remapped state
- More than 100 percent of objects are misplaced in Ceph after upgrading the OpenShift Data Foundation operator to 4.16.x
- Rook created crush rules to utilize a device class that is not used in the cluster; these crush rules are now applied to pools, causing issues
- The defaultCephDeviceClass in the StorageCluster CR is incorrect
- Transitioned from unsupported HDDs to SSDs in ODF, resulting in data unavailability
Example:
Ceph status shows a large number of objects misplaced
$ oc exec -it $(oc get pod -n openshift-storage -l app=rook-ceph-operator -o name) -n openshift-storage -- ceph status -c /var/lib/rook/openshift-storage/openshift-storage.config
cluster:
id: [REDACTED]
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,d (age 7d)
mgr: b(active, since 7d), standbys: a
mds: 1/1 daemons up, 1 hot standby
osd: 6 osds: 6 up (since 7d), 6 in (since 4M); 169 remapped pgs
rgw: 1 daemon active (1 hosts, 1 zones)
data:
volumes: 0/1 healthy, 1 recovering
pools: 12 pools, 281 pgs
objects: 184.82k objects, 265 GiB
usage: 1.2 TiB used, 17 TiB / 18 TiB avail
pgs: 33.808% pgs unknown
886324/554472 objects misplaced (159.850%)
138 active+clean+remapped
95 unknown
48 active+undersized+remapped
All OSDs are utilizing the device class ssd
$ oc exec -it $(oc get pod -n openshift-storage -l app=rook-ceph-operator -o name) -n openshift-storage -- ceph osd tree -c /var/lib/rook/openshift-storage/openshift-storage.config
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 5.85956 root default
-5 1.95319 host [REDACTED]
1 ssd 0.97659 osd.1 up 1.00000 1.00000
5 ssd 0.97659 osd.5 up 1.00000 1.00000
-7 1.95319 host [REDACTED]
2 ssd 0.97659 osd.2 up 1.00000 1.00000
4 ssd 0.97659 osd.4 up 1.00000 1.00000
-3 1.95319 host [REDACTED]
0 ssd 0.97659 osd.0 up 1.00000 1.00000
3 ssd 0.97659 osd.3 up 1.00000 1.00000
The pools are utilizing a crush rule that references HDD, while there are only SSDs in the cluster
$ oc exec -it $(oc get pod -n openshift-storage -l app=rook-ceph-operator -o name) -n openshift-storage -- ceph osd pool ls detail -c /var/lib/rook/openshift-storage/openshift-storage.config
pool 1 'ocs-storagecluster-cephblockpool' replicated size 3 min_size 2 crush_rule 25 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 9437 lfor 0/0/30 flags hashpspool,selfmanaged_snaps stripe_width 0 target_size_ratio 0.49 application rbd
The crush rules being utilized reference HDDs, while there are only SSDs in the cluster
$ oc exec -it $(oc get pod -n openshift-storage -l app=rook-ceph-operator -o name) -n openshift-storage -- ceph osd crush rule dump -c /var/lib/rook/openshift-storage/openshift-storage.config
{
"rule_id": 25,
"rule_name": "ocs-storagecluster-cephblockpool_host_hdd",
"type": 1,
"steps": [
{
"op": "take",
"item": -2,
"item_name": "default~hdd"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
Cause
When upgrading the OpenShift Data Foundation operator to 4.16.x, new crush rules are created for the Ceph pools. These new crush rules are device class specific. The pools are now utilizing a crush rule with a device class that is not being utilized in the cluster, resulting in PGs being in an unknown, misplaced, and/or remapped state.
The default behavior of Rook when creating these crush rules is to choose the first item in the list returned by the command $ ceph osd crush class ls. This causes issues for customers who have transitioned from the unsupported HDDs to SSDs, as hdd may still appear first in that list even though no OSD uses it.
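As an illustration of that selection, the first entry can be read directly from the class list; a minimal sketch, run from the toolbox or the operator pod as in the examples above, assuming jq is available (in the scenario described here it returns hdd):
$ ceph osd crush class ls -f json | jq -r '.[0]'
hdd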
This issue is tracked in the Jira issues listed below.
Artifacts
| Product/Version | Related BZ/Jira | Errata | Fixed Version |
|---|---|---|---|
| ODF/4.19 | Jira DFBUGS-948 | Errata N/A | 4.19.0 |
| ODF/4.18 | Jira DFBUGS-1666 | Errata N/A | 4.18.1 |
| ODF/4.17 | Jira DFBUGS-1667 | Errata RHSA-2025:17145 | 4.17.14 |
| ODF/4.16 | Jira DFBUGS-1668 | Errata RHBA-2025:17157 | 4.16.16 |
Environment
IBM Storage Fusion Data Foundation (FDF) 4.x
Red Hat OpenShift Data Foundation (ODF) 4.x
Diagnosing The Problem
- The Ceph command ceph osd crush class ls returns two items, with ssd not being the first in the list
$ oc exec -it $(oc get pod -n openshift-storage -l app=rook-ceph-operator -o name) -n openshift-storage -- ceph osd crush class ls -c /var/lib/rook/openshift-storage/openshift-storage.config
[
"hdd",
"ssd"
]
- The storagecluster CR has hdd as the defaultCephDeviceClass
$ oc get storagecluster -o json | jq '.items[].status.defaultCephDeviceClass'
"hdd"
- The CephCluster CR has the device classes hdd and ssd in the deviceClasses section. Notably, hdd is listed first
$ oc get cephcluster -o json | jq '.items[].status.storage.deviceClasses'
[
  {
    "name": "hdd"
  },
  {
    "name": "ssd"
  }
]
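As an additional cross-check, the device classes actually assigned to OSDs can be compared against the defaultCephDeviceClass above; a hedged sketch, assuming the device_class field present in the JSON form of ceph osd tree:
$ ceph osd tree -f json | jq -r '[.nodes[] | select(.type == "osd") | .device_class] | unique'
[
  "ssd"
]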
Resolving The Problem
For the Ceph commands, use KCS Configuring the Rook-Ceph Toolbox in Data Foundation 4.x or KCS Accessing the Ceph Storage CLI in Data Foundation 4.x.
The following steps involve the deletion of crush rules and crush classes. It is recommended to open a support case with Red Hat before applying this solution to minimize disruptions to your environment.
1. Switch to the openshift-storage project and scale down the rook-ceph-operator and ocs-operator:
$ oc project openshift-storage; oc scale deployment rook-ceph-operator ocs-operator --replicas=0
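Optionally, confirm both deployments report 0/0 ready replicas before continuing:
$ oc get deployment rook-ceph-operator ocs-operator -n openshift-storage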
2. Switch to the Ceph toolbox and make note of the HDD OSDs:
$ ceph osd df | grep -w hdd ## Save output for step 12
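If preferred, the HDD OSD IDs can also be captured to a file so they are not lost if the toolbox session ends; a minimal sketch, assuming the standard ID and CLASS column order of ceph osd df output and a writable /tmp:
$ ceph osd df | awk '$2 == "hdd" {print $1}' | tee /tmp/hdd-osd-ids.txt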
3. Set the flags nobackfill, norecover, and norebalance
$ ceph osd set nobackfill
$ ceph osd set norecover
$ ceph osd set norebalance
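The flags can be verified on the OSD map before proceeding; the flags line should now include nobackfill, norecover, and norebalance:
$ ceph osd dump | grep flags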
NOTE: From the step below until the flags are unset in step 12, the data will be unavailable for read and write
4. Change all pools to utilize a generic crush rule which does not reference a device class, hdd or ssd.
In this case we are utilizing the crush rule replicated_rule as it doesn't have a device class specified.
NOTE: Do not use the for loop if you have custom pools with custom crush rules
$ ceph osd crush rule ls
$ for i in $(ceph osd pool ls); do ceph osd pool set $i crush_rule replicated_rule; done
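To confirm every pool now points at replicated_rule, each pool's rule can be read back; a quick check that is not part of the original procedure:
$ for i in $(ceph osd pool ls); do echo -n "$i: "; ceph osd pool get $i crush_rule; done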
5. Attempt to delete the hdd crush class. NOTE: This fails because crush rules still reference hdd; make note of the rules listed in the error output
$ ceph osd crush class ls
$ ceph osd crush class rm hdd
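Alternatively, the rules that reference the hdd class can be listed directly from the rule dump; a hedged sketch, assuming jq and the default~hdd item naming shown in the example above:
$ ceph osd crush rule dump | jq -r '.[] | select(any(.steps[]?; (.item_name // "") | endswith("~hdd"))) | .rule_name'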
6. Delete the old crush rules that reference hdd one by one (utilize the rules noted in the previous step)
$ ceph osd crush rule rm <crush_rules_hdd>
7. Remove the old crush class
$ ceph osd crush class rm hdd
8. Scale up the rook-ceph-operator and ocs-operator
$ oc scale deployment rook-ceph-operator ocs-operator --replicas=1
9. Remove the defaultCephDeviceClass parameter using the patch command:
$ oc patch storagecluster ocs-storagecluster -n openshift-storage --type=json --subresource=status --patch '[{"op": "remove", "path": "/status/defaultCephDeviceClass"}]'
10. Wait five to ten minutes to allow the operators time to reconcile
$ sleep 300
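As an alternative to a fixed sleep, the operator rollout can be waited on explicitly; a sketch, noting that the reconcile itself may still take a few additional minutes after the deployments become available:
$ oc wait deployment/rook-ceph-operator deployment/ocs-operator -n openshift-storage --for=condition=Available --timeout=600s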
11. Verify new rules have been created in Ceph that reference the device class ssd and the pools are utilizing these new crush rules
$ ceph osd pool ls detail
$ ceph osd crush rule dump
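The jq filter from step 5 can be reused with ~ssd to confirm the replacement rules now reference the ssd class; a hedged sketch:
$ ceph osd crush rule dump | jq -r '.[] | select(any(.steps[]?; (.item_name // "") | endswith("~ssd"))) | .rule_name'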
12. Change the OSDs which are in the HDD class and unset nobackfill/norecover/norebalance:
You will need the output from step 2.
$ ceph osd crush rm-device-class osd.{num}; ceph osd crush set-device-class ssd osd.{num} ## Repeat for all OSDs seen as an HDD.
$ ceph osd unset nobackfill
$ ceph osd unset norecover
$ ceph osd unset norebalance
NOTE: Now you will need to monitor the status until the rebuild is finished
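One simple way to monitor is to poll the status until no objects remain misplaced; a minimal sketch (ceph -w can be used instead for a live view):
$ while ceph status | grep -q misplaced; do ceph status | grep misplaced; sleep 60; done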
13. Verify the health of Ceph
$ ceph status
Document Location
Worldwide
Document Information
Modified date:
05 May 2026
UID
ibm17249971