Identifying stuck placement groups
A placement group is not necessarily problematic just because it is not in an active+clean state. However, when placement groups get stuck, Ceph's ability to self-repair might not be working.
About this task
The stuck states include unclean, inactive, and stale.
- Unclean
- Placement groups contain objects that are not replicated the desired number of times. They should be recovering.
- Inactive
- Placement groups cannot process reads or writes because they are waiting for an OSD with the most up-to-date data to come back up.
- Stale
- Placement groups are in an unknown state because the OSDs that host them have not reported to the monitor cluster for a period of time. This period can be configured with the mon osd report timeout setting.
Procedure
Identify the stuck placement groups by running the ceph pg dump_stuck command.
ceph pg dump_stuck {inactive|unclean|stale|undersized|degraded [inactive|unclean|stale|undersized|degraded...]} {INT}
For example,
[ceph: root@host01 /]# ceph pg dump_stuck stale
OK
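To check all three stuck states described in this topic in one pass, the command can be wrapped in a small shell function. The following is a minimal sketch, not part of Ceph: the function name check_stuck_pgs is hypothetical, and it assumes that the ceph CLI is on the PATH with access to the cluster, for example inside the cephadm shell.

```shell
# Sketch: run `ceph pg dump_stuck` once for each stuck state named in this topic.
# check_stuck_pgs is a hypothetical helper name, not a Ceph command.
check_stuck_pgs() {
    for state in inactive unclean stale; do
        # Print a header so the output for each state is easy to tell apart.
        printf '== %s ==\n' "$state"
        # Assumes a `ceph` CLI that can reach the cluster.
        ceph pg dump_stuck "$state"
    done
}
```

Calling check_stuck_pgs prints one header line per state, followed by whatever ceph pg dump_stuck reports for that state.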