Inconsistent placement groups

Understand and troubleshoot inconsistent placement groups (PGs).

Some placement groups are marked as active+clean+inconsistent, and the ceph health detail command returns an error message similar to the following example:
HEALTH_ERR 1 pgs inconsistent; 2 scrub errors
pg 0.6 is active+clean+inconsistent, acting [0,1,2]
2 scrub errors
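
When several placement groups are affected, their IDs can be picked out of this output with a small filter. The following is a minimal sketch rather than part of Ceph: inconsistent_pgs is a hypothetical helper name, and the filter assumes that the health detail lines keep the "pg ID is STATE" layout shown in the example above.

```shell
# Hypothetical helper (not a Ceph command): read health detail text on
# stdin and print the ID of every placement group whose state string
# contains "inconsistent".
inconsistent_pgs() {
    awk '$1 == "pg" && $3 == "is" && $4 ~ /inconsistent/ { print $2 }'
}

# Sample text copied from the example above. On a live cluster, pipe the
# output of `ceph health detail` into the function instead.
inconsistent_pgs <<'EOF'
HEALTH_ERR 1 pgs inconsistent; 2 scrub errors
pg 0.6 is active+clean+inconsistent, acting [0,1,2]
2 scrub errors
EOF
```

Matching on the state field rather than on the whole line keeps the summary line, which also contains the word inconsistent, out of the result.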

What this means

When Ceph detects inconsistencies in one or more replicas of an object in a placement group, it marks the placement group as inconsistent. The most common inconsistencies are:
  • Objects have an incorrect size.
  • Objects are missing from one replica after a recovery finished.

In most cases, errors during scrubbing cause inconsistency within placement groups.

Troubleshooting this problem

  1. Log in to the Cephadm shell.
    cephadm shell
  2. Use the ceph health detail command to determine which placement group is in the inconsistent state.
    [ceph: root@host01 /]# ceph health detail
    HEALTH_ERR 1 pgs inconsistent; 2 scrub errors
    pg 0.6 is active+clean+inconsistent, acting [0,1,2]
    2 scrub errors
  3. Determine why the placement group is inconsistent.
    1. Start the deep scrubbing process on the placement group.
      ceph pg deep-scrub ID
      Replace ID with the ID of the inconsistent placement group.
      For example,
      [ceph: root@host01 /]# ceph pg deep-scrub 0.6
      instructing pg 0.6 on osd.0 to deep-scrub
    2. Use the ceph -w command to find messages related to the inconsistent placement group.
      ceph -w | grep ID
      Replace ID with the ID of the inconsistent placement group.
      For example,
      [ceph: root@host01 /]# ceph -w | grep 0.6
      2022-05-26 01:35:36.778215 osd.106 [ERR] 0.6 deep-scrub stat mismatch, got 636/635 objects, 0/0 clones, 0/0 dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts, 1855455/1854371 bytes.
      2022-05-26 01:35:36.788334 osd.106 [ERR] 0.6 deep-scrub 1 errors
  4. If the output includes any error messages similar to the following ones, you can repair the inconsistent placement group.

    Syntax

    PG.ID shard OSD: soid OBJECT missing attr , missing attr ATTRIBUTE_TYPE
    PG.ID shard OSD: soid OBJECT digest 0 != known digest DIGEST, size 0 != known size SIZE
    PG.ID shard OSD: soid OBJECT size 0 != known size SIZE
    PG.ID deep-scrub stat mismatch, got MISMATCH
    PG.ID shard OSD: soid OBJECT candidate had a read error, digest 0 != known digest DIGEST
  • If the output includes any error messages similar to the following ones, it is not safe to repair the inconsistent placement group because you can lose data.
    PG.ID shard OSD: soid OBJECT digest DIGEST != known digest DIGEST
    PG.ID shard OSD: soid OBJECT omap_digest DIGEST != known omap_digest DIGEST
    If this occurs, do not repair the placement group. Open a support ticket with IBM Support.
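
The two groups of message patterns above can be checked mechanically before deciding whether a repair is safe. The following is a minimal sketch under stated assumptions: classify_scrub_line and the labels repairable, unsafe, and unknown are hypothetical names, and the patterns are taken only from the message templates listed above, so the exact wording on a given Ceph release might differ.

```shell
# Hypothetical helper (not a Ceph command): label one scrub log line
# according to the message templates listed in this topic.
# Order matters: the omap_digest and non-zero digest checks come before
# the generic "repairable" patterns so that a line matching both is
# treated as unsafe.
classify_scrub_line() {
    case "$1" in
        *" omap_digest "*" != known omap_digest "*) echo "unsafe" ;;
        *" digest 0 != known digest "*)             echo "repairable" ;;
        *" digest "*" != known digest "*)           echo "unsafe" ;;
        *"missing attr"*)                           echo "repairable" ;;
        *" size 0 != known size "*)                 echo "repairable" ;;
        *"stat mismatch"*)                          echo "repairable" ;;
        *"candidate had a read error"*)             echo "repairable" ;;
        *)                                          echo "unknown" ;;
    esac
}

# Sample lines copied from the ceph -w example above. On a live cluster,
# feed the loop from `ceph -w | grep ID` instead of the here-document.
while IFS= read -r line; do
    printf '%s: %s\n' "$(classify_scrub_line "$line")" "$line"
done <<'EOF'
2022-05-26 01:35:36.778215 osd.106 [ERR] 0.6 deep-scrub stat mismatch, got 636/635 objects, 0/0 clones, 0/0 dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts, 1855455/1854371 bytes.
2022-05-26 01:35:36.788334 osd.106 [ERR] 0.6 deep-scrub 1 errors
EOF
```

A line labeled unknown matches none of the listed templates and needs manual review. Treat any line labeled unsafe as a reason to stop and contact support, even if other lines for the same placement group are labeled repairable.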