Monitoring placement group sets

Learn how to monitor placement group sets.

When CRUSH assigns placement groups to Ceph OSDs, it looks at the number of replicas that the pool requires and assigns each replica of a placement group to a different Ceph OSD. For example, if the pool requires three replicas of a placement group, CRUSH might assign them to osd.1, osd.2, and osd.3. In practice, CRUSH seeks a pseudo-random placement that takes into account the failure domains set in the CRUSH map, so in a large cluster you rarely see placement groups assigned to nearest-neighbor Ceph OSDs.

The set of Ceph OSDs that should contain the replicas of a particular placement group is called the Acting Set. In some cases, an OSD in the Acting Set is down or otherwise unable to service requests for objects in the placement group. Such situations are usually not a cause for alarm. Common examples include:

  • You added or removed an OSD, and CRUSH reassigned the placement group to other Ceph OSDs. This changed the composition of the Acting Set and triggered data migration through the "backfill" process.

  • A Ceph OSD was down, was restarted, and is now recovering.

  • A Ceph OSD in the Acting Set is down or unable to service requests, and another Ceph OSD has temporarily assumed its duties.

Ceph processes client requests by using the Up Set, which is the set of Ceph OSDs that actually handle the requests. In most cases, the Up Set and the Acting Set are identical. When they are not, it can indicate that Ceph is migrating data, that a Ceph OSD is recovering, or that there is a problem. In such scenarios, Ceph usually reports a HEALTH_WARN state with a "stuck stale" message.
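The comparison between the Up Set and the Acting Set can be sketched in a few lines of shell. The function below is a hypothetical helper, not part of Ceph. It assumes that a line of ceph pg map output lists the two sets in square brackets after the words "up" and "acting"; the exact output format can differ between releases, and the sample PG ID 1.6c is only an illustrative value.

```bash
#!/usr/bin/env bash
# Sketch: compare the Up Set and Acting Set found in one line of
# `ceph pg map` output. Assumed (not guaranteed) line format:
#   osdmap e13 pg 1.6c (1.6c) -> up [1,0] acting [1,0]

compare_pg_sets() {
    local line="$1"
    # Capture the bracketed OSD list after "up" and the one after "acting".
    local re='up[^[]*\[([^]]*)].*acting[^[]*\[([^]]*)]'
    if [[ ! "$line" =~ $re ]]; then
        echo "unparsed"
        return 2
    fi
    local up="${BASH_REMATCH[1]}" acting="${BASH_REMATCH[2]}"
    # Compare as strings: the order matters because the primary OSD is listed first.
    if [[ "$up" == "$acting" ]]; then
        echo "match"
    else
        echo "differ"
        return 1
    fi
}

# Possible usage against a live cluster (PG_ID is a placeholder):
#   compare_pg_sets "$(ceph pg map PG_ID)"
```

A result of "differ" is not necessarily a fault. As described above, it can simply mean that data is being migrated or that an OSD is recovering.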

Prerequisites

  • A running IBM Storage Ceph cluster.

  • Root-level access to the node.

Procedure

  1. Log in to the cephadm shell:

    Example

    [root@host01 ~]# cephadm shell
  2. Retrieve a list of placement groups:

    Example

    [ceph: root@host01 /]# ceph pg dump
  3. View which Ceph OSDs are in the Acting Set or in the Up Set for a given placement group:

    Syntax

    ceph pg map PG_NUM

    Example

    [ceph: root@host01 /]# ceph pg map 128

    Note: If the Up Set and Acting Set do not match, the storage cluster might be rebalancing itself, or there might be a problem with the storage cluster.
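To check every placement group at once rather than one at a time, the brief form of the dump can be filtered for rows where the two sets differ. The following is a minimal sketch, assuming that ceph pg dump pgs_brief prints whitespace-separated columns in the order PG_STAT, STATE, UP, UP_PRIMARY, ACTING, ACTING_PRIMARY; verify the column order on your release before relying on it. The function name list_mismatched_pgs is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch: list placement groups whose Up Set differs from their Acting Set.
# Reads pgs_brief-style text on standard input. Assumed column order:
#   PG_STAT STATE UP UP_PRIMARY ACTING ACTING_PRIMARY

list_mismatched_pgs() {
    # Keep only rows whose first field looks like a PG ID (POOL.HEX),
    # which skips the header line, then compare column 3 (UP) with column 5 (ACTING).
    awk '$1 ~ /^[0-9]+\.[0-9a-f]+$/ && $3 != $5 {
        print $1, "up=" $3, "acting=" $5
    }'
}

# Possible usage inside the cephadm shell:
#   ceph pg dump pgs_brief | list_mismatched_pgs
```

An empty result suggests that the Up Set and Acting Set agree for every placement group in the dump.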