Finding orphan and leaky objects

A healthy storage cluster has no orphan or leaky objects, but in some cases they can occur.

An orphan object exists in the storage cluster and has an object ID associated with the RADOS object. However, the bucket index holds no reference that links the RADOS object to an S3 object.

For example, if the Ceph Object Gateway goes down in the middle of an operation, this can cause some objects to become orphans. Also, an undiscovered bug can cause orphan objects to occur.

You can see how the Ceph Object Gateway objects map to the RADOS objects. The radosgw-admin command provides a tool to search for and list these potential orphan or leaky objects. The radoslist subcommand displays the RADOS objects stored within a bucket, or within all buckets in the storage cluster. The rgw-orphan-list script displays orphan objects within a pool.
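
The comparison behind an orphan search can be pictured as a set difference: names that appear in the pool listing but not in the gateway listing are candidates. The following is a minimal sketch under that assumption, not the rgw-orphan-list implementation. The helper name find_unreferenced is hypothetical, and both input files are assumed to hold one object name per line.

```shell
# Sketch only: find_unreferenced is a hypothetical helper, not a Ceph command.
# It prints names found in the pool listing but missing from the gateway listing.
find_unreferenced() {
    pool_listing=$1      # for example, saved output of: rados -p POOL_NAME ls
    gateway_listing=$2   # for example, saved output of: radosgw-admin bucket radoslist
    tmpdir=$(mktemp -d) || return 1
    # comm requires both inputs to be sorted in the same collation order.
    LC_ALL=C sort -u "$pool_listing" > "$tmpdir/pool.sorted"
    LC_ALL=C sort -u "$gateway_listing" > "$tmpdir/rgw.sorted"
    # -23 hides lines unique to the second file and lines common to both,
    # leaving only names that appear in the pool listing alone.
    LC_ALL=C comm -23 "$tmpdir/pool.sorted" "$tmpdir/rgw.sorted"
    rm -rf "$tmpdir"
}
```

For example, with the two listings saved as pool.txt and rgw.txt, running find_unreferenced pool.txt rgw.txt prints the candidate names. Treat the result as candidates to review, not as confirmed orphans.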

Note: The radoslist subcommand replaces the deprecated orphans find and orphans finish subcommands.
Important: Do not search for orphans in clusters where indexless buckets are in use, because all objects in those buckets appear as orphans.
Important: An alternative way to identify orphaned objects is to run the rados -p POOL_NAME ls | grep BUCKET_ID command.
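
The rados and grep alternative can be sketched as two small filters. This is an illustration under stated assumptions, not a supported tool: both helper names are hypothetical, and the first one assumes that radosgw-admin bucket stats prints pretty-printed JSON with an "id" key on its own line.

```shell
# Hypothetical helpers; neither name is part of Ceph.

# Read "radosgw-admin bucket stats" JSON on stdin and print the first "id" value.
# Assumes pretty-printed output with one key per line.
bucket_id_from_stats() {
    sed -n 's/^[[:space:]]*"id":[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Read object names on stdin and keep those that contain the given bucket ID.
# -F treats the ID as a fixed string, so its dots are not regex wildcards.
objects_for_bucket_id() {
    grep -F -e "$1"
}

# Possible usage on a cluster node (not executed here):
#   id=$(radosgw-admin bucket stats --bucket mybucket | bucket_id_from_stats)
#   rados -p default.rgw.buckets.data ls | objects_for_bucket_id "$id"
```

The output is every RADOS object whose name contains the bucket ID. Deciding which of those objects are orphans still requires comparing them against what the bucket references.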

Prerequisites

  • A running IBM Storage Ceph cluster.

  • A running Ceph Object Gateway.

Procedure

  1. Generate a list of objects that hold data within a bucket.

    Syntax

    radosgw-admin bucket radoslist --bucket BUCKET_NAME

    Example

    [root@host01 ~]# radosgw-admin bucket radoslist --bucket mybucket
    Note: If the BUCKET_NAME is omitted, then all objects in all buckets are displayed.
  2. Check the version of rgw-orphan-list.

    Example

    [root@host01 ~]# head /usr/bin/rgw-orphan-list

    The version should be 2023-01-11 or newer.

  3. Create a directory in which to generate the list of orphans.

    Example

     [root@host01 ~]# mkdir orphans
  4. Navigate to the directory created earlier.

    Example

     [root@host01 ~]# cd orphans
  5. Run the rgw-orphan-list script and, from the pool list that it displays, select the pool in which you want to find orphans. The script might run for a long time, depending on the number of objects in the cluster.

    Example

     [root@host01 orphans]# rgw-orphan-list

    Example

     Available pools:
         .rgw.root
         default.rgw.control
         default.rgw.meta
         default.rgw.log
         default.rgw.buckets.index
         default.rgw.buckets.data
         rbd
         default.rgw.buckets.non-ec
         ma.rgw.control
         ma.rgw.meta
         ma.rgw.log
         ma.rgw.buckets.index
         ma.rgw.buckets.data
         ma.rgw.buckets.non-ec
     Which pool do you want to search for orphans?

    Enter the pool name to search for orphans.

    Important: Specify a data pool, not a metadata pool, when using the rgw-orphan-list command.
  6. View the usage details of the rgw-orphan-list tool.

    Syntax

     rgw-orphan-list -h
     rgw-orphan-list POOL_NAME /DIRECTORY

    Example

     [root@host01 orphans]# rgw-orphan-list default.rgw.buckets.data /orphans
    
     2023-09-12 08:41:14 ceph-host01 Computing delta...
     2023-09-12 08:41:14 ceph-host01 Computing results...
     10 potential orphans found out of a possible 2412 (0%).         <<<<<<< orphans detected
     The results can be found in './orphan-list-20230912124113.out'.
         Intermediate files are './rados-20230912124113.intermediate' and './radosgw-admin-20230912124113.intermediate'.
     ***
     *** WARNING: This is EXPERIMENTAL code and the results should be used
     ***          only with CAUTION!
     ***
     Done at 2023-09-12 08:41:14.
  7. Run the ls -l command and verify that the files ending in .error are zero length, which indicates that the script ran without any issues.

    Example

     [root@host01 orphans]# ls -l
    
     -rw-r--r--. 1 root root    770 Sep 12 03:59 orphan-list-20230912075939.out
     -rw-r--r--. 1 root root      0 Sep 12 03:59 rados-20230912075939.error 
     -rw-r--r--. 1 root root 248508 Sep 12 03:59 rados-20230912075939.intermediate
     -rw-r--r--. 1 root root      0 Sep 12 03:59 rados-20230912075939.issues
     -rw-r--r--. 1 root root      0 Sep 12 03:59 radosgw-admin-20230912075939.error
     -rw-r--r--. 1 root root 247738 Sep 12 03:59 radosgw-admin-20230912075939.intermediate
  8. Review the orphan objects listed.

    Example

     [root@host01 orphans]# cat ./orphan-list-20230912124113.out
    
     a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.0
     a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.1
     a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.2
     a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.3
     a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.4
     a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.5
     a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.6
     a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.7
     a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.8
     a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.9
  9. Remove the orphan objects.

    Syntax

     rados -p POOL_NAME rm OBJECT_NAME

    Example

     [root@host01 orphans]# rados -p default.rgw.buckets.data rm myobject
    Warning: Verify that you are removing the correct objects. Running the rados rm command removes data from the storage cluster.
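
The checks in steps 7 to 9 can be sketched as two helpers: one confirms that the .error files are empty, and one prints the removal commands for review without running them. The helper names error_files_empty and print_removals are hypothetical. The sketch also assumes one object name per line in the orphan list and names that contain no single quotes.

```shell
# Hypothetical helpers for steps 7 to 9; the names are illustrative only.

# Succeed only if no *.error file in the given directory has content.
error_files_empty() {
    dir=$1
    status=0
    for f in "$dir"/*.error; do
        [ -e "$f" ] || continue        # the glob matched nothing
        if [ -s "$f" ]; then           # -s: file exists and is not empty
            echo "non-empty error file: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Print one removal command per listed object for review; nothing is executed.
# Assumes that object names contain no single quotes.
print_removals() {
    pool=$1
    list=$2
    while IFS= read -r obj; do
        [ -n "$obj" ] || continue      # skip blank lines
        printf "rados -p %s rm '%s'\n" "$pool" "$obj"
    done < "$list"
}
```

For example, error_files_empty . followed by print_removals default.rgw.buckets.data ./orphan-list-20230912124113.out prints the commands so that each object name can be checked before anything is removed.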