Storage classes

Storage classes define how object data is placed and managed in Ceph Object Gateway (RGW). They map objects to specific placement targets and support cost- and performance-optimized data tiering, especially when used with S3 Bucket Lifecycle transitions.

All placement targets include a STANDARD storage class, which is applied to new objects by default. Users can override this default by setting the default_storage_class value. To store an object in a non-default storage class, include the storage class name in the request header that matches the protocol in use:

S3 protocol: X-Amz-Storage-Class
Swift protocol: X-Object-Storage-Class
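
As a minimal sketch of the header selection described above, the following Python helper returns the header to attach to an upload request. The two header names come from the list above; the function name, the protocol keys "s3" and "swift", and the example class STANDARD_IA are illustrative assumptions, not part of any Ceph API.

```python
# Header names are taken from the list above; everything else is illustrative.
STORAGE_CLASS_HEADERS = {
    "s3": "X-Amz-Storage-Class",
    "swift": "X-Object-Storage-Class",
}


def storage_class_header(protocol: str, storage_class: str) -> dict:
    """Return the request header that selects a storage class for an upload.

    `protocol` is a hypothetical key ("s3" or "swift") used only in this sketch.
    """
    try:
        header_name = STORAGE_CLASS_HEADERS[protocol.lower()]
    except KeyError:
        raise ValueError(f"unknown protocol: {protocol!r}") from None
    return {header_name: storage_class}


if __name__ == "__main__":
    # Merge the returned dict into the headers of the object upload request.
    print(storage_class_header("s3", "STANDARD_IA"))
    print(storage_class_header("swift", "STANDARD_IA"))
```

Omitting the header leaves the object in the default storage class of the placement target.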

S3 Object Lifecycle Management can then be used to move object data between storage classes using Transition actions.

When using AWS S3 SDKs such as boto3, storage class names must match those provided by AWS S3; otherwise the SDK drops the request and raises an exception. Moreover, some S3 clients and libraries expect AWS-specific behavior when a storage class named or prefixed with GLACIER is used, and therefore fail when accessing Ceph Object Gateway services. For this reason, we advise using other storage class names with Ceph, such as INTELLIGENT_TIERING, STANDARD_IA, REDUCED_REDUNDANCY, and ONEZONE_IA. Custom storage class names such as CHEAPNDEEP are accepted by Ceph but might be rejected by some clients and libraries.
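
The naming guidance above can be sketched as a small pre-flight check to run before creating a storage class. This is an illustration only: the function name and the three result labels are assumptions, and the set holds just STANDARD plus the names recommended in the text, not the full AWS list. INTELLIGENT_TIERING is written with an underscore, which is how AWS S3 spells it.

```python
# STANDARD plus the names recommended above; deliberately not the full AWS list.
RECOMMENDED_NAMES = {
    "STANDARD",
    "INTELLIGENT_TIERING",
    "STANDARD_IA",
    "REDUCED_REDUNDANCY",
    "ONEZONE_IA",
}


def classify_storage_class_name(name: str) -> str:
    """Classify a candidate storage class name following the guidance above.

    Returns one of three labels invented for this sketch:
      "avoid"        the name is GLACIER or starts with GLACIER
      "recommended"  the name is in RECOMMENDED_NAMES
      "custom"       accepted by Ceph, but some clients may reject it
    """
    if name.startswith("GLACIER"):
        return "avoid"
    if name in RECOMMENDED_NAMES:
        return "recommended"
    return "custom"


if __name__ == "__main__":
    for candidate in ("STANDARD_IA", "GLACIER_IR", "CHEAPNDEEP"):
        print(candidate, "->", classify_storage_class_name(candidate))
```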

Use cases

Storage classes are commonly used in the following scenarios to optimize data placement, cost, and performance:

Moving infrequently accessed objects to low-cost pools using automated lifecycle transitions.
Assigning latency-sensitive or frequently accessed workloads to faster pools (for example, NVMe-backed pools).
Creating custom storage classes for compliance, isolation, or application-specific data placement (for example, APP_LOGS, ML_DATA).
Automating multi-tier lifecycle transitions, such as STANDARD → STANDARD_IA → archival pool, based on age or access patterns.
Applying different durability or resiliency profiles by mapping storage classes to pools with varied replication or erasure coding.
Separating workloads such as analytics, logging, or backup data into pools with optimized compression or cost models.
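
The multi-tier transition use case can be sketched as an S3 lifecycle rule with several Transition actions. The example below only builds the configuration dictionary, in the shape that boto3's put_bucket_lifecycle_configuration accepts. The helper name, rule ID, "logs/" prefix, day counts, and the choice of ONEZONE_IA as the class mapped to an archival pool are all assumptions made for illustration.

```python
def tiering_rule(rule_id: str, prefix: str, steps: list) -> dict:
    """Build one lifecycle rule that moves objects through several storage classes.

    `steps` is a list of (days_after_creation, storage_class) pairs. The day
    counts must be non-negative and strictly increasing, so each tier is
    reached later than the one before it.
    """
    days = [d for d, _ in steps]
    if not steps or days[0] < 0 or days != sorted(set(days)):
        raise ValueError("steps must be non-empty with strictly increasing days")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": d, "StorageClass": sc} for d, sc in steps],
    }


# Hypothetical policy: STANDARD -> STANDARD_IA after 30 days, then on to a
# class assumed to be mapped to an archival pool (ONEZONE_IA) after 90 days.
lifecycle_config = {
    "Rules": [
        tiering_rule("tier-logs", "logs/", [(30, "STANDARD_IA"), (90, "ONEZONE_IA")])
    ]
}

if __name__ == "__main__":
    print(lifecycle_config)
    # With boto3 installed and a client `s3` pointed at the gateway endpoint,
    # the dictionary could be applied to a bucket (name is a placeholder) with:
    #   s3.put_bucket_lifecycle_configuration(
    #       Bucket="my-bucket", LifecycleConfiguration=lifecycle_config)
```

Each storage class named in a transition must already exist in the bucket's placement target, and the names follow the client compatibility guidance given earlier.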