BlueStore server-side compression

BlueStore applies compression based on pool settings, the configured compression mode, and object characteristics. Each blob (data fragment) is evaluated individually.

Object characteristics determine which blob size BlueStore uses, not whether compression is enabled:

If an object is immutable or append-only, is expected to have sequential reads, and is not expected to have random reads or writes, BlueStore uses compression_max_blob_size.
Otherwise, for example when the object is expected to have random reads or writes, BlueStore uses compression_min_blob_size.
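
The selection rule can be sketched in a few lines of Python. This is a minimal illustration, not BlueStore code: the AllocHint names are hypothetical stand-ins for the allocation hints a client sends, the two sizes are illustrative values rather than defaults, and the sketch assumes that the larger blob size is chosen only when an object is immutable or append-only, is read sequentially, and carries no random-access hint.

```python
from enum import Flag, auto


class AllocHint(Flag):
    """Hypothetical stand-ins for the allocation hints a client can send."""
    NONE = 0
    SEQUENTIAL_READ = auto()
    RANDOM_READ = auto()
    RANDOM_WRITE = auto()
    APPEND_ONLY = auto()
    IMMUTABLE = auto()


# Illustrative values only; a real cluster takes these from its configuration.
COMPRESSION_MIN_BLOB_SIZE = 8 * 1024
COMPRESSION_MAX_BLOB_SIZE = 64 * 1024


def select_blob_size(hints: AllocHint) -> int:
    """Pick one of the two blob sizes from the hints (assumed rule, see lead-in)."""
    write_once = bool(hints & (AllocHint.IMMUTABLE | AllocHint.APPEND_ONLY))
    sequential = bool(hints & AllocHint.SEQUENTIAL_READ)
    random_io = bool(hints & (AllocHint.RANDOM_READ | AllocHint.RANDOM_WRITE))
    if write_once and sequential and not random_io:
        return COMPRESSION_MAX_BLOB_SIZE
    return COMPRESSION_MIN_BLOB_SIZE


# An immutable object that is read sequentially gets the larger blob size.
print(select_blob_size(AllocHint.IMMUTABLE | AllocHint.SEQUENTIAL_READ))
# An object with no hints falls back to the smaller blob size.
print(select_blob_size(AllocHint.NONE))
```

The function returns only one of two values, which reflects that the choice is between two configured sizes rather than a point on a continuous range.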

Benefits

BlueStore server-side compression provides the following benefits:

Store more data without adding physical capacity.
Reduce raw disk usage and infrastructure costs.
Enable compression selectively per pool to balance performance and efficiency.

Considerations

Compression introduces CPU overhead and may slightly increase read and write latency. In some cases, small overwrites on compressed objects can increase space usage because overlapping data fragments require both the old and new blobs to be retained. Enable compression only on pools or workloads where data-reduction benefits outweigh performance costs, and avoid enabling it cluster-wide without testing workload patterns.

How compression works

Pool settings and the compression mode determine whether BlueStore attempts compression, and each blob (data fragment) is evaluated individually. Object characteristics determine only the blob size. When an object is marked as immutable or append-only, is expected to have sequential reads, and is not expected to have random reads or writes, BlueStore uses the compression_max_blob_size parameter; otherwise, it uses compression_min_blob_size. Blob size determines how data is divided and aligned before compression. Larger blobs generally improve compression efficiency, while smaller blobs improve read performance because less data must be decompressed.
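
The blob-size trade-off can be observed outside Ceph with a short Python experiment. The sketch below is illustrative only: it compresses the same synthetic, log-like data in independent chunks of two sizes by using the zlib module from the Python standard library. The sample data, the chunk sizes, and the compressed_total helper are assumptions made for this example and do not model BlueStore internals.

```python
import zlib


def compressed_total(data: bytes, blob_size: int) -> int:
    """Compress data in independent blob_size chunks and sum the compressed sizes."""
    return sum(
        len(zlib.compress(data[offset:offset + blob_size]))
        for offset in range(0, len(data), blob_size)
    )


# Synthetic, log-like sample data (an assumption chosen for illustration).
sample = b"".join(
    b"ts=%06d level=INFO msg=request served\n" % i for i in range(8192)
)

small_blobs = compressed_total(sample, 4 * 1024)
large_blobs = compressed_total(sample, 64 * 1024)

print(f"original:    {len(sample)} bytes")
print(f"4 KiB blobs:  {small_blobs} bytes after compression")
print(f"64 KiB blobs: {large_blobs} bytes after compression")
```

With repetitive data such as this, the larger chunks typically compress to a smaller total because each chunk carries its own compression overhead and cannot reuse patterns from neighboring chunks. The cost is on the read side: returning a few bytes from a 64 KiB chunk requires decompressing the whole chunk.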

Figure 1 shows how BlueStore selects a blob size based on object properties.

Figure 1. BlueStore compression workflow

BlueStore does not use a continuous size range; instead, it selects one of the two blob sizes based on allocation hints sent through the rados_set_alloc_hint2 API call. Ceph clients such as RBD, CephFS, and RGW already use these hints to optimize performance.

Compression behavior

BlueStore compresses each write operation according to its configured mode, ratio threshold, and blob-size limits. In none mode, compression is never applied. In passive mode, compression occurs only if the client provides a compressible hint. In aggressive mode, BlueStore compresses unless the client marks data as incompressible. In force mode, BlueStore always attempts to compress data.
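
The four modes reduce to a small decision function. The following Python sketch restates the rules in code; the Mode and Hint types and the should_attempt_compression function are hypothetical names introduced for illustration and are not part of any Ceph API.

```python
from enum import Enum


class Mode(Enum):
    """The four compression modes described in the text."""
    NONE = "none"
    PASSIVE = "passive"
    AGGRESSIVE = "aggressive"
    FORCE = "force"


class Hint(Enum):
    """Hypothetical representation of the client's hint about the data."""
    UNSPECIFIED = "unspecified"
    COMPRESSIBLE = "compressible"
    INCOMPRESSIBLE = "incompressible"


def should_attempt_compression(mode: Mode, hint: Hint) -> bool:
    """Return True if a write should be passed to the compressor."""
    if mode is Mode.NONE:
        return False
    if mode is Mode.PASSIVE:
        # Compress only when the client says the data is compressible.
        return hint is Hint.COMPRESSIBLE
    if mode is Mode.AGGRESSIVE:
        # Compress unless the client says the data is incompressible.
        return hint is not Hint.INCOMPRESSIBLE
    # Mode.FORCE: always attempt compression, whatever the hint says.
    return True


# Print the full decision table for every mode and hint combination.
for mode in Mode:
    for hint in Hint:
        print(f"{mode.value:<10} {hint.value:<14} {should_attempt_compression(mode, hint)}")
```

The practical difference between passive and aggressive mode is the treatment of writes that carry no hint: passive mode skips them, and aggressive mode compresses them.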

The compression_required_ratio parameter sets the largest acceptable ratio of compressed size to original size. For example, if it is set to 0.7, BlueStore stores the compressed version only when it is no larger than 70 percent of the original size. If the threshold is not met, the data is stored uncompressed. This check is performed for each blob individually to prevent unnecessary CPU usage and space amplification.

Blob size is governed by compression_min_blob_size and compression_max_blob_size, and BlueStore supports the snappy, zlib, lz4, and zstd algorithms. When configured appropriately, BlueStore balances capacity savings with performance efficiency across IBM Storage Ceph clusters.