Enabling compression
The Ceph Object Gateway supports server-side compression of uploaded objects using any of Ceph’s compression plugins. These include:
- zlib: Supported.
- snappy: Supported.
- zstd: Supported.
Configuration
To enable compression on a zone’s placement target, provide the
--compression=TYPE option to the radosgw-admin zone placement
modify command. The compression TYPE refers to the name of the compression
plugin to use when writing new object data.
Each compressed object stores the compression type. Changing the setting does not hinder the ability to decompress existing compressed objects, nor does it force the Ceph Object Gateway to re-compress existing objects.
This compression setting applies to all new objects uploaded to buckets using this placement target.
To disable compression on a zone’s placement target, provide the
--compression=TYPE option to the radosgw-admin zone placement
modify command and specify an empty string or none as the TYPE.
Example
[root@host01 ~]# radosgw-admin zone placement modify --rgw-zone=default --placement-id=default-placement --compression=zlib
{
...
"placement_pools": [
{
"key": "default-placement",
"val": {
"index_pool": "default.rgw.buckets.index",
"data_pool": "default.rgw.buckets.data",
"data_extra_pool": "default.rgw.buckets.non-ec",
"index_type": 0,
"compression": "zlib"
}
}
],
...
}
After enabling or disabling compression, restart the Ceph Object Gateway instance so the change will take effect.
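The enable and disable invocations described above can also be scripted. The following is a minimal sketch, not an official tool: the helper names placement_compression_cmd and run are hypothetical, and it uses only the options shown in this section (--rgw-zone, --placement-id, --compression). It assumes radosgw-admin is on the PATH of a host with admin access.

```python
import subprocess


def placement_compression_cmd(zone, placement_id, compression):
    """Build the argument list for 'radosgw-admin zone placement modify'.

    'compression' is a plugin name such as "zlib", or "none" to disable.
    Hypothetical helper: only the flags shown in this section are used.
    """
    return [
        "radosgw-admin", "zone", "placement", "modify",
        f"--rgw-zone={zone}",
        f"--placement-id={placement_id}",
        f"--compression={compression}",
    ]


def run(cmd):
    """Run the command and return its JSON output as text.

    Assumes radosgw-admin is installed; raises CalledProcessError on failure.
    """
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


# Build (but do not execute) the enable and disable commands.
enable_cmd = placement_compression_cmd("default", "default-placement", "zlib")
disable_cmd = placement_compression_cmd("default", "default-placement", "none")

print(" ".join(enable_cmd))
print(" ".join(disable_cmd))
```

Passing either list to run() would apply the change. The Ceph Object Gateway instance still needs to be restarted afterward.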
Note: The Ceph Object Gateway creates a default zone and a set of pools. For production
deployments, see Creating a realm.
Statistics
While all existing commands and APIs continue to report object and bucket sizes based on their
uncompressed data, the radosgw-admin bucket stats command includes compression
statistics for all buckets.
The usage categories reported by the radosgw-admin bucket stats command are:
- rgw.main refers to regular entries or objects.
- rgw.multimeta refers to the metadata of incomplete multipart uploads.
- rgw.cloudtiered refers to objects that a lifecycle policy has transitioned to a cloud tier. When configured with retain_head_object=true, a head object is left behind that no longer contains data, but can still serve the object's metadata with HeadObject requests. These stub head objects use the rgw.cloudtiered category. See Transitioning data to Amazon S3 cloud service for more information.
Syntax
radosgw-admin bucket stats --bucket=BUCKET_NAME
{
...
"usage": {
"rgw.main": {
"size": 1075028,
"size_actual": 1331200,
"size_utilized": 592035,
"size_kb": 1050,
"size_kb_actual": 1300,
"size_kb_utilized": 579,
"num_objects": 104
}
},
...
}
The size is the accumulated size of the objects in the bucket, uncompressed and
unencrypted. The size_kb is the accumulated size in kilobytes and is calculated as
ceiling(size/1024). In this example, it is ceiling(1075028/1024) =
1050.
The size_actual is the accumulated size of all the objects after each object is
distributed in a set of 4096-byte blocks. If a bucket has two objects, one of size 4100 bytes and
the other of 8500 bytes, the first object is rounded up to 8192 bytes, and the second one is rounded up to
12288 bytes, and their total for the bucket is 20480 bytes. The size_kb_actual is
the actual size in kilobytes and is calculated as size_actual/1024. In this
example, it is 1331200/1024 = 1300.
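The block rounding behind size_actual can be reproduced with a few lines of arithmetic. This is an illustrative sketch only: the function name round_up_to_block is hypothetical, and the 4096-byte block size is taken from the description above.

```python
BLOCK_SIZE = 4096  # block size in bytes, as described for size_actual


def round_up_to_block(size_bytes, block_size=BLOCK_SIZE):
    """Round a size up to the next multiple of block_size using integer math."""
    blocks = -(-size_bytes // block_size)  # ceiling division
    return blocks * block_size


# The two-object bucket from the text above.
object_sizes = [4100, 8500]
rounded = [round_up_to_block(s) for s in object_sizes]
size_actual = sum(rounded)

print(rounded)      # [8192, 12288]
print(size_actual)  # 20480
```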
The size_utilized is the total size of the data in bytes after it has been
compressed and/or encrypted. Encryption could increase the size of the object while compression
could decrease it. The size_kb_utilized is the total size in kilobytes and is
calculated as ceiling(size_utilized/1024). In this example, it is
ceiling(592035/1024) = 579.
Here, all the sizes in kilobytes are rounded up with the ceiling function.
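As a cross-check, the kilobyte values in the example output can be recomputed from the byte counts. The sketch below copies the rgw.main values from the example. The helper to_kb and the utilized-to-logical ratio are illustrative additions, not fields reported by radosgw-admin, and because size_utilized reflects compression and/or encryption, the ratio is only a rough indicator of compression savings.

```python
# Byte counts copied from the rgw.main entry in the example output.
usage = {
    "size": 1075028,
    "size_actual": 1331200,
    "size_utilized": 592035,
}


def to_kb(size_bytes):
    """Convert bytes to kilobytes, rounding up (ceiling) with integer math."""
    return -(-size_bytes // 1024)


size_kb = to_kb(usage["size"])                    # 1050
size_kb_actual = to_kb(usage["size_actual"])      # 1300 (exact multiple of 1024)
size_kb_utilized = to_kb(usage["size_utilized"])  # 579

# Illustrative ratio of stored bytes to logical bytes (roughly 0.55 here).
utilized_ratio = usage["size_utilized"] / usage["size"]

print(size_kb, size_kb_actual, size_kb_utilized)
print(f"{utilized_ratio:.2f}")
```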