Caching options
The user space implementation of the Ceph Block Device, librbd, cannot
take advantage of the Linux page cache, so it includes its own in-memory caching, called RBD
caching. Ceph Block Device caching behaves just like well-behaved hard disk
caching.
When the operating system sends a barrier or a flush request, all dirty data is written to the Ceph OSDs. This means that using write-back caching is as safe as using a well-behaved physical hard disk with a virtual machine that properly sends flushes, that is, Linux kernel version 2.6.32 or higher. The cache uses a Least Recently Used (LRU) algorithm, and in write-back mode it can coalesce contiguous requests for better throughput.
Ceph Block Devices support write-back caching. To enable write-back caching, set
rbd_cache = true in the [client] section of the Ceph configuration
file. With caching disabled, writes and reads go directly
to the storage cluster, and writes return only when the data is on disk on all replicas. With
caching enabled, writes return immediately, unless there are more than
rbd_cache_max_dirty unflushed bytes. In this case, the write triggers a write-back
and blocks until enough bytes are flushed.
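For example, a minimal [client] section that enables write-back caching might look like the following sketch. The dirty limit shown is simply the default from the table below expressed in bytes, not a tuning recommendation:

```ini
[client]
    # Enable the librbd in-memory cache
    rbd_cache = true
    # Writes block once more than this many unflushed bytes are dirty (24 MiB)
    rbd_cache_max_dirty = 25165824
```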
Ceph Block Devices also support write-through caching. You can set the size of the cache, and you can
set targets and limits to switch from write-back caching to write-through caching. To enable
write-through mode, set rbd_cache_max_dirty to 0. In this mode,
writes return only when the data is on disk on all replicas, but reads can come from the cache. The
cache is in memory on the client, and each Ceph Block Device image has its own cache. Since the cache is
local to the client, there is no coherency when other clients access the same image. Running other
file systems, such as GFS or OCFS, on top of Ceph Block Devices will not work with caching
enabled.
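A minimal sketch of a write-through configuration, again assuming the [client] section of ceph.conf:

```ini
[client]
    # Keep the cache enabled so reads can still be served from memory
    rbd_cache = true
    # A dirty limit of 0 forces write-through: writes return only after
    # the data is on disk on all replicas
    rbd_cache_max_dirty = 0
```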
The Ceph configuration settings for Ceph Block Devices must be set in the
[client] section of the Ceph configuration file, by default
/etc/ceph/ceph.conf.
| Option | Description | Type | Required | Constraint | Default |
|---|---|---|---|---|---|
| rbd_cache | Enable caching for RADOS Block Device (RBD). | Boolean | No | N/A | true |
| rbd_cache_size | The RADOS Block Device (RBD) cache size in bytes. | 64-bit integer | No | N/A | 32 MiB |
| rbd_cache_max_dirty | The dirty limit in bytes at which the cache triggers a write-back. If 0, uses write-through caching. | 64-bit integer | No | Must be less than rbd_cache_size. | 24 MiB |
| rbd_cache_target_dirty | The dirty target before the cache begins writing data to the data storage. Does not block writes to the cache. | 64-bit integer | No | Must be less than rbd_cache_max_dirty. | 16 MiB |
| rbd_cache_max_dirty_age | The number of seconds dirty data is in the cache before write-back starts. | Float | No | N/A | 1.0 |
| rbd_cache_max_dirty_object | The dirty limit for objects. Set to 0 to auto-calculate from rbd_cache_size. | Integer | No | N/A | 0 |
| rbd_cache_block_writes_upfront | If true, blocks writes to the cache before the aio_write call completes. If false, blocks before the aio_completion is called. | Boolean | No | N/A | false |
| rbd_cache_writethrough_until_flush | Start out in write-through mode and switch to write-back after the first flush request is received. Enabling this is a conservative but safe setting in case virtual machines running on RBD are too old to send flushes, like the virtio driver in Linux before kernel 2.6.32. | Boolean | No | N/A | true |