Caching options

The user space implementation of the Ceph Block Device, that is, librbd, cannot take advantage of the Linux page cache, so it includes its own in-memory caching, called RBD caching. Ceph Block Device caching behaves just like well-behaved hard disk caching. When the operating system sends a barrier or a flush request, all dirty data is written to the Ceph OSDs. This means that using write-back caching is just as safe as using a well-behaved physical hard disk with a virtual machine that properly sends flushes, that is, Linux kernel version 2.6.32 or higher. The cache uses a Least Recently Used (LRU) algorithm, and in write-back mode it can coalesce contiguous requests for better throughput.
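To illustrate what coalescing contiguous requests means, here is a minimal Python sketch. It is not librbd's implementation; the coalesce function and the (offset, length) tuple representation are assumptions made for this example only.

```python
def coalesce(extents):
    """Merge (offset, length) write extents that touch or overlap.

    Illustrative only: models the idea of combining contiguous
    requests, not the actual librbd cache code.
    """
    merged = []
    for offset, length in sorted(extents):
        if merged and offset <= merged[-1][0] + merged[-1][1]:
            # This extent starts at or before the end of the previous one,
            # so extend the previous extent instead of adding a new one.
            last_offset, last_length = merged[-1]
            end = max(last_offset + last_length, offset + length)
            merged[-1] = (last_offset, end - last_offset)
        else:
            merged.append((offset, length))
    return merged


# Three adjacent 4 KiB writes plus one distant write.
requests = [(0, 4096), (4096, 4096), (8192, 4096), (65536, 4096)]
print(coalesce(requests))  # [(0, 12288), (65536, 4096)]
```

In this model, three small writes become a single larger write, which is why coalescing can improve throughput.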

Ceph Block Devices support write-back caching. To enable write-back caching, add rbd_cache = true to the [client] section of the Ceph configuration file. Without caching, reads and writes go directly to the storage cluster, and writes return only when the data is on disk on all replicas. With caching enabled, writes return immediately unless there are more than rbd_cache_max_dirty unflushed bytes. In that case, the write triggers write-back and blocks until enough bytes are flushed.
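The [client] setting described above can be sketched as follows. The example embeds a sample configuration fragment and reads it back with Python's standard configparser module. This assumes the file uses plain INI syntax that configparser accepts, which might not hold for every real Ceph configuration file; the CEPH_CONF text and the rbd_cache_enabled helper are hypothetical names for this example.

```python
import configparser

# Sample fragment of a Ceph configuration file (illustrative content).
CEPH_CONF = """
[client]
rbd_cache = true
"""


def rbd_cache_enabled(conf_text):
    """Return the explicit rbd_cache value, or None if it is not set."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # fallback=None covers both a missing [client] section and a missing key.
    return parser.getboolean("client", "rbd_cache", fallback=None)


print(rbd_cache_enabled(CEPH_CONF))
```

The helper reports only what the file sets explicitly; it does not try to guess the built-in default.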

Ceph Block Devices support write-through caching. You can set the size of the cache, and you can set targets and limits to switch from write-back caching to write-through caching. To enable write-through mode, set rbd_cache_max_dirty to 0. This means writes return only when the data is on disk on all replicas, but reads may come from the cache. The cache is in memory on the client, and each Ceph Block Device image has its own cache. Because the cache is local to the client, there is no cache coherency if other clients access the same image. Running shared-disk file systems, such as GFS or OCFS, on top of Ceph Block Devices does not work with caching enabled.
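The relationship between rbd_cache, rbd_cache_max_dirty, and the resulting caching mode can be summarized in a few lines of Python. This is a simplified model based only on the description above. The cache_mode function is a hypothetical helper, and it deliberately ignores rbd_cache_writethrough_until_flush, which can keep the cache in write-through mode until the first flush.

```python
def cache_mode(rbd_cache, rbd_cache_max_dirty):
    """Return the caching mode implied by the two settings.

    Simplified model: does not account for
    rbd_cache_writethrough_until_flush.
    """
    if not rbd_cache:
        return "none"
    if rbd_cache_max_dirty == 0:
        # No dirty data is allowed, so every write goes to the cluster first.
        return "write-through"
    return "write-back"


MIB = 1024 * 1024
print(cache_mode(True, 24 * MIB))
print(cache_mode(True, 0))
print(cache_mode(False, 24 * MIB))
```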

The Ceph configuration settings for Ceph Block Devices must be set in the [client] section of the Ceph configuration file, which is /etc/ceph/ceph.conf by default.

The settings include:

rbd_cache

Description
Enable caching for RADOS Block Device (RBD).

Type
Boolean

Required
No

Default
true

rbd_cache_size

Description
The RBD cache size in bytes.

Type
64-bit Integer

Required
No

Default
32 MiB

rbd_cache_max_dirty

Description
The dirty limit in bytes at which the cache triggers write-back. If set to 0, the cache uses write-through caching.

Type
64-bit Integer

Required
No

Constraint
Must be less than rbd_cache_size.

Default
24 MiB

rbd_cache_target_dirty

Description
The amount of dirty data, in bytes, at which the cache begins writing data to the storage cluster. Does not block writes to the cache.

Type
64-bit Integer

Required
No

Constraint
Must be less than rbd_cache_max_dirty.

Default
16 MiB

rbd_cache_max_dirty_age

Description
The number of seconds that dirty data stays in the cache before write-back starts.

Type
Float

Required
No

Default
1.0

rbd_cache_max_dirty_object

Description
The dirty limit for objects. Set to 0 to calculate the limit automatically from rbd_cache_size.

Type
Integer

Default
0

rbd_cache_block_writes_upfront

Description
If true, the cache blocks writes before the aio_write call completes. If false, the cache blocks before the aio_completion is called.

Type
Boolean

Default
false

rbd_cache_writethrough_until_flush

Description
Start out in write-through mode, and switch to write-back after the first flush request is received. Enabling this is a conservative but safe setting in case the virtual machines running on RBD are too old to send flushes, such as those using the virtio driver in Linux before 2.6.32.

Type
Boolean

Required
No

Default
true
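The size constraints listed for rbd_cache_max_dirty and rbd_cache_target_dirty can be checked before a configuration is deployed. The following Python sketch applies those two constraints to a set of values, using the defaults from the settings above. The check_cache_sizes function is a hypothetical helper, not part of any Ceph tool, and skipping the target check when rbd_cache_max_dirty is 0 (write-through mode) is an assumption of this example.

```python
MIB = 1024 * 1024

# Defaults as listed in the settings above.
DEFAULTS = {
    "rbd_cache_size": 32 * MIB,
    "rbd_cache_max_dirty": 24 * MIB,
    "rbd_cache_target_dirty": 16 * MIB,
}


def check_cache_sizes(settings):
    """Return a list of constraint violations (empty if none)."""
    merged = {**DEFAULTS, **settings}
    size = merged["rbd_cache_size"]
    max_dirty = merged["rbd_cache_max_dirty"]
    target_dirty = merged["rbd_cache_target_dirty"]

    problems = []
    if not max_dirty < size:
        problems.append("rbd_cache_max_dirty must be less than rbd_cache_size")
    # Assumption: in write-through mode (max_dirty == 0) there is no dirty
    # data, so the target is not checked.
    if max_dirty != 0 and not target_dirty < max_dirty:
        problems.append(
            "rbd_cache_target_dirty must be less than rbd_cache_max_dirty"
        )
    return problems


print(check_cache_sizes({}))
print(check_cache_sizes({"rbd_cache_size": 16 * MIB}))
```

With the defaults, the list is empty. Shrinking rbd_cache_size to 16 MiB without lowering rbd_cache_max_dirty reports the first constraint.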