Block device caching options
The user space implementation of the Ceph block device, that is,
librbd, cannot take advantage of the Linux page cache, so it includes
its own in-memory caching, called RBD caching. Ceph block device
caching behaves just like well-behaved hard disk caching. When the
operating system sends a barrier or a flush request, all dirty data is
written to the Ceph OSDs. This means that using write-back caching is
just as safe as using a well-behaved physical hard disk with a virtual
machine that properly sends flushes, that is, Linux kernel version
2.6.32 or higher. The cache uses a Least Recently Used (LRU) algorithm,
and in write-back mode it can coalesce contiguous requests for better
throughput.
Ceph block devices support write-back caching. To enable write-back
caching, set rbd_cache = true in the [client] section of the Ceph
configuration file. When rbd_cache is set to false, librbd does not
perform any caching: reads and writes go directly to the storage cluster,
and writes return only when the data is on disk on all replicas. With
caching enabled,
writes return immediately, unless there are more than
rbd_cache_max_dirty unflushed bytes. In this case, the write triggers
write-back and blocks until enough bytes are flushed.
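The blocking rule described above can be illustrated with a small toy model. This is only a sketch of the accounting, not librbd code: the class name DirtyTracker and its methods are hypothetical, and it assumes the dirty limit is checked after the new write has been added to the dirty count.

```python
from dataclasses import dataclass

MIB = 1024 * 1024


@dataclass
class DirtyTracker:
    """Toy model of dirty-byte accounting (hypothetical, not librbd code)."""

    max_dirty: int = 24 * MIB  # stands in for rbd_cache_max_dirty
    dirty: int = 0             # unflushed bytes currently held in the cache

    def write(self, length: int) -> bool:
        """Record a write; return True if it would block on write-back."""
        self.dirty += length
        # Assumption: the limit is compared after the write is accounted for.
        return self.dirty > self.max_dirty

    def flush(self, length: int) -> None:
        """Record that `length` dirty bytes were written to the OSDs."""
        self.dirty = max(0, self.dirty - length)


tracker = DirtyTracker()
first = tracker.write(16 * MIB)   # 16 MiB dirty, under the 24 MiB limit
second = tracker.write(16 * MIB)  # 32 MiB dirty, over the limit
print(first, second)
```

In this model, setting max_dirty to 0 makes every non-empty write block, which matches the write-through behavior of rbd_cache_max_dirty = 0.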
Ceph block devices support write-through caching. You can set the size
of the cache, and you can set targets and limits to switch from
write-back caching to write-through caching. To enable write-through
mode, set rbd_cache_max_dirty to 0. This means writes return only when
the data is on disk on all replicas, but reads may come from the cache.
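As a rough illustration of how the settings combine, the following sketch derives the caching mode from rbd_cache, rbd_cache_max_dirty, and rbd_cache_writethrough_until_flush. The function effective_cache_mode and its flush_seen parameter are hypothetical names introduced here for illustration; they are not part of any Ceph API.

```python
def effective_cache_mode(rbd_cache: bool,
                         max_dirty: int,
                         writethrough_until_flush: bool = True,
                         flush_seen: bool = False) -> str:
    """Return "none", "write-through", or "write-back" for the given settings.

    Hypothetical helper: it only restates the rules from the text.
    """
    if not rbd_cache:
        return "none"
    if max_dirty == 0:
        # rbd_cache_max_dirty = 0 selects write-through caching.
        return "write-through"
    if writethrough_until_flush and not flush_seen:
        # Stay in write-through mode until the first flush request arrives.
        return "write-through"
    return "write-back"


print(effective_cache_mode(True, 0))
print(effective_cache_mode(True, 24 * 1024 * 1024, flush_seen=True))
```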
The cache is in memory on the client, and each Ceph block device image
has its own cache. Because the cache is local to the client, there is no
coherency between clients that access the same image. Running shared-disk
cluster file systems, such as GFS or OCFS, on top of Ceph block devices
will not work with caching enabled.
The Ceph configuration settings for Ceph block devices must be set in
the [client] section of the Ceph configuration file, which is
/etc/ceph/ceph.conf by default.
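A [client] section can be sketched by rendering it from Python. This is a minimal example that assumes the standard configparser module is adequate for such a simple INI-style fragment. The helper name render_client_section is hypothetical, and the values are example byte counts (32, 24, and 16 MiB). The code only builds a string and does not touch /etc/ceph/ceph.conf.

```python
import configparser
import io

MIB = 1024 * 1024


def render_client_section(settings: dict) -> str:
    """Render the given settings as a [client] section (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names exactly as written
    parser["client"] = {name: str(value) for name, value in settings.items()}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


text = render_client_section({
    "rbd_cache": "true",
    "rbd_cache_size": 32 * MIB,          # bytes
    "rbd_cache_max_dirty": 24 * MIB,     # bytes
    "rbd_cache_target_dirty": 16 * MIB,  # bytes
})
print(text)
```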
The settings include:
rbd_cache
Description
Enable caching for RADOS Block Device (RBD).
Type
Boolean
Required
No
Default
true
rbd_cache_size
Description
The RBD cache size in bytes.
Type
64-bit Integer
Required
No
Default
32 MiB
rbd_cache_max_dirty
Description
The dirty limit in bytes at which the cache triggers write-back. If
0, uses write-through caching.
Type
64-bit Integer
Required
No
Constraint
Must be less than rbd_cache_size.
Default
24 MiB
rbd_cache_target_dirty
Description
The dirty target before the cache begins writing data to the data
storage. Does not block writes to the cache.
Type
64-bit Integer
Required
No
Constraint
Must be less than rbd_cache_max_dirty.
Default
16 MiB
rbd_cache_max_dirty_age
Description
The number of seconds that dirty data stays in the cache before
write-back starts.
Type
Float
Required
No
Default
1.0
rbd_cache_max_dirty_object
Description
The dirty limit for objects. Set to 0 to calculate the limit
automatically from rbd_cache_size.
Type
Integer
Default
0
rbd_cache_block_writes_upfront
Description
If true, writes to the cache block before the aio_write call
completes. If false, they block before the aio_completion is
called.
Type
Boolean
Default
false
rbd_cache_writethrough_until_flush
Description
Start out in write-through mode, and switch to write-back after the
first flush request is received. Enabling this is a conservative but
safe setting in case virtual machines running on RBD are too old to
send flushes, like the virtio driver in Linux before 2.6.32.
Type
Boolean
Required
No
Default
true
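The constraints listed for rbd_cache_max_dirty and rbd_cache_target_dirty can be checked before a configuration is deployed. The following is a minimal sketch: check_rbd_cache_limits is a hypothetical helper, not a Ceph tool, and it assumes that the target is not meaningful when rbd_cache_max_dirty is 0 (write-through), so that case is skipped.

```python
MIB = 1024 * 1024


def check_rbd_cache_limits(cache_size: int = 32 * MIB,
                           max_dirty: int = 24 * MIB,
                           target_dirty: int = 16 * MIB) -> list:
    """Return a list of constraint violations; an empty list means none found.

    The defaults mirror the documented default values, expressed in bytes.
    """
    problems = []
    if max_dirty >= cache_size:
        problems.append("rbd_cache_max_dirty must be less than rbd_cache_size")
    # Assumption: with max_dirty == 0 (write-through) the target is ignored.
    if max_dirty != 0 and target_dirty >= max_dirty:
        problems.append(
            "rbd_cache_target_dirty must be less than rbd_cache_max_dirty")
    return problems


print(check_rbd_cache_limits())                    # documented defaults
print(check_rbd_cache_limits(max_dirty=32 * MIB))  # violates the size limit
```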