Erasure code pools overview
Ceph uses replicated pools by default, meaning that Ceph copies every object from a primary OSD node to one or more secondary OSDs. Erasure-coded pools reduce the amount of disk space required to ensure data durability, but they are computationally somewhat more expensive than replication.
Ceph storage strategies involve defining data durability requirements. Data durability means the ability to sustain the loss of one or more OSDs without losing data.
Ceph stores data in pools, and there are two types of pools:
- replicated
- erasure-coded
Erasure coding is a method of storing an object in the Ceph storage cluster durably where the
erasure code algorithm breaks the object into data chunks (k) and coding chunks
(m), and stores those chunks in different OSDs.
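The chunking idea can be illustrated with a deliberately simplified sketch. It assumes a single XOR parity chunk (m = 1), which is a toy scheme and not the code that Ceph's erasure-code plugins actually use; the function names `encode` and `recover` are hypothetical and exist only for this illustration. Each list element stands in for a chunk stored on a different OSD.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(obj: bytes, k: int) -> List[bytes]:
    """Split obj into k equal data chunks (zero-padded) plus one XOR coding chunk."""
    chunk_len = -(-len(obj) // k)  # ceiling division
    padded = obj.ljust(chunk_len * k, b"\0")
    data = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    parity = reduce(xor_bytes, data)
    return data + [parity]


def recover(chunks: List[Optional[bytes]]) -> List[Optional[bytes]]:
    """Rebuild at most one missing chunk (marked None) by XOR-ing the survivors."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) > 1:
        raise ValueError("single parity (m=1) tolerates only one lost chunk")
    chunks = list(chunks)
    if missing:
        survivors = [c for c in chunks if c is not None]
        chunks[missing[0]] = reduce(xor_bytes, survivors)
    return chunks


obj = b"hello erasure coding"
k = 4
stored = encode(obj, k)      # k data chunks + 1 coding chunk, one per simulated OSD
damaged = list(stored)
damaged[2] = None            # simulate the loss of one OSD
restored = recover(damaged)
# Drop the coding chunk, then trim padding using the known original length.
rebuilt = b"".join(restored[:k])[:len(obj)]
print(rebuilt == obj)
```

Tolerating more than one lost chunk (m > 1) requires a more general code than plain XOR, which is why this sketch rejects a second missing chunk instead of guessing.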
If an OSD fails, Ceph retrieves the remaining data (k) and
coding (m) chunks from the other OSDs, and the erasure code algorithm restores the
object from those chunks.
Set min_size for erasure-coded pools to k+1 or more to prevent loss of writes and data. For a 2+2 pool,
min_size should be k+1, that is, 3.
Erasure coding uses storage capacity more efficiently than replication. The n-replication
approach maintains n copies of an object (3 by default in Ceph), whereas
erasure coding maintains only k + m chunks. For example, 3 data
and 2 coding chunks use (3 + 2) / 3, or about 1.67 times, the storage space of the original object.
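The capacity comparison above is simple arithmetic, and a small sketch makes it easy to check for any profile. The helper names below are hypothetical and not part of any Ceph API; the formulas assume that all chunks are the same size, so n-replication costs n times the object size and a k+m erasure-coded pool costs (k + m) / k times the object size.

```python
def replication_overhead(n: int) -> float:
    """Raw storage used per unit of data with n full copies."""
    return float(n)


def erasure_overhead(k: int, m: int) -> float:
    """Raw storage used per unit of data with k data and m coding chunks."""
    return (k + m) / k


def recommended_min_size(k: int) -> int:
    """The k+1 lower bound for min_size given in the text above."""
    return k + 1


print(f"3x replication: {replication_overhead(3):.2f}x raw space")
for k, m in [(2, 2), (3, 2), (4, 2)]:
    print(
        f"{k}+{m} erasure coding: {erasure_overhead(k, m):.2f}x raw space, "
        f"tolerates {m} lost chunks, min_size >= {recommended_min_size(k)}"
    )
```

Raising k for a fixed m lowers the overhead, but each object is then spread over more OSDs, so more OSDs take part in every read and recovery.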
While erasure coding uses less storage overhead than replication, the erasure code algorithm uses more RAM and CPU than replication when it accesses or recovers objects. Erasure coding is advantageous when data storage must be durable and fault tolerant but does not require fast read performance (for example, cold storage and historical records).
For a detailed mathematical explanation of how erasure coding works in Ceph, see Erasure coding.
Configure only the .rgw.buckets pool as erasure-coded and all other Ceph Object Gateway pools as
replicated; otherwise, an attempt to create a new bucket fails with the following
error:
set_req_state_err err_no=95 resorting to 500
The reason is that erasure-coded pools do not support
omap operations, and certain Ceph Object
Gateway metadata pools require omap support.