Erasure code profiles
Ceph defines an erasure-coded pool with a profile, and it uses that profile when creating the pool and the associated CRUSH rule.
Ceph creates a default erasure code profile when it initializes a cluster. The default profile defines k=2 and m=2, meaning Ceph spreads the object data over four OSDs (k+m=4) and can lose up to two of those OSDs without losing data, while consuming the same raw capacity as two copies in a replicated pool. EC 2+2 requires a minimum deployment footprint of 4 nodes (5 nodes recommended) and can cope with the temporary loss of 1 OSD node.
To display the default profile, use the following command:
$ ceph osd erasure-code-profile get default
k=2
m=2
plugin=jerasure
technique=reed_sol_van
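You can also list every erasure code profile known to the cluster; on a newly initialized cluster the output typically contains only the default profile:
$ ceph osd erasure-code-profile ls
default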
You can create a new profile to improve redundancy without increasing raw storage requirements. For instance, a profile with k=8 and m=4 can sustain the loss of four (m=4) OSDs by distributing an object across 12 (k+m=12) OSDs. Ceph divides the object into 8 data chunks and computes 4 coding chunks for recovery. For example, if the object size is 8 MB, each data chunk is 1 MB, and each coding chunk is the same size as a data chunk, also 1 MB. The object is not lost even if four OSDs fail simultaneously.
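As a sketch of how such a profile could be created, the following commands mirror the syntax used later in this section; the profile name ec84profile and the host failure domain are illustrative assumptions, not values prescribed by this example:
$ ceph osd erasure-code-profile set ec84profile \
k=8 \
m=4 \
crush-failure-domain=host
$ ceph osd erasure-code-profile get ec84profile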
The most important parameters of the profile are k, m, and crush-failure-domain, because they define the storage overhead and the data durability.
For instance, if the desired architecture must sustain the loss of two racks with a storage overhead of 150% (see Table 1), the following profile can be defined:
$ ceph osd erasure-code-profile set myprofile \
k=4 \
m=2 \
crush-failure-domain=rack
$ ceph osd pool create ecpool 12 12 erasure myprofile
$ echo ABCDEFGHIJKL | rados --pool ecpool put NYAN -
$ rados --pool ecpool get NYAN -
ABCDEFGHIJKL
The primary OSD divides the NYAN object into four (k=4) data chunks and creates two additional coding chunks (m=2). The value of m defines how many OSDs can be lost simultaneously without losing any data. Setting crush-failure-domain=rack creates a CRUSH rule that ensures no two chunks are stored in the same rack. Figure 1 depicts the erasure code breakdown.
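To verify the placement, you can map the object to its placement group and acting set of OSDs, and dump the CRUSH rule that Ceph generated for the pool. The rule name used below assumes Ceph named the rule after the pool, which is its usual behavior:
$ ceph osd map ecpool NYAN
$ ceph osd crush rule dump ecpool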
Table 1 lists the supported erasure coding profiles.
| k | m | Minimum number of nodes | Minimum number of OSDs | Redundancy | Storage overhead |
|---|---|---|---|---|---|
| 2 | 2 | 4 | 4 (1 per node) | Loss of up to 1 node + 1 OSD or up to 2 OSDs | 200% |
| 2 | 2 | 5 or more | 5 (1 per node) | Loss of up to 2 nodes or OSDs | 200% |
| 4 | 2 | 7 or more | 7 (1 per node) | Loss of up to 2 nodes or OSDs | 150% |
| 8 | 3 | 12 or more | 12 (1 per node) | Loss of up to 3 nodes or OSDs | 137.50% |
| 8 | 4 | 13 or more | 13 (1 per node) | Loss of up to 4 nodes or OSDs | 150% |
| 8 | 6 | 4 or more | 16 (4 per node) | Loss of up to 1 node + 2 OSDs or up to 6 OSDs | 175% |
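The storage overhead column corresponds to (k+m)/k; for example, k=8 and m=3 gives 11/8 = 137.5%. To confirm which erasure code profile an existing pool uses, query the pool directly; the output below assumes the ecpool example above:
$ ceph osd pool get ecpool erasure_code_profile
erasure_code_profile: myprofile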