Configuring IBM Storage Ceph stretch mode
Once the IBM Storage Ceph cluster is fully deployed using cephadm, use this information to configure the stretch cluster mode. The new stretch mode is designed to handle the two-site case.
Procedure
- Check the current election strategy being used by the monitors with the ceph mon dump command. By default in a Ceph cluster, the election strategy is set to classic.

  ceph mon dump | grep election_strategy

  Example output:

  dumped monmap epoch 9
  election_strategy: 1

  A value of 1 corresponds to the classic strategy.

- Change the monitor election strategy to connectivity.

  ceph mon set election_strategy connectivity

- Rerun the ceph mon dump command to verify the election_strategy value.

  ceph mon dump | grep election_strategy

  Example output:

  dumped monmap epoch 10
  election_strategy: 3

  A value of 3 corresponds to the connectivity strategy. To know more about the different election strategies, see Administering > Operations > Management of monitors > Configuring monitor election strategy within the IBM Storage Ceph documentation.
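  With the connectivity strategy in place, the monitors rank their peers by connection score when electing a leader. As an optional check, those scores can be dumped from a monitor's admin socket; this is a sketch, assuming it is run on the node that hosts mon.ceph1 and that the admin socket is reachable (for example, from within a cephadm shell):

  ceph daemon mon.ceph1 connection scores dump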
- Set the location for all Ceph monitors.

  ceph mon set_location ceph1 datacenter=DC1
  ceph mon set_location ceph2 datacenter=DC1
  ceph mon set_location ceph4 datacenter=DC2
  ceph mon set_location ceph5 datacenter=DC2
  ceph mon set_location ceph7 datacenter=DC3

- Verify that each monitor has its appropriate location.

  ceph mon dump

  Example output:

  epoch 17
  fsid dd77f050-9afe-11ec-a56c-029f8148ea14
  last_changed 2022-03-04T07:17:26.913330+0000
  created 2022-03-03T14:33:22.957190+0000
  min_mon_release 16 (pacific)
  election_strategy: 3
  0: [v2:10.0.143.78:3300/0,v1:10.0.143.78:6789/0] mon.ceph1; crush_location {datacenter=DC1}
  1: [v2:10.0.155.185:3300/0,v1:10.0.155.185:6789/0] mon.ceph4; crush_location {datacenter=DC2}
  2: [v2:10.0.139.88:3300/0,v1:10.0.139.88:6789/0] mon.ceph5; crush_location {datacenter=DC2}
  3: [v2:10.0.150.221:3300/0,v1:10.0.150.221:6789/0] mon.ceph7; crush_location {datacenter=DC3}
  4: [v2:10.0.155.35:3300/0,v1:10.0.155.35:6789/0] mon.ceph2; crush_location {datacenter=DC1}

- Create a CRUSH rule that makes use of this OSD CRUSH topology. To be able to use the crushtool command, install the ceph-base RPM package.

  dnf -y install ceph-base

  To know more about CRUSH rulesets, see Concepts > Architecture > Core Ceph components > Ceph CRUSH ruleset within the IBM Storage Ceph documentation.
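  Before adding a rule that selects datacenter buckets, it can be worth confirming that the OSD CRUSH hierarchy already contains the DC1 and DC2 datacenter buckets with their hosts and OSDs. A quick, optional check:

  ceph osd tree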
- Get the compiled CRUSH map from the cluster.

  ceph osd getcrushmap > /etc/ceph/crushmap.bin

- Decompile the CRUSH map and convert it to a text file so that it can be edited.

  crushtool -d /etc/ceph/crushmap.bin -o /etc/ceph/crushmap.txt

- Add the following rule to the CRUSH map by editing the text file /etc/ceph/crushmap.txt. Append the rule at the end of the file.

  vim /etc/ceph/crushmap.txt

  rule stretch_rule {
          id 1
          type replicated
          min_size 1
          max_size 10
          step take default
          step choose firstn 0 type datacenter
          step chooseleaf firstn 2 type host
          step emit
  }
  # end crush map

  This example is applicable for active applications in both OpenShift Container Platform clusters.

  Note: The rule id has to be unique. In this example, there is only one other CRUSH rule, with id 0, hence id 1 is used here. If your deployment has more rules created, then use the next free ID.

  The CRUSH rule declared contains the following information:
  - Rule name - A unique name for identifying the rule. In this example, stretch_rule.
  - id - A unique whole number for identifying the rule. In this example, 1.
  - type - Describes a rule for either a storage drive replicated or erasure-coded. In this example, replicated.
  - min_size - If a pool makes fewer replicas than this number, CRUSH will not select this rule. In this example, 1.
  - max_size - If a pool makes more replicas than this number, CRUSH will not select this rule. In this example, 10.
  - step take default - Takes the root bucket called default, and begins iterating down the tree.
  - step choose firstn 0 type datacenter - Selects the datacenter buckets, and goes into their subtrees.
  - step chooseleaf firstn 2 type host - Selects the number of buckets of the given type. In this case, it is two different hosts located in the datacenter it entered at the previous level.
  - step emit - Outputs the current value and empties the stack. Typically used at the end of a rule, but it can also be used to pick from different trees in the same rule.
- Compile the new CRUSH map from the file /etc/ceph/crushmap.txt and convert it to a binary file called /etc/ceph/crushmap2.bin.

  crushtool -c /etc/ceph/crushmap.txt -o /etc/ceph/crushmap2.bin

- Inject the newly created CRUSH map back into the cluster.

  ceph osd setcrushmap -i /etc/ceph/crushmap2.bin

  Example output:

  17

  Note: The number 17 is a counter and it will increase (18, 19, and so on) depending on the changes made to the CRUSH map.
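  The new rule can also be sanity-checked offline with the crushtool test mode, which prints the OSD mappings the rule would produce. This is a minimal sketch, assuming rule id 1 and a replica count of 4 (two copies per data center); adjust both values to your deployment:

  crushtool -i /etc/ceph/crushmap2.bin --test --rule 1 --num-rep 4 --show-mappings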
- Verify that the stretched rule created is now available for use.

  ceph osd crush rule ls

  Example output:

  replicated_rule
  stretch_rule

- Enable the stretch cluster mode.

  ceph mon enable_stretch_mode ceph7 stretch_rule datacenter

  In this example, ceph7 is the arbiter node, stretch_rule is the CRUSH rule that was created in the previous step, and datacenter is the dividing bucket.
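  As an optional confirmation that the cluster is now operating in stretch mode, the OSD map can be inspected; on recent releases it reports a stretch_mode_enabled flag. A sketch to adapt to your release:

  ceph osd dump | grep stretch_mode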
- Verify that all the pools are using the stretch_rule CRUSH rule that was created as part of the Ceph cluster.

  for pool in $(rados lspools); do echo -n "Pool: ${pool}; "; ceph osd pool get ${pool} crush_rule; done

  Example output:

  Pool: device_health_metrics; crush_rule: stretch_rule
  Pool: cephfs.cephfs.meta; crush_rule: stretch_rule
  Pool: cephfs.cephfs.data; crush_rule: stretch_rule
  Pool: .rgw.root; crush_rule: stretch_rule
  Pool: default.rgw.log; crush_rule: stretch_rule
  Pool: default.rgw.control; crush_rule: stretch_rule
  Pool: default.rgw.meta; crush_rule: stretch_rule
  Pool: rbdpool; crush_rule: stretch_rule

  This indicates that a working IBM Storage Ceph stretched cluster with arbiter mode is now available.
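  If a pool is later found using a different rule (for example, a pool that was created with an explicit rule), it can be switched to the stretch rule manually. A minimal sketch, using a hypothetical pool name mypool:

  ceph osd pool set mypool crush_rule stretch_rule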