Setting the OSD weight by utilization
CRUSH is designed to approximate a uniform probability distribution for write requests that assign new data objects to placement groups (PGs) and PGs to OSDs. Despite this design, clusters can still become imbalanced. If this occurs, set the OSD weight by utilization.
OSD weight imbalance can occur for various reasons, for example:
- Multiple Pools: You can assign multiple pools to a CRUSH hierarchy, but the pools might have different numbers of placement groups, size (number of replicas to store), and object size characteristics.
- Custom Clients: Ceph clients such as block device, object gateway, and filesystem shard data from their clients and stripe it across the cluster as uniform-sized smaller RADOS objects. So, apart from the multiple-pools scenario above, CRUSH usually achieves its goal. However, a cluster can also become imbalanced when librados is used to store data without normalizing the size of objects. For example, storing 100 1-MB objects and 10 4-MB objects makes a few OSDs hold more data than the others.
- Probability: A uniform distribution results in some OSDs with more PGs and some with fewer. For clusters with a large number of OSDs, the statistical outliers will be further out.
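The probability effect can be illustrated with a small simulation. The sketch below is not CRUSH: it assumes each PG lands on one OSD chosen uniformly at random, and it ignores replicas, weights, and the hierarchy. The function names simulate_pg_counts and summarize are hypothetical helpers introduced only for this illustration.

```python
import random
from collections import Counter


def simulate_pg_counts(num_osds, num_pgs, seed=0):
    """Assign each PG to one OSD uniformly at random; return PGs per OSD.

    Assumption: plain uniform random choice, not the CRUSH algorithm.
    """
    rng = random.Random(seed)
    counts = Counter(rng.randrange(num_osds) for _ in range(num_pgs))
    # Include OSDs that happened to receive no PGs at all.
    return [counts.get(osd, 0) for osd in range(num_osds)]


def summarize(counts):
    """Report how far the emptiest and fullest OSDs sit from the mean."""
    mean = sum(counts) / len(counts)
    return {
        "mean": mean,
        "min": min(counts),
        "max": max(counts),
        # Fullest OSD as a percentage of the mean PG count.
        "max_over_mean_pct": 100.0 * max(counts) / mean,
    }


if __name__ == "__main__":
    # Same average of 100 PGs per OSD at every cluster size.
    for num_osds in (10, 100, 1000):
        counts = simulate_pg_counts(num_osds, num_pgs=100 * num_osds)
        print(num_osds, summarize(counts))
```

Even though every OSD is equally likely to be chosen, the per-OSD counts differ, and comparing the min and max values across cluster sizes shows how the outliers behave as the number of OSDs grows.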
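To reweight OSDs by utilization, run the following command. To preview the proposed changes without applying them, use test-reweight-by-utilization with the same arguments, as in the example: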
Syntax
ceph osd reweight-by-utilization [THRESHOLD] [WEIGHT_CHANGE_AMOUNT] [NUMBER_OF_OSDS] [--no-increasing]
Example
[ceph: root@host01 /]# ceph osd test-reweight-by-utilization 110 .5 4 --no-increasing
Where:
- threshold is a percentage of utilization such that OSDs facing higher data storage loads receive a lower weight and thus have fewer PGs assigned to them. The default value is 120, reflecting 120%. Any value from 100 upward is a valid threshold. Optional.
- weight_change_amount is the amount to change the weight. Valid values are greater than 0.0 and up to 1.0. The default value is 0.05. Optional.
- number_of_OSDs is the maximum number of OSDs to reweight. For large clusters, limiting the number of OSDs to reweight prevents significant rebalancing. Optional.
- no-increasing is off by default. Increasing the OSD weight is allowed when using the reweight-by-utilization or test-reweight-by-utilization commands. If this option is used with these commands, it prevents the OSD weight from increasing, even if the OSD is underutilized. Optional.
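To make the interaction of the four parameters concrete, here is a simplified model in Python. It is a sketch under stated assumptions, not Ceph's implementation: the function propose_reweights is hypothetical, the threshold is assumed to be measured against the mean utilization of all OSDs, the target weight is assumed to scale with the ratio of mean to actual utilization, and an "underutilized" OSD is assumed to be one below the mean whose weight is already below 1.0.

```python
def propose_reweights(osds, threshold=120, max_change=0.05, max_osds=4,
                      no_increasing=False):
    """Return {osd_id: new_weight} under a simplified reweighting model.

    osds maps an OSD id to a (utilization_percent, current_weight) pair.
    Assumption: this mirrors the documented parameters only, not Ceph's code.
    """
    if threshold < 100:
        raise ValueError("threshold must be 100 or higher")
    if not 0.0 < max_change <= 1.0:
        raise ValueError("max_change must be greater than 0.0 and at most 1.0")

    average = sum(util for util, _ in osds.values()) / len(osds)
    # An OSD counts as overloaded above threshold percent of the mean.
    overload_cutoff = average * threshold / 100.0

    proposals = {}
    # Visit the fullest OSDs first so the max_osds budget goes to them.
    by_util = sorted(osds.items(), key=lambda item: item[1][0], reverse=True)
    for osd_id, (util, weight) in by_util:
        if len(proposals) >= max_osds:
            break
        if util <= 0:
            continue  # Skip empty OSDs to avoid dividing by zero.
        target = weight * average / util
        if util > overload_cutoff:
            # Lower the weight, but by no more than max_change.
            new_weight = max(target, weight - max_change)
        elif util < average and weight < 1.0 and not no_increasing:
            # Raise the weight, by no more than max_change, never above 1.0.
            new_weight = min(target, weight + max_change, 1.0)
        else:
            continue
        proposals[osd_id] = round(new_weight, 4)
    return proposals


if __name__ == "__main__":
    # Hypothetical cluster: mean utilization is 60%, so the cutoff is 72%.
    cluster = {0: (90.0, 1.0), 1: (50.0, 1.0), 2: (55.0, 1.0), 3: (45.0, 0.8)}
    print(propose_reweights(cluster))
    print(propose_reweights(cluster, no_increasing=True))
```

In this hypothetical cluster, only OSD 0 exceeds the 72% cutoff, so its weight drops by the maximum step to 0.95. OSD 3 is below the mean with a weight under 1.0, so it is raised to 0.85 unless no_increasing is set, in which case only OSD 0 changes.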
Running reweight-by-utilization is recommended and somewhat inevitable for large clusters. Utilization changes over time and as cluster size or hardware changes, so if you elect to reweight by utilization, you might need to re-run this command to keep the weights current.
Running this command again, or any other command that assigns a weight (for example, osd reweight-by-utilization, osd crush weight, osd weight, in, or out), overrides the weight assigned by this command.