Distributing data across a cluster

You can distribute data uniformly across a cluster.

The following are possible ways to distribute the data:
  • Import the data through a node that does not have any attached NSD and takes the role of a GPFS™ client node in the cluster. This ensures that the data is distributed evenly across all failure groups and all nodes within a failure group.
  • Use a write affinity depth of 0 across the cluster.
  • Make every GPFS node an ingest node and deliver data equally across all ingest nodes. However, this strategy is expensive to implement.
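A write affinity depth of 0 (wide striping) is set as a storage pool attribute in the disk stanza file used with mmcrfs or mmadddisk. The following sketch shows where the attribute goes; the pool name, block group factor, and other values are illustrative assumptions, not requirements:

```
# Hypothetical pool stanza: writeAffinityDepth=0 disables write affinity,
# so new data blocks are striped across the whole cluster.
%pool:
  pool=datapool
  layoutMap=cluster
  allowWriteAffinity=yes
  writeAffinityDepth=0
  blockGroupFactor=128
```

With writeAffinityDepth=0, the first copy of each block is placed by wide striping rather than on the ingest node, which yields the uniform distribution described above at the cost of data locality for the writer.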

Ideally, all failure groups should have an equal number of disks with roughly equal capacity. If one failure group is much smaller than the rest, it is likely to fill up faster than the others, which complicates rebalancing actions.

After the initial ingesting of data, the cluster might be unbalanced. In such a situation, use the mmrestripefs command with the -b option to rebalance the data.
Note: For FPO users, the mmrestripefs -b command breaks the original data placement that follows the data locality rule.
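As a sketch, rebalancing after ingest looks like the following; the file system name fs1 is a placeholder, and the command must be run with administrative authority on a node in the cluster:

```shell
# Rebalance all files in file system fs1 across all disks.
# For FPO file systems, note that this overrides data-locality placement.
mmrestripefs fs1 -b
```

Rebalancing moves data, so it can generate significant I/O; consider running it during a period of low activity.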