Suboptimal performance due to incompatible file system block allocation type

In some cases, a proof-of-concept (POC) is done on a smaller setup that consists of a cluster with eight or fewer nodes and a file system with eight or fewer disks. When the necessary performance requirements are met, the production file system is deployed on a larger cluster and storage setup. On the larger cluster, the file system performance per NSD might be lower than on the smaller POC setup, even if all the cluster and storage components are healthy and performing optimally. In such cases, it is likely that the file system on the smaller POC setup was configured with the default cluster block allocation type, whereas the file system on the larger setup is configured with the scatter block allocation type.

Problem identification

Issue the mmlsfs command on both the smaller and the larger setup to verify the block allocation type that is in effect for each file system.

In the sample output below, the Block allocation type for the gpfs2 file system is set to scatter.

# mmlsfs gpfs2 | grep 'Block allocation type'

 -j                 scatter                 Block allocation type
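
To compare the two setups, the value column of the -j row can be extracted from the mmlsfs output on each cluster. The following is a minimal sketch, assuming the three-column row layout shown in the sample output above. The helper names alloc_type and compare_alloc are hypothetical and are not part of IBM Storage Scale.

```shell
# Hypothetical helper: read mmlsfs output on stdin and print the value
# column of the "-j" (Block allocation type) row.
# Assumes the row layout from the sample output: flag, value, description.
alloc_type() {
    awk '$1 == "-j" { print $2 }'
}

# Hypothetical helper: report whether the POC and production file systems
# use the same block allocation type.
compare_alloc() {
    if [ "$1" = "$2" ]; then
        echo "match: both use $1"
    else
        echo "mismatch: POC uses $1, production uses $2"
    fi
}

# Example with the sample row shown above (no cluster access needed):
printf '%s\n' ' -j                 scatter                 Block allocation type' | alloc_type
```

On a live cluster the same helper could be fed directly, for example `mmlsfs gpfs2 | alloc_type`, and the values collected from the POC and production clusters passed to compare_alloc.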

Problem resolution and verification

layoutMap={scatter|cluster} specifies the block allocation map type. When allocating blocks for a file, GPFS first uses a round robin algorithm to spread the data across all disks in the storage pool. After a disk is selected, the location of the data block on the disk is determined by the block allocation map type.
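
The disk selection step described above can be illustrated with a toy model. This sketch only shows the round-robin idea of cycling through the disks of a storage pool. It is not the GPFS allocator: the function name round_robin, the zero-based disk numbering, and the choice of disk 0 as the starting disk are all assumptions made for illustration.

```shell
# Toy model (not GPFS code): show which disk each of the first N blocks
# of a file would land on if disks are selected in strict round-robin order.
# Assumption: disks are numbered from 0 and the first block goes to disk 0.
round_robin() {
    ndisks=$1
    nblocks=$2
    i=0
    while [ "$i" -lt "$nblocks" ]; do
        echo "block $i -> disk $((i % ndisks))"
        i=$((i + 1))
    done
}

# Four blocks spread across a pool of three disks.
round_robin 3 4
```

The model stops at disk selection. Where the block is placed on the selected disk is what the block allocation map type controls.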

For the cluster block allocation map type, GPFS attempts to allocate blocks in clusters. Blocks that belong to a particular file are kept adjacent to each other within each cluster. For the scatter block allocation map type, the location of the block is chosen randomly. For a production setup, where performance consistency throughout the lifetime of the file system is paramount, the scatter block allocation type is recommended. Storage I/O performance sizing for IBM Storage Scale should also be performed with the scatter block allocation type.

The cluster allocation method might provide better disk performance for some disk subsystems in relatively small installations. However, the benefits of clustered block allocation diminish when the number of nodes in the cluster or the number of disks in a file system increases, or when the file system’s free space becomes fragmented. The cluster allocation is the default allocation method for GPFS clusters with eight or fewer nodes and for file systems with eight or fewer disks.

The scatter allocation method provides more consistent file system performance by averaging out performance variations. This is because, for many disk subsystems, the location of the data relative to the disk edge has a substantial effect on performance. This allocation method is appropriate in most cases and is the default allocation type for GPFS clusters with more than eight nodes or file systems with more than eight disks.

The block allocation map type cannot be changed after the storage pool is created. For more information on block allocation, see the mmcrfs command.
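
Because the type is fixed at creation time, it is worth setting it explicitly rather than relying on the size-dependent default. The following is a dry-run sketch that only prints an mmcrfs command with the -j option. The helper name mmcrfs_cmd, the device name gpfs_prod, and the stanza file path are hypothetical placeholders. Check the mmcrfs command reference for the full set of options that your release requires.

```shell
# Hypothetical dry-run helper: prints an mmcrfs command instead of running it.
# Arguments: device name, NSD stanza file, optional block allocation type.
mmcrfs_cmd() {
    device=$1
    stanza_file=$2
    alloc=${3:-scatter}   # scatter is the type recommended for production
    case "$alloc" in
        scatter|cluster) ;;
        *) echo "unknown block allocation type: $alloc" >&2; return 1 ;;
    esac
    echo "mmcrfs $device -F $stanza_file -j $alloc"
}

# Placeholder device name and stanza file path, for illustration only.
mmcrfs_cmd gpfs_prod /tmp/nsd.stanza
```

Printing the command first allows it to be reviewed before it is run, which matters here because a wrong choice can only be corrected by re-creating the file system.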

Attention: Scatter block allocation is recommended for a production setup where performance consistency is paramount throughout the lifetime of the file system. However, in FPO environments (Hadoop or Big Data), cluster block allocation is recommended.