Parameters for performance tuning and optimization

Use these parameters with the mmchconfig command for performance tuning and optimization.

Tuning guide for frequently changed parameters

autoload
When autoload is set to yes, GPFS starts automatically on nodes that are rebooted, and the rebooted nodes rejoin the cluster. If the file system automount option is set to yes, the file system is also mounted. The default value of this parameter is no.
Important: Set autoload to no before you fix hardware issues or perform system maintenance.
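For example, you might enable automatic startup across the cluster, and disable it again before planned maintenance, with commands such as:
  mmchconfig autoload=yes
  mmchconfig autoload=no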
deadlockDetectionThreshold
When deadlockDetectionThreshold is set to 0, the GPFS deadlock detection feature is disabled. The default value of this parameter is 300 seconds.
Important: The GPFS deadlock detection feature must be enabled to collect debug data and resolve deadlock issues in a cluster. If deadlock events occur frequently, fix the underlying problem instead of disabling the feature.
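For example, to check the current setting and restore the default detection threshold of 300 seconds, you might run:
  mmlsconfig deadlockDetectionThreshold
  mmchconfig deadlockDetectionThreshold=300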
defaultHelperNodes
The nodes that are added to defaultHelperNodes are used as helper nodes when certain commands, such as mmrestripefs, are run. Running such a command on a subset of the nodes in a cluster, for example running mmrestripefs on only the NSD server nodes, might yield better performance. The default value of this parameter is all the nodes in the cluster.
Important: Specify the -N option on GPFS management commands, or change the value of defaultHelperNodes, before you run those commands.
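For example, assuming a user-defined node class named nsdNodes (an illustrative name) that contains the NSD servers, and a file system named fs1 (also illustrative), you might restrict the helper nodes either per command or globally:
  mmrestripefs fs1 -b -N nsdNodes
  mmchconfig defaultHelperNodes=nsdNodes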
maxFilesToCache
The maxFilesToCache parameter specifies the number of files that can be cached by each node. The range of valid values for maxFilesToCache is 1 - 100,000,000. The default value is 4000. The value of this parameter must be large enough to handle the number of concurrently open files and to allow the caching of recently used files.

Changing the value of maxFilesToCache affects the amount of memory that is used on the node. In a large cluster, a change in the value of maxFilesToCache is greatly magnified: increasing maxFilesToCache in a cluster with hundreds of nodes increases the number of tokens that the token manager must store. Ensure that the manager node has enough memory, and increase tokenMemLimit if you are running GPFS version 4.1.1 or earlier. For this reason, the value of maxFilesToCache is usually increased only on a subset of nodes in large clusters, such as login nodes, SMB and NFS export nodes, email servers, and other file servers.

For systems on which applications use many files, increasing the value of maxFilesToCache might be beneficial, especially where many small files are accessed.

Trouble: Setting the maxFilesToCache parameter to a high value results in a large amount of memory being allocated for internal data buffering. If the value of maxFilesToCache is set too high, some operations in IBM Storage Scale might not have enough memory to run, and an error message might appear in the mmfs.log file indicating that there is insufficient memory to perform an operation. To rectify the error, try lowering the value of maxFilesToCache.
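For example, to raise the file cache only on a hypothetical node class named protocolNodes that hosts SMB and NFS exports (the node class name and value are illustrative), you might run:
  mmchconfig maxFilesToCache=50000 -N protocolNodes
The new value typically takes effect when the GPFS daemon is restarted on the affected nodes.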
maxBlockSize
The value of maxBlockSize must be equal to or larger than the maximum block size of all the file systems in the local and remote clusters. Before you change this parameter, ensure that the GPFS daemon on each node in the cluster is shut down. The default value is 4 MB.
Note: When you migrate a cluster from an earlier version to version 5.0.0 or later, the value of maxblocksize stays the same. However, if maxblocksize was set to DEFAULT in the earlier version of the cluster, then migrating the cluster to version 5.0.0 or later sets maxblocksize explicitly to 1 MiB, which was the default size in earlier versions. To change maxBlockSize to the current default size (4 MiB) after you migrate to version 5.0.0 or later, set maxblocksize=DEFAULT.
For more information, see mmcrfs command and mmchconfig command.
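For example, to raise the limit to 8 MiB cluster-wide (the value and maintenance window are illustrative), the GPFS daemon is stopped on all nodes first:
  mmshutdown -a
  mmchconfig maxblocksize=8M
  mmstartup -a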
maxMBpS

The maxMBpS parameter indicates the maximum throughput in megabytes per second that GPFS can submit into or out of a single node. GPFS calculates from this value how many prefetch or write-behind threads to schedule for sequential file access.

In GPFS version 3.5 and later, the default value is 2048. If the node has a faster interconnect, such as InfiniBand, 40 GigE, or multiple links, you can set the parameter to a higher value. As a rule, try setting maxMBpS to twice the I/O throughput that the node can support. For example, if the node has one FDR link and the GPFS configuration parameter verbsRdma is enabled, then the expected throughput of the node is 6000 MB/s. In this case, set maxMBpS to 12000.

Setting maxMBpS does not guarantee the required GPFS sequential bandwidth on the node. All the layers of the GPFS stack, including the node, the network, and the storage subsystem, must be designed and tuned to meet the I/O performance requirements.
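For example, following the FDR sizing described above and assuming a hypothetical node class named rdmaNodes for the nodes with RDMA links, you might run:
  mmchconfig maxMBpS=12000 -N rdmaNodes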

maxStatCache
The maxStatCache parameter sets aside pageable memory to cache the attributes of files that are not currently in the regular file cache. This cache improves the performance of stat() calls for applications whose working set does not fit in the regular file cache. For systems where applications test the existence or the properties of files without opening them, as backup applications do, increasing the value of maxStatCache can be beneficial.

For information about the default values of maxFilesToCache and maxStatCache, see the description of the maxStatCache attribute in the topic mmchconfig command.

In versions of IBM Storage Scale earlier than 5.0.2, the stat cache is not effective on the Linux® platform unless the Local Read-Only Cache (LROC) is configured. For more information, see the description of the maxStatCache parameter in the topic mmchconfig command.
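For example, to enlarge the stat cache on nodes that run backup or scan workloads (the node class name backupNodes and the value are illustrative), you might run:
  mmchconfig maxStatCache=50000 -N backupNodes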

nsdMaxWorkerThreads
The nsdMaxWorkerThreads parameter is used for NSD server tuning. For more information about nsdMaxWorkerThreads, see the topic mmchconfig command.
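For example, to raise the NSD worker thread limit on a hypothetical node class named nsdNodes (the value is illustrative and should be sized against the capabilities of the storage back end), you might run:
  mmchconfig nsdMaxWorkerThreads=1024 -N nsdNodes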
pagepool
The pagepool parameter is used to change the size of the data cache on each node. The default value is either one-third of the physical memory of the node or 4 GiB, whichever is smaller. This default applies to new clusters that are installed with IBM Storage Scale 5.2.0 or higher; on upgrades, the existing default value is maintained.
The maximum GPFS pagepool size depends on the value of the pagepoolMaxPhysMemPct parameter and the amount of physical memory on the node. Unlike local file systems, which use the operating system page cache to cache file data, GPFS allocates its own cache, called the page pool. The GPFS page pool is used to cache user file data and file system metadata. Along with file data, the page pool supplies memory for various types of buffers, such as prefetch and write-behind buffers. The default page pool size might be sufficient for sequential I/O workloads, but it might not be sufficient for random I/O workloads or workloads that involve many small files.
In some cases, allocating 4 GB, 8 GB, or more memory can improve workload performance. For database applications that use Direct I/O, the page pool is not used for any user data; its main purpose in this case is to cache system metadata and the indirect blocks of files. On an NSD server, if no applications or file system manager services are running, the page pool is used only transiently by the NSD worker threads to gather data from client nodes and write the data to disk. The NSD server does not cache any of the data.
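For example, to grow the page pool to 8 GiB and apply the change both immediately and persistently (the node class name clientNodes and the size are illustrative), you might run:
  mmchconfig pagepool=8G -i -N clientNodes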
readReplicaPolicy
The readReplicaPolicy parameter specifies the location from which replicas are read. The valid values are default, local, and fastest. The default value is default.
By default, GPFS reads the first replica even when there is no replica on the local disk. When the value of this parameter is set to local, the policy reads replicas from the local disk only if the local disk has the data. An NSD server on the same subnet as the client is also considered local. For performance reasons, this is the recommended setting for FPO environments.
When the value of this parameter is set to fastest, replicas are read from the disk that is considered fastest based on its read I/O statistics. In a system with both SSDs and regular disks, the value of fastestPolicyCmpThreshold can be set to a larger number, such as 100, so that GPFS refreshes the statistics of the slower disks less frequently.
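For example, in an FPO environment you might select local reads with:
  mmchconfig readReplicaPolicy=local
or, in a mixed SSD and HDD configuration, you might choose the fastest disk and reduce how often the slow-disk statistics are refreshed:
  mmchconfig readReplicaPolicy=fastest,fastestPolicyCmpThreshold=100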
restripeOnDiskFailure
The restripeOnDiskFailure parameter specifies whether GPFS attempts to recover automatically from certain common disk failure situations. The default value of this parameter is no.
Important: When you deploy FPO or when the HAWC feature is enabled, set the restripeOnDiskFailure parameter to yes.
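For example, in an FPO or HAWC deployment you might enable automatic recovery with:
  mmchconfig restripeOnDiskFailure=yes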
tiebreakerDisks
For a small cluster with up to eight nodes that have SAN-attached disk systems, define all nodes as quorum nodes and use tiebreaker disks. With more than eight nodes, use node quorum only. When you define the tiebreaker disks, you can use SAN-attached NSDs that belong to the file system. The default value of this parameter is null, which means that no tiebreaker disk is defined.
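For example, assuming three existing NSDs named nsd1, nsd2, and nsd3 (illustrative names), you might define them as tiebreaker disks, and later remove the definition, with commands such as:
  mmchconfig tiebreakerDisks="nsd1;nsd2;nsd3"
  mmchconfig tiebreakerDisks=no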
unmountOnDiskFail
The unmountOnDiskFail attribute controls how the GPFS daemon responds when it detects a disk failure. For more information, see the topic mmchconfig command.
Important:
Set the value of unmountOnDiskFail to meta in the following situations:
  • In an FPO deployment.
  • When more than one metadata and data replica is configured.
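For example, to apply the recommended setting to a hypothetical node class named fpoNodes, you might run:
  mmchconfig unmountOnDiskFail=meta -N fpoNodes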
workerThreads
The workerThreads parameter controls an integrated group of variables that tune the file system performance in environments that are capable of high sequential and random read and write workloads and small file activity.
The default value of this parameter is 256 for a base IBM Storage Scale cluster and 512 for a cluster with protocols installed. The default value for the base cluster applies to new clusters that are installed with IBM Storage Scale 5.2.0 or higher; otherwise, the existing default value is used. A valid value can be any number in the range 1 - 8192. The -N flag is valid with this parameter. This parameter controls both internal and external variables. The internal variables include maximum settings for concurrent file operations, for concurrent threads that flush dirty data and metadata, and for concurrent threads that prefetch data and metadata. You can adjust the following external variables with the mmchconfig command:
  • logBufferCount
  • preFetchThreads
  • worker3Threads
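For example, to raise the value on a hypothetical node class named protocolNodes (the node class name and value are illustrative), you might run:
  mmchconfig workerThreads=1024 -N protocolNodes
The change typically takes effect when the GPFS daemon is restarted on the affected nodes.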