This section describes some of the configuration parameters available in GPFS. Included are some notes on how they may affect performance.
These are GPFS configuration parameters that can be set cluster wide, on a specific node or sets of nodes.
To view the configuration parameters that have been changed from the default, run mmlsconfig.
To view the active value of any of these parameters, you can run mmfsadm dump config.
To change any of these parameters, use mmchconfig. For example, mmchconfig pagepool=256M changes the pagepool setting on all nodes.
Some options take effect immediately when the -i flag is passed to mmchconfig; others take effect only after the node is restarted. Refer to the GPFS 3.3 documentation for details.
|
|
pagepool
|
The pagepool parameter determines the size of the GPFS file data cache. Unlike local file systems, which use the operating system page cache to cache file data, GPFS allocates its own cache, called the pagepool. The GPFS pagepool is used to cache user file data and file system metadata. The default pagepool size of 64MB is too small for many applications, so this is a good place to start looking for performance improvements.
Along with file data, the pagepool supplies memory for various types of buffers, such as prefetch and write-behind buffers.
Sequential IO
The default pagepool size may be sufficient for sequential IO workloads; however, a value of 256MB is known to work well in many cases. To change the pagepool size, use the mmchconfig command. For example, to change the pagepool size to 256MB on all nodes in the cluster, run:
mmchconfig pagepool=256M
If the file system blocksize is larger than the default (256K), the pagepool size should be scaled accordingly. For example, if a 1M blocksize is used, the default 64M pagepool should be increased fourfold to 256M. This allows the same number of buffers to be cached.
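The scaling rule above is simple proportional arithmetic. The following is a minimal sketch in Python; the function name and its parameters are hypothetical and only illustrate the calculation, they are not part of GPFS.

```python
def scaled_pagepool_mb(blocksize_kb, base_pagepool_mb=64, base_blocksize_kb=256):
    """Scale the pagepool so it holds the same number of block-sized buffers.

    Assumption: the 64 MB / 256 KB defaults quoted in the text are the baseline.
    """
    if blocksize_kb <= base_blocksize_kb:
        # Never suggest less than the baseline pagepool.
        return base_pagepool_mb
    return base_pagepool_mb * blocksize_kb // base_blocksize_kb


print(scaled_pagepool_mb(1024))  # 1M blocksize -> 256
```

The result is only a starting point; the value would still be applied with mmchconfig, for example mmchconfig pagepool=256M.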
Random IO
The default pagepool size will likely not be sufficient for random IO or for workloads involving a large number of small files. In some cases, allocating 4GB, 8GB or more of memory can improve workload performance.
|
|
seqDiscardThreshhold
|
One parameter that affects how data is cached in the pagepool is seqDiscardThreshhold, which controls what happens when GPFS detects a sequential access pattern. When a sequential access pattern is detected, GPFS allocates an optimal number of prefetch buffers (given available memory) and does not retain any of the file data in the pagepool. This is the highest performing option when a very large file is read sequentially. The default value is 1MB, which means that if a file is read sequentially and is larger than 1MB, GPFS will not keep its data in cache. There are some instances where large files are reread often by multiple processes, data analytics for example. In some cases you can improve the performance of these applications by increasing seqDiscardThreshhold to be larger than the file you would like to cache. This tells GPFS to attempt to keep as much of the file's data in cache as possible. The value of seqDiscardThreshhold is a file size in bytes.
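Because the value is given in bytes, it is easy to get the conversion wrong. Here is a minimal Python sketch of the arithmetic; the function names and the 1 MB headroom are assumptions made for illustration, not GPFS recommendations.

```python
MB = 1024 * 1024
DEFAULT_THRESHOLD_BYTES = 1 * MB  # default quoted in the text


def seq_discard_threshold_bytes(largest_file_mb, headroom_mb=1):
    """Return a threshold in bytes just above the largest file to keep cached.

    headroom_mb is a hypothetical safety margin, not a GPFS requirement.
    """
    return (largest_file_mb + headroom_mb) * MB


def kept_in_cache(file_size_bytes, threshold_bytes=DEFAULT_THRESHOLD_BYTES):
    """Per the text, sequentially read files larger than the threshold are not retained."""
    return file_size_bytes <= threshold_bytes


threshold = seq_discard_threshold_bytes(2048)  # a 2 GB file
print(threshold)                               # 2148532224
print(kept_in_cache(2048 * MB))                # False with the 1 MB default
print(kept_in_cache(2048 * MB, threshold))     # True with the raised threshold
```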
|
|
maxMBpS
|
The maxMBpS option indicates the maximum throughput, in megabytes per second, that GPFS can submit into or out of a single node. It is not a hard limit; rather, the maxMBpS value is a hint that GPFS uses to calculate how much I/O can effectively be done for prefetch and write-behind operations. In GPFS 3.3, the default maxMBpS value is 150 and the maximum value is 5,000.
The maxMBpS value should be adjusted on each node to match the IO throughput the system is expected to support. For example, you should adjust maxMBpS for nodes that are directly attached to storage. A good rule of thumb is to set maxMBpS to twice the IO throughput required of a system. For example, if a system has two 4Gbit HBAs (400MB/sec per HBA), maxMBpS should be set to 1600. If the maxMBpS value is set too low, sequential IO performance may be reduced.
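The rule of thumb above can be written out as a short calculation. This Python sketch uses a hypothetical helper name; capping the result at 5,000 reflects the GPFS 3.3 maximum quoted in the text.

```python
def suggested_max_mbps(hba_count, mb_per_sec_per_hba, maximum=5000):
    """Twice the required node throughput, capped at the documented maximum."""
    required = hba_count * mb_per_sec_per_hba
    return min(2 * required, maximum)


print(suggested_max_mbps(2, 400))  # two 4Gbit HBAs -> 1600
```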
|
|
maxFilesToCache
|
The maxFilesToCache parameter controls how many files each node can cache. Each cached file requires memory for the inode and a token (lock).
In addition to this parameter, the maxStatCache configuration parameter controls how many files are partially cached. The default value of maxStatCache is 4 × maxFilesToCache, so by default each node can hold tokens for up to five times maxFilesToCache files, and the cluster-wide total is that figure multiplied by the number of nodes. The token manager(s) for a given file system have to keep token state for all nodes in the cluster, which should be considered when setting this value.
One thing to keep in mind is that on a large cluster, a change in the value of maxFilesToCache is greatly magnified. Doubling maxFilesToCache from the default of 1000 in a cluster with 200 nodes will increase the number of tokens a server needs to store to approximately 2,000,000 (2,000 files × 5 × 200 nodes). Therefore, on large clusters, if only a subset of nodes needs to have many open files, it is recommended that only those nodes increase the maxFilesToCache parameter. Nodes that may need an increased value for maxFilesToCache include login nodes, email servers and file servers. For systems where applications use a large number of files, of any size, increasing the value of maxFilesToCache may prove beneficial. This is particularly true for systems where a large number of small files are accessed.
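To see how quickly the token count grows, the estimate can be worked through in a few lines of Python. This is a rough sketch that assumes one token per fully or partially cached file and, unless overridden, the default maxStatCache of 4 × maxFilesToCache; the function name is hypothetical.

```python
def cluster_token_estimate(max_files_to_cache, nodes, max_stat_cache=None):
    """Upper-bound estimate of the tokens the token manager(s) must track."""
    if max_stat_cache is None:
        # Default described in the text: maxStatCache = 4 * maxFilesToCache.
        max_stat_cache = 4 * max_files_to_cache
    return (max_files_to_cache + max_stat_cache) * nodes


print(cluster_token_estimate(1000, 200))  # default settings -> 1000000
print(cluster_token_estimate(2000, 200))  # maxFilesToCache doubled -> 2000000
```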
The increased value should be large enough to handle the number of concurrently open files plus allow caching of recently used files. You can use mmpmon (see monitoring) to measure the number of files opened and closed on a GPFS file system.
Changing the value of maxFilesToCache affects the amount of memory used on the node. The amount of memory required for inodes and control data structures can be calculated as:
maxFilesToCache × 2.5 KB, where 2.5 KB = 2 KB + 512 bytes for an inode
Valid values of maxFilesToCache range from 1 to 100,000.
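As a worked example of the memory formula, the following Python sketch computes the footprint for a given maxFilesToCache. The helper name is hypothetical, and the range check simply mirrors the valid range quoted above.

```python
BYTES_PER_CACHED_FILE = 2 * 1024 + 512  # 2 KB + 512 bytes for an inode = 2.5 KB


def files_cache_memory_bytes(max_files_to_cache):
    """Memory needed for inodes and control data structures on one node."""
    if not 1 <= max_files_to_cache <= 100_000:
        raise ValueError("maxFilesToCache must be between 1 and 100,000")
    return max_files_to_cache * BYTES_PER_CACHED_FILE


print(files_cache_memory_bytes(1000))                       # 2560000
print(f"{files_cache_memory_bytes(100_000) / 2**20:.1f}")   # 244.1 (MiB)
```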
|
|
maxStatCache
|
The maxStatCache parameter sets aside additional pageable memory to cache attributes of files that are not currently in the regular file cache. This is useful to improve the performance of both the system and GPFS stat() calls for applications with a working set that does not fit in the regular file cache. The memory occupied by the stat cache can be calculated as:
maxStatCache × 176 bytes
Valid values of maxStatCache range from 0 to 10,000,000.
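The stat cache formula can be checked the same way. This is a minimal Python sketch with a hypothetical helper name; the 176-byte entry size and the valid range come from the text above.

```python
BYTES_PER_STAT_ENTRY = 176


def stat_cache_memory_bytes(max_stat_cache):
    """Pageable memory occupied by the stat cache on one node."""
    if not 0 <= max_stat_cache <= 10_000_000:
        raise ValueError("maxStatCache must be between 0 and 10,000,000")
    return max_stat_cache * BYTES_PER_STAT_ENTRY


print(stat_cache_memory_bytes(4 * 1000))    # default for maxFilesToCache=1000 -> 704000
print(stat_cache_memory_bytes(10_000_000))  # largest valid setting -> 1760000000
```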
For systems where applications test the existence of files, or the properties of files, without actually opening them (as backup applications do), increasing the value of maxStatCache may prove beneficial. The default value is 4 × maxFilesToCache.
On systems where maxFilesToCache is greatly increased, it is recommended that this value be manually set to something less than 4 × maxFilesToCache. For example, if you set maxFilesToCache to 30,000, you may want to set maxStatCache to 30,000 as well.
|
|
maxBufferDescs
|
The value of maxBufferDescs defaults to the pagepool size divided by 16K (GPFS 3.2). When the pagepool is very large, this default can become larger than you will normally need and simply consume memory. When caching small files, maxBufferDescs does not need to be more than a small multiple of maxFilesToCache, since only OpenFile objects can cache data blocks. If you are using a pagepool greater than 4GB, you should set this parameter manually.
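To get a feel for how the default grows with the pagepool, here is a small Python sketch of the pagepool/16K rule described above for GPFS 3.2. The function name is hypothetical, and the sketch assumes 16K means 16 × 1024 bytes.

```python
KIB = 1024
GIB = 1024 ** 3


def default_max_buffer_descs(pagepool_bytes):
    """Default maxBufferDescs as described in the text: pagepool size / 16K."""
    return pagepool_bytes // (16 * KIB)


print(default_max_buffer_descs(64 * 1024 * KIB))  # 64 MB pagepool -> 4096
print(default_max_buffer_descs(4 * GIB))          # 4 GB pagepool -> 262144
```

Comparing the second figure with a small multiple of maxFilesToCache shows why the default can be far larger than needed on nodes with a big pagepool.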
|
|
prefetchPct
|
prefetchPct defaults to 20% of the pagepool. GPFS uses this as a guideline that limits how much pagepool space is used for prefetch or write-behind buffers when sequential streams are active. The default works well for many applications. If the workload is mostly sequential (video serving or ingest, for example) with very little caching of small files or random IO, this number should be increased, up to its 60% maximum, so that each stream has more buffers available for prefetch and write-behind operations.
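The effect of prefetchPct on buffer space is a simple percentage of the pagepool. The Python sketch below uses a hypothetical helper name; the 60% ceiling comes from the text, while rejecting values of zero or below is an assumption made for the example.

```python
def prefetch_budget_mb(pagepool_mb, prefetch_pct=20):
    """Approximate pagepool space available for prefetch and write-behind buffers."""
    if not 0 < prefetch_pct <= 60:
        # Upper bound from the text; the lower bound is an assumption.
        raise ValueError("prefetchPct must be greater than 0 and at most 60")
    return pagepool_mb * prefetch_pct / 100


print(prefetch_budget_mb(256))      # default 20% of a 256 MB pagepool
print(prefetch_budget_mb(256, 60))  # same pagepool at the 60% maximum
```

With a 256 MB pagepool, this works out to roughly 51 MB at the default and roughly 154 MB at the maximum.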
|
|
Logfile
|
The Logfile size should be larger for systems with a high metadata rate, to reduce glitches when the log has to wrap. It can be as large as 16MB on file systems with a large blocksize. To set this parameter, use the -L flag on mmcrfs.
|
|
worker1threads
|
|
|