Cache usage
GPFS creates a number of cache segments on each node in the cluster. The amount of cache is controlled by three attributes.
- pagepool
- The GPFS pagepool attribute is used to cache user data and file system metadata. The pagepool attribute allows GPFS to implement read and write requests asynchronously. Increasing the size of the pagepool attribute increases the amount of data or metadata that GPFS can cache without requiring synchronous I/O. The operating system and other software that is running on the node might restrict the amount of memory available for GPFS on a particular node.
The optimal size of the pagepool attribute depends on the needs of the application and on how effectively its reaccessed data can be cached. For systems where applications access large files, reuse data, benefit from GPFS prefetching of data, or have a random I/O pattern, increasing the value for the pagepool attribute might prove beneficial. However, if the value is set too large, GPFS starts with the maximum that the system allows. Check the GPFS log to see the value that is actually in use.
To change the size of the pagepool attribute to 4 GB:
mmchconfig pagepool=4G
- maxFilesToCache
- The total number of different files that can be cached at one time. Every entry in the file cache requires some pageable memory to hold the content of the file's inode plus control data structures. This is in addition to any of the file's data and indirect blocks that might be cached in the page pool.
The total amount of memory that is required for inodes, attributes, and control data structures varies with the functions that are being used, but it can be estimated at a maximum of 10 KB per cached file.
Valid values of maxFilesToCache range from 1 through 100,000,000. For systems where the applications use many files, of any size, increasing the value for maxFilesToCache might prove beneficial. This is particularly true for systems where many small files are accessed. The value must be large enough to handle the number of concurrently open files and also allow caching of recently used files.
If the user does not specify a value for maxFilesToCache, the default value is 4000.
Note: For CES nodes where applications use more than 250,000 files, set maxFilesToCache to twice the number of files used. For example, for 250,000 files, the recommended maxFilesToCache value is 500,000.
- maxStatCache
- This parameter sets aside extra pageable memory to cache attributes of files that are not currently in the regular file cache. This is useful to improve the performance of both the system and GPFS stat() calls for applications with a working set that does not fit in the regular file cache. For systems where applications test the existence of files, or the properties of files without opening them, as backup applications do, increasing the value for maxStatCache can improve performance.
The memory that is occupied by the stat cache can be calculated as:
maxStatCache × 480 bytes
The valid range for maxStatCache is 0 - 100,000,000. If you do not specify values for maxFilesToCache and maxStatCache, the default value of maxFilesToCache is 4000 and the default value of maxStatCache is 1000. If you specify a value for maxFilesToCache but not for maxStatCache, the default value of maxStatCache is 4 * maxFilesToCache or 10000, whichever is smaller.
Note: To improve directory listing performance on CES nodes, set maxStatCache to twice the value of maxFilesToCache. For example, for a maxFilesToCache value of 500,000, the recommended maxStatCache value is 1,000,000.
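The default rules and the CES recommendations described above can be sketched in Python. This is an illustration of the documented rules, not GPFS code: the function and constant names are hypothetical, and the case where only maxStatCache is specified (here maxFilesToCache falls back to 4000) is an assumption, because the text does not describe it.

```python
# Documented defaults and valid ranges.
DEFAULT_MAX_FILES_TO_CACHE = 4000
DEFAULT_MAX_STAT_CACHE = 1000
UPPER_LIMIT = 100_000_000


def _check_range(name, value, low, high):
    """Raise ValueError if value is outside the documented range."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def effective_cache_counts(max_files_to_cache=None, max_stat_cache=None):
    """Return (maxFilesToCache, maxStatCache) after applying the default rules."""
    if max_files_to_cache is not None:
        _check_range("maxFilesToCache", max_files_to_cache, 1, UPPER_LIMIT)
    if max_stat_cache is not None:
        _check_range("maxStatCache", max_stat_cache, 0, UPPER_LIMIT)

    if max_files_to_cache is None and max_stat_cache is None:
        return DEFAULT_MAX_FILES_TO_CACHE, DEFAULT_MAX_STAT_CACHE
    if max_files_to_cache is None:
        # Assumption: this case is not covered by the text.
        return DEFAULT_MAX_FILES_TO_CACHE, max_stat_cache
    if max_stat_cache is None:
        # 4 * maxFilesToCache or 10000, whichever is smaller.
        return max_files_to_cache, min(4 * max_files_to_cache, 10000)
    return max_files_to_cache, max_stat_cache


def ces_recommended_counts(files_used):
    """Return the CES recommendation: twice the files used, then twice that."""
    max_files_to_cache = 2 * files_used
    max_stat_cache = 2 * max_files_to_cache
    return max_files_to_cache, max_stat_cache


print(effective_cache_counts())                          # (4000, 1000)
print(effective_cache_counts(max_files_to_cache=1000))   # (1000, 4000)
print(effective_cache_counts(max_files_to_cache=50000))  # (50000, 10000)
print(ces_recommended_counts(250000))                    # (500000, 1000000)
```

The last call reproduces the example figures from the two notes: 250,000 files lead to a maxFilesToCache of 500,000 and a maxStatCache of 1,000,000.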
The total amount of memory GPFS uses to cache file data and metadata is the sum of the pagepool, the memory that is required to hold inodes and control data structures (maxFilesToCache × 10 KB), and the memory for the stat cache (maxStatCache × 480 bytes). The combined amount of memory to hold inodes, control data structures, and the stat cache is limited to 50% of the physical memory on a node that is running GPFS.
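As a rough planning aid, the arithmetic above can be sketched in Python. The function names are hypothetical, 1 KB is assumed to mean 1024 bytes, and the 10 KB per file figure is the maximum estimate given earlier, so the result is an upper bound rather than a measurement. The physical memory size is supplied by the caller and is not read from the node.

```python
KIB = 1024                          # assumption: 1 KB = 1024 bytes
GIB = 1024 ** 3
BYTES_PER_CACHED_FILE = 10 * KIB    # maximum estimate per cached file
BYTES_PER_STAT_ENTRY = 480          # per stat cache entry


def estimate_cache_memory(pagepool_bytes, max_files_to_cache, max_stat_cache):
    """Return an upper-bound breakdown of GPFS cache memory, in bytes."""
    file_cache = max_files_to_cache * BYTES_PER_CACHED_FILE
    stat_cache = max_stat_cache * BYTES_PER_STAT_ENTRY
    return {
        "pagepool": pagepool_bytes,
        "file_cache": file_cache,
        "stat_cache": stat_cache,
        "total": pagepool_bytes + file_cache + stat_cache,
    }


def within_metadata_limit(estimate, physical_memory_bytes):
    """Check that file cache plus stat cache fits in 50% of physical memory."""
    metadata = estimate["file_cache"] + estimate["stat_cache"]
    return metadata <= physical_memory_bytes // 2


# Example: 4 GiB pagepool with the CES example values from this section.
estimate = estimate_cache_memory(4 * GIB, 500_000, 1_000_000)
print(f"estimated total: {estimate['total'] / GIB:.2f} GiB")
print("fits on a 64 GiB node:", within_metadata_limit(estimate, 64 * GIB))
print("fits on an 8 GiB node:", within_metadata_limit(estimate, 8 * GIB))
```

Note that the 50% check covers only the file cache and stat cache terms. The pagepool is excluded from that check because the limit, as stated, applies to the memory for inodes, control data structures, and the stat cache.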
During configuration, you can specify the maxFilesToCache, maxStatCache, and pagepool attributes that control how much cache is dedicated to GPFS. These values can be changed later, so experiment with larger values to find the optimum cache size that improves GPFS performance without negatively affecting other applications.
For more information about these cache settings for GPFS, see GPFS and memory.