Optional IBM Storage Scale configuration and tuning
Note: A restart of IBM Storage Scale is needed to bring into effect any configuration option changes that do not successfully complete with the -i (immediate) option of mmchconfig. For example, changing minMissedPingTimeout requires a restart (see the examples after the table).

The following configuration options can be changed by using mmchconfig according to the needs of the system workload:
| Configuration option | Default | Recommended | Comment |
|---|---|---|---|
| metadataDiskWaitTimeForRecovery | 2400 (seconds) | See comment | Effective when restripeOnDiskFailure is set to yes. The default value is 40 minutes. This value must be long enough to cover the reboot time of the node with the metadata disk. |
| dataDiskWaitTimeForRecovery | 3600 (seconds) | See comment | Effective when restripeOnDiskFailure is set to yes. The default value is 60 minutes. This value must be long enough to cover the reboot time of the node with the data disk. |
| syncBuffsPerIteration | 100 | 100 (default) | It is recommended not to change the default value; substantial improvements from tuning this value have not been observed. |
| minMissedPingTimeout | 3 (seconds) | 10-60 (seconds) | Sets the lower bound on a missed ping timeout. For FPO clusters, a longer grace period before marking a node as dead is desirable, because marking a node dead affects all of its associated disks. In addition, MapReduce workloads can keep the CPU busy enough to delay ping responses. A longer timeout, however, delays recovery. A value of 10-60 seconds generally provides a good balance between the time to detect real failures and the rate of false failure detections triggered by ping responses delayed by CPU or network overload. |
| leaseRecoveryWait | 35 (seconds) | 65 (seconds) | Set a value lower than 65 for more rapid recovery from failures, but do not set a value lower than 35. |
| prefetchPct | 20 (% of pagepool) | See comment | Used by IBM Storage Scale as a guideline to limit the page pool space used for prefetch or write-behind buffers. For MapReduce workloads, which are generally sequential reads and writes, increase this parameter up to 60% of the pagepool size. |
| maxFilesToCache | 4000 | 100000 | Specifies the number of inodes to cache. Caching the inode of a file permits faster re-access to the file while retrieving location information for data blocks. Increasing this number can improve throughput for workloads with high file reuse, as is the case with Hadoop MapReduce tasks. However, increasing it excessively can cause paging at the file system manager node. The value must be large enough to handle the number of concurrently open files and allow caching of recently used files. |
| nsdInlineWriteMax | 1,024 (bytes) | 1,000,000 (or the default tuning value) | Defines the amount of data sent in line with write requests to the NSD server, which reduces the overhead caused by internode communication. |
| nsdThreadsPerDisk | 3 | 8 | For NSDs that are each a single SATA/SAS disk, set 8 for file systems with a 2 MB block size and 16 for file systems with a 1 MB block size. For NSDs that are LUNs, each containing multiple physical disks, increase this value in proportion to the number of drives per NSD. |
| nsdSmallThreadRatio | 0 | 2 | The ratio of the number of small-I/O threads to the number of large-I/O threads. Change this to 2 for most workloads. |
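The following is a minimal sketch of the two change modes described in the note above. Whether a given attribute takes effect with -i can vary by IBM Storage Scale release, so check the mmchconfig documentation for your version; the pagepool size shown is only an illustrative value, not part of the recommendations in this table.

```
# Attributes that support the -i option take effect immediately
# and persist across restarts (the 8G value is illustrative):
mmchconfig pagepool=8G -i

# minMissedPingTimeout does not take effect with -i; set it, then
# restart IBM Storage Scale on all nodes to activate the change:
mmchconfig minMissedPingTimeout=60
mmshutdown -a
mmstartup -a
```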
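Several of the recommended values from the table can also be applied in a single call, because mmchconfig accepts a comma-separated attribute list. The values below are illustrative picks from the recommended ranges (for example, 60 seconds for minMissedPingTimeout); adjust them to your workload.

```
# Apply several of the table's recommended values cluster-wide:
mmchconfig minMissedPingTimeout=60,leaseRecoveryWait=65,\
maxFilesToCache=100000,nsdInlineWriteMax=1000000,\
nsdThreadsPerDisk=8,nsdSmallThreadRatio=2

# List the resulting configuration to verify the changes:
mmlsconfig

# Restart to activate any attributes that did not take effect online:
mmshutdown -a
mmstartup -a
```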