mmchconfig command
Changes GPFS™ configuration parameters.
Synopsis
mmchconfig Attribute=value[,Attribute=value...] [-i | -I]
[-N {Node[,Node...] | NodeFile | NodeClass}]
Availability
Available on all IBM Spectrum Scale™ editions.
Description
Use the mmchconfig command to change the GPFS configuration attributes on a single node, a set of nodes, or globally for the entire cluster.
- If you are increasing both pagepool and maxblocksize, specify pagepool first (see the examples following this list).
- If you are decreasing both values, specify maxblocksize first.
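For example, the following commands sketch both orderings; the sizes shown are illustrative only:
mmchconfig pagepool=4G,maxblocksize=2M
mmchconfig maxblocksize=512K,pagepool=1G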
Results
The configuration is updated on each node in the GPFS cluster.
Parameters
- -I
- Specifies that the changes take effect immediately, but do not
persist when GPFS is restarted.
This option is valid only for the following attributes:
- deadlockBreakupDelay
- deadlockDataCollectionDailyLimit
- deadlockDataCollectionMinInterval
- deadlockDetectionThreshold
- deadlockDetectionThresholdForShortWaiters
- deadlockDetectionThresholdIfOverloaded
- deadlockOverloadThreshold
- dmapiEventTimeout
- dmapiMountTimeout
- dmapiSessionFailureTimeout
- expelDataCollectionDailyLimit
- expelDataCollectionMinInterval
- fastestPolicyCmpThreshold
- fastestPolicyMaxValidPeriod
- fastestPolicyMinDiffPercent
- fastestPolicyNumReadSamples
- fileHeatLossPercent
- fileHeatPeriodMinutes
- lrocData
- lrocDataMaxFileSize
- lrocDataStubFileSize
- lrocDirectories
- lrocInodes
- maxMBpS
- nsdBufSpace
- pagepool
- pitWorkerThreadsPerNode
- readReplicaPolicy
- systemLogLevel
- unmountOnDiskFail
- verbsRdmaRoCEToS
- verbsRdmasPerConnection
- verbsRdmasPerNode
- verbsSendBufferMemoryMB
- worker1Threads (only when adjusting value down)
- -i
- Specifies that the changes take effect immediately and are permanent.
This option is valid only for the following attributes:
- cesSharedRoot
- cnfsGrace
- cnfsMountdPort
- cnfsNFSDprocs
- cnfsReboot
- cnfsSharedRoot
- cnfsVersions
- dataDiskWaitTimeForRecovery
- deadlockBreakupDelay
- deadlockDataCollectionDailyLimit
- deadlockDataCollectionMinInterval
- deadlockDetectionThreshold
- deadlockDetectionThresholdForShortWaiters
- deadlockDetectionThresholdIfOverloaded
- deadlockOverloadThreshold
- disableInodeUpdateOnFDatasync
- dmapiEventTimeout
- dmapiMountTimeout
- dmapiSessionFailureTimeout
- expelDataCollectionDailyLimit
- expelDataCollectionMinInterval
- fastestPolicyCmpThreshold
- fastestPolicyMaxValidPeriod
- fastestPolicyMinDiffPercent
- fastestPolicyNumReadSamples
- fileHeatLossPercent
- fileHeatPeriodMinutes
- forceLogWriteOnFdatasync
- lrocData
- lrocDataMaxFileSize
- lrocDataStubFileSize
- lrocDirectories
- lrocInodes
- maxDownDisksForRecovery
- maxFailedNodesForRecovery
- maxMBpS
- metadataDiskWaitTimeForRecovery
- minDiskWaitTimeForRecovery
- nsdBufSpace
- pagepool
- pitWorkerThreadsPerNode
- readReplicaPolicy
- restripeOnDiskFailure
- systemLogLevel
- unmountOnDiskFail
- verbsRdmaRoCEToS
- verbsRdmasPerConnection
- verbsRdmasPerNode
- verbsSendBufferMemoryMB
- worker1Threads (only when adjusting value down)
- -N {Node[,Node...] | NodeFile | NodeClass}
- Specifies the set of nodes to which the configuration changes
apply. The default is -N all.
For information on how to specify node names, see the topic Specifying nodes as inputs to GPFS commands in the IBM Spectrum Scale: Advanced Administration Guide.
To see a complete list of the attributes for which the -N flag is valid, see the list of node names allowed in the topic Changing the GPFS cluster configuration data in the IBM Spectrum Scale: Advanced Administration Guide.
This command does not support a NodeClass of mount.
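For example, to change the page pool on two specific nodes only (the node names are illustrative):
mmchconfig pagepool=2G -N c21f1n18,c21f1n19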
- Attribute=value
- Specifies the name of the attribute to be changed and its associated value.
More than one attribute and value pair can be specified. To restore
the GPFS default setting for
an attribute, specify DEFAULT as its value.
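For example, to restore the maxMBpS attribute to its default value (the attribute is chosen for illustration):
mmchconfig maxMBpS=DEFAULT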
This command accepts the following attributes:
- adminMode
- Specifies
whether all nodes in the cluster are used for issuing GPFS administration commands or just a subset
of the nodes. Valid values are:
- allToAll
- Indicates that all nodes in the cluster are used for running GPFS administration commands and that all nodes are able to execute remote commands on any other node in the cluster without the need of a password.
- central
- Indicates that only a subset of the nodes is used for running GPFS commands and that only those nodes are able to execute remote commands on the rest of the nodes in the cluster without the need of a password.
For additional information, see the following IBM Spectrum Scale: Administration and Programming Reference topic: Requirements for administering a GPFS file system.
- afmAsyncDelay
- Specifies (in seconds) the amount of time by which write operations
are delayed (because write operations are asynchronous with respect
to remote clusters). For write-intensive applications that keep writing
to the same set of files, this delay is helpful because it replaces
multiple writes to the home cluster with a single write containing
the latest data. However, setting a very high value weakens the consistency
of data on the remote cluster.
This configuration parameter is applicable only for writer caches (SW and IW), where data from cache is pushed to home.
Valid values are 1 through 2147483647. The default is 15.
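For example, to lengthen the delay for a write-intensive workload (the value shown is illustrative):
mmchconfig afmAsyncDelay=60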
- afmDirLookupRefreshInterval
- Controls the frequency of data revalidations that are triggered
by such lookup operations as ls or stat (specified
in seconds). When a lookup operation is done on a directory, if the
specified amount of time has passed, AFM sends a message to the home
cluster to find out whether the metadata of that directory has been
modified since the last time it was checked. If the time interval
has not passed, AFM does not check the home cluster for updates to
the metadata.
Valid values are 0 through 2147483647. The default is 60. In situations where home cluster data changes frequently, a value of 0 is recommended.
- afmDirOpenRefreshInterval
- Controls the frequency of data revalidations that are triggered
by such I/O operations as read or write (specified
in seconds). After a directory has been cached, open requests
resulting from I/O operations on that object are directed to the cached
directory until the specified amount of time has passed. Once the
specified amount of time has passed, the open request
gets directed to a gateway node rather than to the cached directory.
Valid values are 0 through 2147483647. The default is 60. Setting a lower value guarantees a higher level of consistency.
- afmDisconnectTimeout
- Specifies the waiting period, in seconds, for detecting the status of the home cluster. If the home cluster is inaccessible, the metadata server (MDS) changes the cache state to 'disconnected'.
- afmExpirationTimeout
- Is used with afmDisconnectTimeout (which
can be set only through mmchconfig) to control
how long a network outage between the cache and home clusters can
continue before the data in the cache is considered out of sync with
home. After afmDisconnectTimeout expires,
cached data remains available until afmExpirationTimeout expires,
at which point the cached data is considered expired and cannot be
read until a reconnect occurs.
Valid values are 0 through 2147483647. The default is 300.
- afmFileLookupRefreshInterval
- Controls the frequency of data revalidations that are triggered
by such lookup operations as ls or stat (specified
in seconds). When a lookup operation is done on a file, if the specified
amount of time has passed, AFM sends a message to the home cluster
to find out whether the metadata of the file has been modified since
the last time it was checked. If the time interval has not passed,
AFM does not check the home cluster for updates to the metadata.
Valid values are 0 through 2147483647. The default is 30. In situations where home cluster data changes frequently, a value of 0 is recommended.
- afmFileOpenRefreshInterval
- Controls the frequency of data revalidations that are triggered
by such I/O operations as read or write (specified
in seconds). After a file has been cached, open requests
resulting from I/O operations on that object are directed to the cached
file until the specified amount of time has passed. Once the specified
amount of time has passed, the open request
gets directed to a gateway node rather than to the cached file.
Valid values are 0 through 2147483647. The default is 30. Setting a lower value guarantees a higher level of consistency.
- afmHardMemThreshold
- Sets a limit on the maximum amount of memory that AFM can use on each gateway node to record changes to the file system. After this limit is reached, the fileset goes into a 'dropped' state. Pending requests can accumulate and push the fileset past this limit when:
- the cache cluster is disconnected for an extended period of time, or
- the connection with the home cluster has low bandwidth.
Reboot the gateway node after you change the value.
- afmHashVersion
- Specifies an older or newer version of the gateway node hashing algorithm (for example, mmchconfig afmHashVersion=2). This can be used to minimize the impact of gateway nodes joining or leaving the active cluster by running as few recoveries as possible. Valid values are 1 or 2.
- afmNumReadThreads
- Defines the number of threads that can be used on each participating gateway node during parallel reads. The default value of this parameter is 1; that is, one reader thread will be active on every gateway node for each big read operation that qualifies for splitting per the parallel-read threshold value. The valid range of values is 1 to 64.
- afmNumWriteThreads
- Defines the number of threads that can be used on each participating gateway node during parallel write. The default value of this parameter is 1; that is, one writer thread will be active on every gateway node for each big write operation qualifying for splitting per the parallel write threshold value. Valid values can range from 1 to 64.
- afmParallelReadChunkSize
- Defines the minimum chunk size of the read that needs to be distributed among the gateway nodes during parallel reads. Values are interpreted in terms of bytes. The default value of this parameter is 128 MB, and the valid range of values is 0 to 2147483647. It can be changed cluster wide with the mmchconfig command. It can be set at fileset level using mmcrfileset or mmchfileset commands.
- afmParallelReadThreshold
- Defines the threshold beyond which parallel reads become effective. Reads are split into chunks when file size exceeds this threshold value. Values are interpreted in terms of MB. The default value is 1024 MB. The valid range of values is 0 to 2147483647. It can be changed cluster wide with the mmchconfig command. It can be set at fileset level using mmcrfileset or mmchfileset commands.
- afmParallelWriteChunkSize
- Defines the minimum chunk size of the write that needs to be distributed among the gateway nodes during parallel writes. Values are interpreted in terms of bytes. The default value of this parameter is 128 MB, and the valid range of values is 0 to 2147483647. It can be changed cluster wide with the mmchconfig command. It can be set at fileset level using mmcrfileset or mmchfileset commands.
- afmParallelWriteThreshold
- Defines the threshold beyond which parallel writes become effective. Writes are split into chunks when file size exceeds this threshold value. Values are interpreted in terms of MB. The default value of this parameter is 1024 MB, and the valid range of values is 0 to 2147483647. It can be changed cluster wide with the mmchconfig command. It can be set at fileset level using mmcrfileset or mmchfileset commands.
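For example, a sketch that lowers the parallel-write threshold and raises the per-gateway write thread count; the values shown are illustrative, not recommendations:
mmchconfig afmParallelWriteThreshold=512,afmNumWriteThreads=4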
- afmReadSparseThreshold
- Specifies the size in MB for files in cache beyond which sparseness is maintained. For all files below the specified threshold, sparseness is not maintained.
- afmSecondaryRW
- Specifies whether the secondary is read-write.
- yes
- Specifies that the secondary is read-write.
- no
- Specifies that the secondary is not read-write.
- afmShowHomeSnapshot
- Controls the visibility of the home snapshot directory in cache.
For this to be visible in cache, this variable has to be set to yes,
and the snapshot directory name in cache and home should not be the
same.
- yes
- Specifies that the home snapshot link directory is visible.
- no
- Specifies that the home snapshot link directory is not visible.
See also the topic about peer snapshots in the IBM Spectrum Scale: Advanced Administration Guide.
- atimeDeferredSeconds
- Controls
the update behavior of atime when the relatime option
is enabled. The default value is 86400 seconds (24 hours). A value
of 0 effectively disables relatime and causes
the behavior to be the same as the atime setting.
For more information, see the topic GPFS-specific mount options in the IBM Spectrum Scale: Advanced Administration Guide.
- autoload
- Starts GPFS automatically whenever the
nodes are rebooted. Valid values are yes or no.
The -N flag is valid for this attribute.
- automountDir
- Specifies the directory to be used by the Linux automounter for GPFS file systems that are being mounted automatically. The default directory is /gpfs/automountdir. This parameter does not apply to AIX® and Windows environments.
- cesSharedRoot
- Specifies
a directory in a GPFS file system
to be used by the Cluster Export Services (CES) subsystem.
GPFS must be down on all CES nodes in the cluster when changing the cesSharedRoot attribute.
- cipherList
- Sets
the security mode for the cluster. The security mode determines the
level of the security that the cluster provides for communications
between nodes in the cluster and also for communications with other
clusters. There are three security modes:
- EMPTY
- The sending node and the receiving node do not authenticate each other, do not encrypt transmitted data, and do not check data integrity.
- AUTHONLY
- The sending and receiving nodes authenticate each other, but they do not encrypt transmitted data and do not check data integrity. This mode is the default in IBM Spectrum Scale V4.2 or later.
- Cipher
- The sending and receiving nodes authenticate each other, encrypt transmitted data, and check data integrity. To set this mode, you must specify the name of a supported cipher, such as AES128-GCM-SHA256.
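For example, to set the cluster security mode to AUTHONLY:
mmchconfig cipherList=AUTHONLY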
- cnfsGrace
- Specifies
the number of seconds a CNFS node will deny new client requests after
a node failover or failback, to allow clients with existing locks
to reclaim them without the possibility of some other client being
granted a conflicting access. For v3, only new lock requests are denied.
For v4, new lock, read and write requests are rejected. Note that
the cnfsGrace value also determines the
time period for the server lease.
Valid values are 10 through 600. The default is 90 seconds. A short grace period allows fast server failover, but it comes at the cost of increased load on the server for lease renewal.
GPFS must be down on all CNFS nodes in the cluster when changing the cnfsGrace attribute.
- cnfsMountdPort
- Specifies the port number to be used for rpc.mountd. See the IBM Spectrum Scale: Advanced Administration Guide for restrictions and additional information.
- cnfsNFSDprocs
- Specifies the number of nfsd kernel threads. The default is 32.
- cnfsReboot
- Specifies
whether the node will reboot when CNFS monitoring detects an unrecoverable
problem that can only be handled by node failover.
Valid values are yes or no. The default is yes, which is the recommended setting. If node reboot is not desired for other reasons, note that clients that were communicating with the failing node are likely to get errors or hang. CNFS failover is only guaranteed with cnfsReboot enabled.
The -N flag is valid for this attribute.
- cnfsSharedRoot
- Specifies
a directory in a GPFS file system
to be used by the clustered NFS subsystem.
GPFS must be down on all CNFS nodes in the cluster when changing the cnfsSharedRoot attribute.
See the IBM Spectrum Scale: Advanced Administration Guide for restrictions and additional information.
- cnfsVersions
- Specifies
a comma-separated list of protocol versions that CNFS should start
and monitor.
The default is 3,4.
GPFS must be down on all CNFS nodes in the cluster when changing the cnfsVersions attribute.
See the IBM Spectrum Scale: Advanced Administration Guide for additional information.
- dataDiskWaitTimeForRecovery
- Specifies
a period of time, in seconds, during which the recovery of dataOnly disks
is suspended to give the disk subsystem a chance to correct itself.
This parameter is taken into account when the affected disks belong
to a single failure group. If more than one failure group is affected,
the delay is based on the value of minDiskWaitTimeForRecovery.
Valid values are between 0 and 3600 seconds. The default is 3600. If restripeOnDiskFailure is no, dataDiskWaitTimeForRecovery has no effect.
- deadlockBreakupDelay
- Specifies
how long to wait after a deadlock is detected before attempting to
break up the deadlock. Enough time must be provided to allow the debug
data collection to complete.
The default is 0, which means that the automated deadlock breakup is disabled. A positive value will enable the automated deadlock breakup. If automated deadlock breakup is to be enabled, a delay of 300 seconds or longer is recommended.
- deadlockDataCollectionDailyLimit
- Specifies the maximum number of times
that debug data can be collected every 24 hours.
The default is 10. If the value is 0, then no debug data is collected when a potential deadlock is detected.
- deadlockDataCollectionMinInterval
- Specifies
the minimum interval between two consecutive collections of debug
data.
The default is 300 seconds.
- deadlockDetectionThreshold
- Specifies the deadlock
detection threshold. A suspected deadlock is detected when a waiter
waits longer than deadlockDetectionThreshold.
The default is 300 seconds. If the value is 0, then automated deadlock detection is disabled.
- deadlockDetectionThresholdForShortWaiters
- Specifies
the deadlock detection threshold for short waiters that should never
be long.
The default is 60 seconds.
- deadlockDetectionThresholdIfOverloaded
- Specifies
the deadlock detection threshold to use when a cluster is overloaded.
The default is 1800 seconds.
- deadlockOverloadThreshold
- Specifies
the threshold for detecting a cluster overload condition.
The default is 5 seconds. If the value is 0, then overload detection is disabled.
- defaultHelperNodes
- For
commands that distribute work among a set of nodes, the defaultHelperNodes parameter
specifies the nodes to be used. When specifying values, follow the
rules described for the -N parameter.
To override this setting when using such commands, explicitly specify the helper nodes with -N.
The commands that use -N for this purpose are the following: mmadddisk, mmapplypolicy, mmbackup, mmchdisk, mmcheckquota, mmdefragfs, mmdeldisk, mmdelsnapshot, mmfileid, mmfsck, mmimgbackup, mmimgrestore, mmrestorefs, mmrestripefs, and mmrpldisk.
For general information on how to specify node names, see Specifying nodes as input to GPFS commands in the IBM Spectrum Scale: Administration and Programming Reference.
- defaultMountDir
- Specifies the default parent directory for GPFS file systems. The default value is /gpfs. If an explicit mount directory is not provided with the mmcrfs, mmchfs, or mmremotefs command, the default mount point is set to DefaultMountDir/DeviceName.
- disableInodeUpdateOnFDatasync
- Controls the inode update on fdatasync for mtime and atime updates. Valid values are yes or no.
When disableInodeUpdateOnFDatasync is set to yes, the inode object is not updated on disk for mtime and atime updates on fdatasync() calls. File size updates are always synced to the disk.
When disableInodeUpdateOnFDatasync is set to no, the inode object is updated with the current mtime on fdatasync() calls. This is the default.
- dmapiDataEventRetry
- Controls
how GPFS handles data events
that are enabled again immediately after the event is handled by the
DMAPI application. Valid values are as follows:
- -1
- Specifies that GPFS always regenerates the event as long as it is enabled. This value should only be used when the DMAPI application recalls and migrates the same file in parallel by many processes at the same time.
- 0
- Specifies to never regenerate the event. This value should not be used if a file could be migrated and recalled at the same time.
- RetryCount
- Specifies the number of times the data event should be retried. The default is 2.
For further information regarding DMAPI for GPFS, see the IBM Spectrum Scale: Data Management API Guide.
- dmapiEventTimeout
- Controls
the blocking of file operation threads of NFS, while in the kernel
waiting for the handling of a DMAPI synchronous event. The parameter
value is the maximum time, in milliseconds, the thread blocks. When
this time expires, the file operation returns ENOTREADY,
and the event continues asynchronously. The NFS server is expected
to repeatedly retry the operation, which eventually finds the response
of the original event and continues. This mechanism applies only to
read, write, and truncate event types, and only when such events come
from NFS server threads. The timeout value is given in milliseconds.
The value 0 indicates immediate timeout (fully asynchronous event).
A value greater than or equal to 86400000 (which is 24 hours) is considered infinity (no
timeout, fully synchronous event). The default value is 86400000.
For further information regarding DMAPI for GPFS, see the IBM Spectrum Scale: Data Management API Guide.
The -N flag is valid for this attribute.
- dmapiMountEvent
- Controls
the generation of the mount, preunmount,
and unmount events. Valid values are:
- all
- mount, preunmount, and unmount events are generated on each node. This is the default behavior.
- SessionNode
- mount, preunmount, and unmount events are generated on each node and are delivered to the session node, but the session node does not deliver the event to the DMAPI application unless the event is originated from the SessionNode itself.
- LocalNode
- mount, preunmount, and unmount events are generated only if the node is a session node.
The -N flag is valid for this attribute.
For further information regarding DMAPI for GPFS, see the IBM Spectrum Scale: Data Management API Guide.
- dmapiMountTimeout
- Controls
the blocking of mount operations, waiting
for a disposition for the mount event to be set. This timeout is activated,
at most once on each node, by the first external mount of a file system
that has DMAPI enabled, and only if there has never before been a
mount disposition. Any mount operation on
this node that starts while the timeout period is active waits for
the mount disposition. The parameter value is the maximum time, in
seconds, that the mount operation waits
for a disposition. When this time expires and there is still no disposition
for the mount event, the mount operation
fails, returning the EIO error. The timeout
value is given in full seconds. The value 0 indicates immediate timeout
(immediate failure of the mount operation). A value greater than or
equal to 86400 (which is 24 hours) is considered infinity (no
timeout, indefinite blocking until there is a disposition). The default
value is 60.
The -N flag is valid for this attribute.
For further information regarding DMAPI for GPFS, see the IBM Spectrum Scale: Data Management API Guide.
- dmapiSessionFailureTimeout
- Controls
the blocking of file operation threads, while in the kernel, waiting
for the handling of a DMAPI synchronous event that is enqueued on
a session that has experienced a failure. The parameter value is the
maximum time, in seconds, the thread waits for the recovery of the
failed session. When this time expires and the session has not yet
recovered, the event is cancelled and the file operation fails, returning
the EIO error. The timeout value is given
in full seconds. The value 0 indicates immediate timeout (immediate
failure of the file operation). A value greater than or equal to 86400
(which is 24 hours) is considered infinity (no
timeout, indefinite blocking until the session recovers). The default
value is 0.
For further information regarding DMAPI for GPFS, see the IBM Spectrum Scale: Data Management API Guide.
The -N flag is valid for this attribute.
- enableIPv6
- Controls
whether the GPFS daemon communicates
through the IPv6 network. The following values are valid:
- no
- Specifies that the GPFS daemon does not communicate through the IPv6 network. This is the default.
- yes
- Specifies that the GPFS daemon communicates through the IPv6 network. yes requires that the daemon be down on all nodes.
- prepare
- After the command completes, the daemons can be recycled on all nodes at a time chosen by the user (before proceeding to run the command with commit specified).
- commit
- Verifies that all currently active daemons have received the new value, allowing the user to add IPv6 nodes to the cluster.
Note: Before changing the value of enableIPv6, the GPFS daemon on the primary configuration server must be inactive. After changing the parameter, the GPFS daemon on the rest of the nodes in the cluster should be recycled; this can be done one node at a time. To use IPv6 addresses for GPFS, the operating system must be properly configured as IPv6 enabled, and IPv6 addresses must be configured on all the nodes within the cluster.
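For example, a sketch of the staged approach, which avoids shutting down the entire cluster at once:
mmchconfig enableIPv6=prepare
After recycling the GPFS daemon on each node, one node at a time, complete the change with:
mmchconfig enableIPv6=commit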
- enforceFilesetQuotaOnRoot
- Controls whether fileset quotas should be enforced for the root user the same way as for any other users. Valid values are yes or no. The default is no.
- expelDataCollectionDailyLimit
- Specifies
the maximum number of times that debug data associated with expelling
nodes can be collected in a 24-hour period. Sometimes exceptions are
made to help capture the most relevant debug data.
The default is 10. If the value is 0, then no expel-related debug data is collected.
- expelDataCollectionMinInterval
- Specifies
the minimum interval, in seconds, between two consecutive expel-related
data collection attempts on the same node.
The default is 120 seconds.
- failureDetectionTime
- Indicates to GPFS the amount of time it takes to detect that
a node has failed.
GPFS must be down on all the nodes when changing the failureDetectionTime attribute.
- fastestPolicyCmpThreshold
- Indicates
the disk comparison count threshold, above which GPFS forces selection of this disk as the preferred
disk to read and update its current speed.
Valid values are >= 3. The default is 50.
- fastestPolicyMaxValidPeriod
- Indicates
the time period after which the disk's current evaluation is considered
invalid (even if its comparison count has exceeded the threshold)
and GPFS prefers to read this
disk in the next selection to update its latest speed evaluation.
Valid values are >= 1 in seconds. The default is 600 (10 minutes).
- fastestPolicyMinDiffPercent
- A
percentage value indicating how GPFS selects
the fastest between two disks. For example, if you use the default
fastestPolicyMinDiffPercent value of 50, GPFS selects
a disk as faster only if it is 50% faster than the other. Otherwise,
the disks remain in the existing read order.
Valid values are 0 through 100 in percentage points. The default is 50.
- fastestPolicyNumReadSamples
- Controls
how many read samples are taken to evaluate the disk's recent speed.
Valid values are 3 through 100. The default is 5.
- fileHeatLossPercent
- Specifies the reduction rate of FILE_HEAT value for every fileHeatPeriodMinutes of file inactivity. The default value is 10.
- fileHeatPeriodMinutes
- Specifies the inactivity time before a file starts to lose FILE_HEAT value. The default value is 0, which means that FILE_HEAT is not tracked.
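For example, to track file heat over a one-week inactivity period with the default 10 percent loss rate (the values are illustrative):
mmchconfig fileHeatPeriodMinutes=10080,fileHeatLossPercent=10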
- FIPS1402mode
- Controls
whether GPFS operates in FIPS
140-2 mode, which requires using a FIPS-compliant encryption module
for all encryption and decryption activity. Valid values are yes or no.
The default value is no.
For FIPS 140-2 considerations, consult the "Encryption" topic in the IBM Spectrum Scale: Advanced Administration Guide.
- forceLogWriteOnFdatasync
- Controls
forcing log writes to disk. Valid values are yes or no.
When forceLogWriteOnFdatasync is set to yes, the GPFS log record is flushed to disk every time fdatasync() is invoked. This is the default.
When forceLogWriteOnFdatasync is set to no, the GPFS log record is flushed only when a new block is written to the file.
- lrocData
- Controls whether user data is populated into the local read-only cache. Other configuration options can be used to select the data that is eligible for the local read-only cache. When using more than one such configuration option, data that matches any of the specified criteria is eligible to be saved.
- Valid values are yes or no. The default value is yes.
- If lrocData is set to yes, by default the data that was not already in the cache when accessed by a user is subsequently saved to the local read-only cache. The default behavior can be overridden using the lrocDataMaxFileSize and lrocDataStubFileSize configuration options to save all data from small files or all data from the initial portion of large files.
- lrocDataMaxFileSize
- Limits the data that may be saved in the local read-only cache to only the data from small files.
- A value of -1 indicates that all data is eligible to be saved. A value of 0 indicates that small files are not to be saved. A positive value indicates the maximum size of a file to be considered for the local read-only cache. For example, a value of 32768 indicates that files with 32 KB of data or less are eligible to be saved in the local read-only cache. The default value is 0.
- lrocDataStubFileSize
- Limits the data that may be saved in the local read-only cache to only the data from the first portion of all files.
- A value of -1 indicates that all file data is eligible to be saved. A value of 0 indicates that stub data is not eligible to be saved. A positive value indicates that the initial portion of each file that is eligible is to be saved. For example, a value of 32768 indicates that the first 32 KB of data from each file is eligible to be saved in the local read-only cache. The default value is 0.
- lrocDirectories
- Controls whether directory blocks are populated into the local read-only cache. This option also controls the caching of other file system metadata, such as indirect blocks, symbolic links, and extended attribute overflow blocks.
- Valid values are yes or no. The default value is yes.
- lrocInodes
- Controls whether inodes from open files are populated into the local read-only cache; the cache contains the full inode, including all disk pointers, extended attributes, and data.
- Valid values are yes or no. The default value is yes.
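For example, a sketch that limits local read-only cache data to files of 32 KB or less, taking effect immediately and persisting across restarts:
mmchconfig lrocData=yes,lrocDataMaxFileSize=32768 -i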
- maxblocksize
- Changes
the maximum file system block size. Valid block sizes are 64 KiB,
128 KiB, 256 KiB, 512 KiB, 1 MiB, 2 MiB, 4 MiB, 8 MiB, and 16 MiB.
The default maximum block size is 1 MiB. Specify this value with the
character K or M;
for example, use 2M to specify a block size
of 2 MiB.
File systems with block sizes larger than the specified value cannot be created or mounted unless the block size is increased.
GPFS must be down on all the nodes in the cluster when changing the maxblocksize attribute.
The -N flag is valid for this attribute.
- maxDownDisksForRecovery
- Specifies
the maximum number of disks that may experience a failure and still
be subject to an automatic recovery attempt. If this value is exceeded,
no automatic recovery actions take place.
Valid values are between 0 and 300. The default is 16. If restripeOnDiskFailure is no, maxDownDisksForRecovery has no effect.
- maxFailedNodesForRecovery
- Specifies
the maximum number of nodes that may be unavailable before automatic
disk recovery actions are cancelled.
Valid values are between 0 and 300. The default is 3. If restripeOnDiskFailure is no, maxFailedNodesForRecovery has no effect.
- maxFcntlRangesPerFile
- Specifies the number of fcntl locks that are allowed per file. The default is 200. The minimum value is 10 and the maximum value is 200000.
- maxFilesToCache
- Specifies
the number of inodes to cache for recently used files that have been
closed.
Storing the inode of a file in cache permits faster re-access to the file. The default is 4000, but increasing this number may improve throughput for workloads with high file reuse. However, increasing this number excessively may cause paging at the file system manager node. The value should be large enough to handle the number of concurrently open files plus allow caching of recently used files.
The -N flag is valid for this attribute.
- maxMBpS
- Specifies
an estimate of how many megabytes of data can be transferred per second
into or out of a single node. The default is 2048 MB per second. The
value is used in calculating the amount of I/O that can be done to
effectively prefetch data for readers and write-behind data from writers.
By lowering this value, you can artificially limit how much I/O one
node can put on all of the disk servers.
The -N flag is valid for this attribute.
- maxStatCache
- Specifies
the number of inodes to keep in the stat cache. The stat cache maintains
only enough inode information to perform a query on the file system.
If the user did not specify values for maxFilesToCache and maxStatCache,
the default value of maxFilesToCache is
4000 and the default value of maxStatCache is
1000. However, if the user specified a value for maxFilesToCache but
not for maxStatCache, the default value
of maxStatCache changes to 4*maxFilesToCache.
The -N flag is valid for this attribute.
Note: The stat cache is not effective on the Linux platform. Therefore, you need to set the maxStatCache attribute to a smaller value, such as 512, on that platform.
- metadataDiskWaitTimeForRecovery
- Specifies
a period of time, in seconds, during which the recovery of metadata
disks is suspended to give the disk subsystem a chance to correct
itself. This parameter is taken into account when the affected disks
belong to a single failure group. If more than one failure group is
affected, the delay is based on the value of minDiskWaitTimeForRecovery.
Valid values are between 0 and 3600 seconds. The default is 2400. If restripeOnDiskFailure is no, metadataDiskWaitTimeForRecovery has no effect.
- minDiskWaitTimeForRecovery
- Specifies
a period of time, in seconds, during which the recovery of disks is
suspended to give the disk subsystem a chance to correct itself. This
parameter is taken into account when more than one failure group is
affected. If the affected disks belong to a single failure group,
the delay is based on the values of dataDiskWaitTimeForRecovery and metadataDiskWaitTimeForRecovery.
Valid values are between 0 and 3600 seconds. The default is 1800. If restripeOnDiskFailure is no, minDiskWaitTimeForRecovery has no effect.
- mmapRangeLock
- Specifies
POSIX or non-POSIX mmap byte-range semantics.
Valid values are yes or no (yes is
the default). A value of yes indicates POSIX
byte-range semantics apply to mmap operations.
A value of no indicates non-POSIX mmap byte-range
semantics apply to mmap operations. If you are using InterProcedural Analysis (IPA), turn this option off:
mmchconfig mmapRangeLock=no -i
This allows more lenient intranode locking, but imposes internode whole-file range tokens on files that are written through mmap.
- nistCompliance
- Controls
whether GPFS operates in the
NIST 800-131A mode. (This applies to security transport only, not
to encryption, as encryption always uses NIST-compliant mechanisms.)
Valid values are:
- off
- Specifies that there is no compliance to NIST standards. For clusters operating below the GPFS 4.1 level, this is the default.
- SP800-131A
- Specifies that security transport is to follow the NIST SP800-131A recommendations. For clusters at the GPFS 4.1 level or higher, this is the default.
Note: In a remote cluster setup, all clusters must have the same nistCompliance value.
- noSpaceEventInterval
- Specifies the minimum time interval between calls to a callback script for two noDiskSpace events of a file system. The default value is 120 seconds. If this value is set to zero, the noDiskSpace event is generated every time the file system encounters it. The noDiskSpace event is generated when a callback script is registered for this event with the mmaddcallback command.
- nsdBufSpace
- This
option specifies the percentage of the page pool reserved for the
network transfer of NSD requests. Valid values are within the range
of 10 to 70. The default value is 30. On GPFS Native
RAID recovery
group NSD servers, this value should be decreased to its minimum of
10, since vdisk-based NSDs are served directly from the RAID buffer
pool (as governed by nsdRAIDBufferPoolSizePct).
On all other NSD servers, increasing either this value or the amount
of page pool, or both, could improve NSD server performance. On NSD
client-only nodes, this parameter is ignored. For more information
about GPFS Native
RAID,
see IBM
Spectrum Scale RAID: Administration.
The -N flag is valid for this attribute.
- nsdRAIDTracks
- This
option specifies the number of tracks in the GPFS Native
RAID buffer
pool, or 0 if this node does not have a GPFS Native
RAID vdisk buffer
pool. This controls whether GPFS Native
RAID services
are configured. For more information about GPFS Native
RAID, see IBM
Spectrum Scale RAID: Administration.
Valid values are: 0; 256 or greater.
The -N flag is valid for this attribute.
- nsdRAIDBufferPoolSizePct
- This
option specifies the percentage of the page pool that is used for
the GPFS Native
RAID vdisk
buffer pool. Valid values are within the range of 10 to 90. The default
is 50 when GPFS Native
RAID is
configured on the node in question; 0 when it is not. For more information
about GPFS Native
RAID,
see IBM
Spectrum Scale RAID: Administration.
The -N flag is valid for this attribute.
- nsdServerWaitTimeForMount
- When
mounting a file system whose disks depend on NSD servers, this option
specifies the number of seconds to wait for those servers to come
up. The decision to wait is controlled by the criteria managed by
the nsdServerWaitTimeWindowOnMount option.
Valid values are between 0 and 1200 seconds. The default is 300. A value of zero indicates that no waiting is done. The interval for checking is 10 seconds. If nsdServerWaitTimeForMount is 0, nsdServerWaitTimeWindowOnMount has no effect.
The mount thread waits when the daemon delays for safe recovery. The mount wait for NSD servers to come up, which is covered by this option, occurs after the recovery wait expires and allows the mount thread to proceed.
The -N flag is valid for this attribute.
- nsdServerWaitTimeWindowOnMount
- Specifies
a window of time (in seconds) during which a mount can wait for NSD
servers as described for the nsdServerWaitTimeForMount option.
The window begins when quorum is established (at cluster startup or
subsequently), or at the last known failure times of the NSD servers
required to perform the mount.
Valid values are between 1 and 1200 seconds. The default is 600. If nsdServerWaitTimeForMount is 0, nsdServerWaitTimeWindowOnMount has no effect.
The -N flag is valid for this attribute.
When a node rejoins the cluster after having been removed for any reason, the node resets all the failure time values that it knows about. Therefore, when a node rejoins the cluster it believes that the NSD servers have not failed. From the perspective of a node, old failures are no longer relevant.
GPFS checks the cluster formation criteria first. If that check falls outside the window, GPFS then checks for NSD server fail times being within the window.
- numaMemoryInterleave
- In
a Linux NUMA environment, the
default memory policy is to allocate memory from the local NUMA node
of the CPU from which the allocation request was made. This parameter
is used to change to an interleave memory policy for GPFS by starting GPFS with numactl
--interleave=all. This parameter should be used when
the GPFS memory usage needs
to be balanced across all NUMA nodes, such as the case when the size
of the GPFS page pool exceeds
the size of any one NUMA node.
Valid values are yes and no. The default is no.
Before using this parameter, ensure that the Linux numactl package has been installed.
- pagepool
- Changes
the size of the cache on each node. The default value is either one-third
of the physical memory on the node or 1G, whichever is smaller. This
applies to new installations only; on upgrades the existing default
value is kept.
The maximum GPFS page pool size depends on the value of the pagepoolMaxPhysMemPct parameter and the amount of physical memory on the node. You can specify this value with the suffix K, M, or G, for example, 128M.
The -N flag is valid for this attribute.
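For example, to set the page pool to 4 GiB on a single node, effective immediately and persisting across restarts (the node name is illustrative):
mmchconfig pagepool=4G -N c21f1n18 -i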
- pagepoolMaxPhysMemPct
- Percentage
of physical memory that can be assigned to the page pool. Valid values
are 10 through 90 percent. The default is 75 percent (with the exception
of Windows, where the default
is 50 percent).
The -N flag is valid for this attribute.
- pitWorkerThreadsPerNode
- Controls the maximum number of threads to be involved in parallel
processing on each node that is serving as a Parallel Inode Traversal
(PIT) worker.
By default, when a command that uses the PIT engine is run, the file system manager asks all nodes in the local cluster to serve as PIT workers; however, you can specify an exact set of nodes to serve as PIT workers by using the -N option of a PIT command. Note that the current file system manager node is a mandatory participant, even if it is not in the list of nodes you specify. On each participating node, up to pitWorkerThreadsPerNode threads can be involved in parallel processing. The range of accepted values is 0 to 8192. The default value varies within the 2-16 range, depending on the file system configuration.
- prefetchThreads
- Controls
the maximum possible number of threads dedicated to prefetching data
for files that are read sequentially, or to handle sequential write-behind.
Functions in the GPFS daemon dynamically determine the actual degree of parallelism for prefetching data. The default value is 72. The minimum value is 2. The maximum value of prefetchThreads plus worker1Threads plus nsdMaxWorkerThreads is 8192 on all 64-bit platforms.
The -N flag is valid for this attribute.
- profile
- Specifies a predefined profile of attributes to be applied. System-defined
profiles are located in /usr/lpp/mmfs/profiles/. All the configuration
attributes listed under a cluster stanza are changed as a result of
this command. The following system-defined profile names are accepted:
- gpfsProtocolDefaults
- gpfsProtocolRandomIO
User-defined profiles must be installed in /var/mmfs/etc/. The profile file specifies GPFS configuration parameters with values different from the documented defaults. A user-defined profile must not begin with the string 'gpfs' and must have the .profile suffix.
User-defined profiles consist of the following stanzas:
%cluster:
[CommaSeparatedNodesOrNodeClasses:]ClusterConfigurationAttribute=Value
...
File system attributes and values are ignored.
A sample file can be found in /usr/lpp/mmfs/samples/sample.profile. See the mmchconfig command for a detailed description of the different configuration parameters. User-defined profiles should be used only by experienced administrators. When in doubt, use the mmchconfig command instead.
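For example, to apply the system-defined protocol defaults profile to the cluster:
mmchconfig profile=gpfsProtocolDefaults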
- readReplicaPolicy
- Specifies
the location from which the FPO policy is to read replicas. By default,
the FPO policy reads the first replica whether there is a replica
on the local disk or not. When readReplicaPolicy=local is
specified, the policy reads replicas from the local disk if the local
disk has data; for performance considerations, this is the recommended
setting for FPO environments. When readReplicaPolicy=fastest is
specified, the policy reads replicas from the disk considered the
fastest based on the read I/O statistics of the disk. You can tune
the way the system determines the fastest policy using the following
parameters:
- fastestPolicyNumReadSamples
- fastestPolicyCmpThreshold
- fastestPolicyMaxValidPeriod
- fastestPolicyMinDiffPercent
To return this attribute to the default setting, specify readReplicaPolicy=DEFAULT -i.
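For example, to prefer local replicas in an FPO environment, effective immediately and permanently:
mmchconfig readReplicaPolicy=local -i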
- release=LATEST
- Changes
the IBM
Spectrum Scale configuration
information to the latest format that is supported by the currently
installed level of the product. Perform this operation after you have
migrated all the nodes in the cluster to the latest level of the product. For
more information, see the topic Completing the migration to
a new level of GPFS in
the IBM
Spectrum Scale: Concepts,
Planning, and Installation Guide.
The command tries to access each node in the cluster to verify the level of the installed code. If the command cannot reach one or more nodes, you must rerun the command until it verifies the information for all the nodes.
- restripeOnDiskFailure
- Specifies
whether GPFS will attempt to
automatically recover from certain common disk failure situations.
When a disk experiences a failure and becomes unavailable, the recovery procedure will first attempt to restart the disk and if this fails, the disk is suspended and its data moved to other disks. Similarly, when a node joins the cluster, all disks for which the node is responsible are checked and an attempt is made to restart any that are in a down state.
Whether a file system is subject to a recovery attempt is determined by its maximum replication values. If the mmlsfs -M or -R value is greater than one, the recovery code is executed. The recovery actions are asynchronous, and GPFS continues its processing while the recovery attempts take place. The results of the recovery actions and any errors that are encountered are recorded in the GPFS logs.
- rpcPerfNumberDayIntervals
- Controls
the number of days that aggregated RPC data is saved. Every day the
previous 24 hours of one-hour RPC data is aggregated into a one-day
interval.
The default value for rpcPerfNumberDayIntervals is 30, which allows the previous 30 days of one-day intervals to be displayed. To conserve memory, fewer intervals can be configured to reduce the number of recent one-day intervals that can be displayed. The values that are allowed for rpcPerfNumberDayIntervals are in the range 4 - 60.
- rpcPerfNumberHourIntervals
- Controls
the number of hours that aggregated RPC data is saved. Every hour
the previous 60 minutes of one-minute RPC data is aggregated into
a one-hour interval.
The default value for rpcPerfNumberHourIntervals is 24, which allows the previous day's worth of one-hour intervals to be displayed. To conserve memory, fewer intervals can be configured to reduce the number of recent one-hour intervals that can be displayed. The values that are allowed for rpcPerfNumberHourIntervals are 4, 6, 8, 12, or 24.
- rpcPerfNumberMinuteIntervals
- Controls
the number of minutes that aggregated RPC data is saved. Every minute
the previous 60 seconds of one-second RPC data is aggregated into
a one-minute interval.
The default value for rpcPerfNumberMinuteIntervals is 60, which allows the previous hour's worth of one-minute intervals to be displayed. To conserve memory, fewer intervals can be configured to reduce the number of recent one-minute intervals that can be displayed. The values that are allowed for rpcPerfNumberMinuteIntervals are 4, 5, 6, 10, 12, 15, 20, 30, or 60.
- rpcPerfNumberSecondIntervals
- Controls
the number of seconds that aggregated RPC data is saved. Every second
RPC data is aggregated into a one-second interval.
The default value for rpcPerfNumberSecondIntervals is 60, which allows the previous minute's worth of one-second intervals to be displayed. To conserve memory, fewer intervals can be configured to reduce the number of recent one-second intervals that can be displayed. The values that are allowed for rpcPerfNumberSecondIntervals are 4, 5, 6, 10, 12, 15, 20, 30, or 60.
- rpcPerfRawExecBufferSize
- Specifies
the number of bytes to save in the buffer that stores raw RPC execution
statistics. For each RPC received by a node, 16 bytes of associated
data is saved in this buffer when the RPC completes. This circular
buffer must be large enough to hold one second's worth of raw execution
statistics.
The default value for rpcPerfRawExecBufferSize is 2M, which produces 131072 entries. Every second this data is processed, so the buffer should be 10% to 20% larger than what is needed to hold one second's worth of data.
- rpcPerfRawStatBufferSize
- Specifies
the number of bytes to save in the buffer that stores raw RPC performance
statistics. For each RPC sent to another node, 56 bytes of associated
data is saved in this buffer when the reply is received. This circular
buffer must be large enough to hold one second's worth of raw performance
statistics.
The default value for rpcPerfRawStatBufferSize is 6M, which produces 112347 entries. Every second this data is processed, so the buffer should be 10% to 20% larger than what is needed to hold one second's worth of data.
- sidAutoMapRangeLength
- Controls the length of the reserved range for Windows SID to UNIX ID mapping. See "Identity management on Windows" in the IBM Spectrum Scale: Advanced Administration Guide for additional information.
- sidAutoMapRangeStart
- Specifies the start of the reserved range for Windows SID to UNIX ID mapping. See "Identity management on Windows" in the IBM Spectrum Scale: Advanced Administration Guide for additional information.
- subnets
- Specifies
subnets used to communicate between nodes in a GPFS cluster or a remote GPFS cluster. The subnets option must use the following format:
subnets="Subnet[/ClusterName[;ClusterName...]][ Subnet[/ClusterName[;ClusterName...]]...]"
where:
- ClusterName
- Can be either a cluster name or a shell-style regular expression,
which is used to match cluster names, such as:
- CL[23].kgn.ibm.com
- Matches CL2.kgn.ibm.com and CL3.kgn.ibm.com.
- CL[0-7].kgn.ibm.com
- Matches CL0.kgn.ibm.com, CL1.kgn.ibm.com, ... CL7.kgn.ibm.com.
- CL*.ibm.com
- Matches any cluster name that starts with CL and ends with .ibm.com.
- CL?.kgn.ibm.com
- Matches any cluster name that starts with CL, is followed by any one character, and then ends with .kgn.ibm.com.
This feature cannot be used to establish fault tolerance or automatic failover. If the interface corresponding to an IP address in the list is down, GPFS does not use the next one on the list. For more information about subnets, see the IBM Spectrum Scale: Advanced Administration Guide and search on Using remote access with public and private IP addresses.
Specifying a cluster name or a cluster name pattern for each subnet is only needed when a private network is shared across clusters. If the use of a private network is confined within the local cluster, then no cluster name is required in the subnet specification.
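For example, a sketch in which the local cluster shares a private network with one remote cluster; the subnet and cluster name are illustrative:
mmchconfig subnets="192.168.2.0/CL1.kgn.ibm.com"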
- systemLogLevel
- Specifies
the minimum severity level for messages sent to the system log. The
severity levels from highest to lowest priority are: alert, critical, error, warning, notice, configuration, informational, detail,
and debug. The value specified for this
attribute can be any severity level, or the value none can
be specified so no messages are sent to the system log. The default
value is error.
GPFS generates some critical log messages that are always sent to the system logging service. This attribute only affects messages originating in the GPFS daemon (mmfsd). Log messages originating in some administrative commands will only be stored in the GPFS log file.
This attribute is only valid for Linux nodes.
- tiebreakerDisks
- Controls
whether GPFS will use the node
quorum with tiebreaker algorithm in place of the regular node-based
quorum algorithm. See the IBM
Spectrum Scale: Concepts,
Planning, and Installation Guide and search
for "node quorum with tiebreaker". To enable this feature, specify
the names of one to three disks. Separate the NSD names with semicolon
(;) and enclose the list in quotes. The disks do not have to belong
to any particular file system, but must be directly accessible from
the quorum nodes. For example:
tiebreakerDisks="gpfs1nsd;gpfs2nsd;gpfs3nsd"
To disable this feature, use:
tiebreakerDisks=no
Changing tiebreakerDisks is allowed while GPFS is running. However, if the traditional server-based (non-CCR) configuration repository is used, then when changing the tiebreakerDisks, GPFS must be down on all nodes in the cluster.
- uidDomain
- Specifies
the UID domain name for the cluster.
GPFS must be down on all the nodes when changing the uidDomain attribute.
See the IBM® white paper entitled UID Mapping for GPFS in a Multi-cluster Environment in IBM Knowledge Center.
- unmountOnDiskFail
- Controls
how the daemon responds when it detects a disk failure:
- yes
- The local node force-unmounts the file system that contains the
failed disk. Other file systems on the local node and all nodes in
the cluster continue to function normally, if they can. The local
node can remount the file system when the disk problem is resolved.
Use this setting in the following cases:
- You are using SAN-attached disks in large multinode configurations and you are not using replication.
- You have a node that hosts descOnly disks. See Establishing disaster recovery for your GPFS cluster in the IBM Spectrum Scale: Advanced Administration Guide.
- no
- The daemon marks the disk as failed, notifies all nodes that use this disk that it has failed, and continues as long as it can without using the disk. You can make the disk active again with the mmchdisk command. This setting is appropriate when the node is using metadata-and-data replication, because the cluster can work from the replica until the failed disk is active again.
- meta
- This option is like no, except that the file system remains mounted unless it cannot access any replica of the metadata.
The -N flag is valid for this attribute.
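For example, on a cluster that replicates metadata and data, the following sketch keeps file systems mounted unless no metadata replica is accessible:
mmchconfig unmountOnDiskFail=meta -i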
- usePersistentReserve
- Specifies
whether to enable or disable Persistent Reserve (PR) on the disks.
Valid values are yes or no (no is
the default). GPFS must be stopped
on all nodes when setting this attribute. To enable PR and to obtain recovery performance improvements, your cluster requires a specific environment:
- All disks must be PR-capable.
- On AIX, all disks must be hdisks; on Linux, they must be generic (/dev/sd*) or DM-MP (/dev/dm-*) disks.
- If the disks have defined NSD servers, all NSD server nodes must be running the same operating system (AIX or Linux).
- If the disks are SAN-attached to all nodes, all nodes in the cluster must be running the same operating system (AIX or Linux).
For more information, see "Reduced recovery time using Persistent Reserve" in the IBM Spectrum Scale: Concepts, Planning, and Installation Guide.
- verbsPorts
- Specifies
the InfiniBand device names and port numbers used for RDMA transfers
between an NSD client and server. You must enable verbsRdma to
enable verbsPorts. The format for verbsPorts is:
verbsPorts="Device/Port/Fabric[ Device/Port/Fabric ...]"
In this format, Device is the HCA device name (such as mthca0 or mlx4_0); Port is the one-based port number (such as 1 or 2); and Fabric is a value that identifies different InfiniBand (IB) fabrics (IB subnets on different switches).
If you do not specify a port number, GPFS uses port 1 as the default. If the fabric number is not specified, the fabric number is 0.
For example:
verbsPorts="mlx4_0/1/7 mlx4_0/2/8"
creates two RDMA connections between the NSD client and server using both ports of a dual-ported adapter, with fabric identifier 7 on port 1 and fabric identifier 8 on port 2.
Another example, without the fabric number:
verbsPorts="mthca0/1 mthca0/2"
creates two RDMA connections between the NSD client and server using both ports of a dual-ported adapter, with the fabric identifier defaulting to 0.
A third example, without port or fabric number:
verbsPorts="mlx4_0 mlx4_1"
uses port 1 on each HCA device for the RDMA connections.
The -N flag is valid for this attribute.
- verbsRdma
- Enables
or disables InfiniBand RDMA using the Verbs API for data transfers
between an NSD client and NSD server. Valid values are enable or disable.
The -N flag is valid for this attribute.
- verbsRdmaCm
- Enables
or disables the RDMA Connection Manager (RDMA CM or RDMA_CM) using
the RDMA_CM API for establishing connections between an NSD client
and NSD server. Valid values are enable or disable.
You must enable verbsRdma to enable verbsRdmaCm.
If RDMA CM is enabled for a node, the node will only be able to establish RDMA connections using RDMA CM to other nodes with verbsRdmaCm enabled. RDMA CM enablement requires IPoIB (IP over InfiniBand) with an active IP address for each port. Although IPv6 must be enabled, the GPFS implementation of RDMA CM does not currently support IPv6 addresses, so an IPv4 address must be used.
If verbsRdmaCm is not enabled when verbsRdma is enabled, the older method of RDMA connection will prevail.
The -N flag is valid for this attribute.
- verbsRdmaRoCEToS
Specifies the Type of Service (ToS) value for clusters using RDMA over Converged Ethernet (RoCE). Acceptable values for this parameter are 0, 8, 16, and 24. The default value is -1.
If the specified value is neither the default nor an acceptable value, the command exits with an error message indicating that no change has been made. However, a RoCE cluster continues to operate with an internally set ToS value of 0 even if the mmchconfig command fails. Different ToS values can be set for different nodes or groups of nodes.
The -N flag is valid for this attribute.
- verbsRdmaSend
- Enables or disables the use of InfiniBand RDMA rather than TCP for most GPFS daemon-to-daemon communication. When disabled, only data transfers between an NSD client and NSD server are eligible for RDMA. Valid values are enable or disable. The default value is disable. The verbsRdma option must be enabled for verbsRdmaSend to have any effect.
- verbsRdmasPerConnection
- Sets the maximum number of simultaneous RDMA data transfer requests allowed per connection. The default value is zero; however, if this option is defaulted or set to zero, a value of 8 is used.
- verbsRdmasPerNode
- Sets the maximum number of simultaneous RDMA data transfer requests allowed per node. The default value is zero; however, if this option is defaulted or set to zero, a value equal to the nsdMaxWorkerThreads setting is used.
- verbsSendBufferMemoryMB
Sets the amount of page pool memory (in MiB) to reserve as dedicated buffer space for use by the verbsRdmaSend feature. If the value is unreasonably small or large (for example, larger than pagepool), the actual memory used is adjusted to a more appropriate value. If the value is zero (the default), a value is calculated based on the maximum number of RDMAs allowed per node (verbsRdmasPerNode). This option has no effect unless verbsRdmaSend is enabled.
- workerThreads
- Controls an integrated group of variables that tune file system
performance. Use this variable to tune file systems in environments
that are capable of high sequential or random read/write workloads
or small-file activity. For new installations of the product, this
variable is preferred over worker1Threads and prefetchThreads.
Important: If you set workerThreads to a non-default value, do not set worker1Threads.
The default value is 48. The valid range is 1-8192. However, the maximum value of workerThreads plus prefetchThreads plus nsdMaxWorkerThreads is 8192. The -N flag is valid with this variable.
This variable controls both internal and external variables. The internal variables include maximum settings for concurrent file operations, for concurrent threads that flush dirty data and metadata, and for concurrent threads that prefetch data and metadata. You can further adjust the external variables with the mmchconfig command:
- logBufferCount
- prefetchThreads
- worker3Threads
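For example, to raise the thread group for a high-concurrency workload (the value shown is illustrative):
mmchconfig workerThreads=512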
- worker1Threads
- Controls
the maximum number of concurrent file operations at any one instant.
If there are more requests than that, the excess will wait until a
previous request has finished. Important: If you set workerThreads to a non-default value, do not set worker1Threads.
This attribute is primarily used for random read or write requests that cannot be pre-fetched, random I/O requests, or small file activity. The default value is 48. The minimum value is 1. The maximum value of prefetchThreads plus worker1Threads plus nsdMaxWorkerThreads is 8192 on all 64-bit platforms.
The -N flag is valid for this attribute.
Exit status
- 0
- Successful completion.
- nonzero
- A failure has occurred.
Security
You must have root authority to run the mmchconfig command.
The node on which the command is issued must be able to execute remote shell commands on any other node in the cluster without the use of a password and without producing any extraneous messages. For more information, see Requirements for administering a file system in IBM Spectrum Scale: Administration and Programming Reference.
Examples
To change the maximum file system block size to 4 MiB, issue this command:
mmchconfig maxblocksize=4M
The system displays information similar to the following:
Verifying GPFS is stopped on all nodes ...
mmchconfig: Command successfully completed
mmchconfig: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
To confirm the change, issue this command:
mmlsconfig
The system displays information similar to the following:
Configuration data for cluster ib.cluster:
------------------------------------------
clusterName ib.cluster
clusterId 13882433899463047326
autoload no
minReleaseLevel 4.1.0.0
dmapiFileHandleSize 32
maxblocksize 4M
pagepool 2g
[c21f1n18]
pagepool 5g
[common]
verbsPorts mthca0/1
verbsRdma enable
subnets 10.168.80.0
adminMode central
File systems in cluster ib.cluster:
-----------------------------------
/dev/fs1
See also
Location
/usr/lpp/mmfs/bin