mmadddisk command

Adds disks to a GPFS file system.

Synopsis

mmadddisk Device {"DiskDesc[;DiskDesc...]" | -F StanzaFile} [-a] [-r [--strict]]
  [-v {yes | no}] [-N {Node[,Node...] | NodeFile | NodeClass}]
  [--qos QOSClass]

Availability

Available on all IBM Spectrum Scale editions.

Description

Use the mmadddisk command to add disks to a GPFS file system. When the -r flag is specified, the command rebalances an existing file system after it adds the disks. The command does not require the file system to be mounted. The file system can be in use.

The actual number of disks in your file system might be constrained by products other than GPFS that you installed. See the individual product documentation.

To add disks to a GPFS file system, first decide which of the following two tasks you want to perform:

  1. Create new disks with the mmcrnsd command.

    In this case, you must also decide whether to create a new set of NSD and pool stanzas or use the rewritten NSD and pool stanzas that the mmcrnsd command produces. In a rewritten file, the disk usage, failure group, and storage pool values are the same as the values that are specified in the mmcrnsd command. An example of this flow follows this list.

  2. Select disks no longer in use in any file system. To display the disks that are not in use, run the following command:
    mmlsnsd -F
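
For example, to follow the first approach you might create the NSDs from a stanza file and then pass the same stanza file to mmadddisk. The file system name fs1 and the stanza file name ./newDiskStanza are illustrative assumptions:

  mmcrnsd -F ./newDiskStanza
  mmadddisk fs1 -F ./newDiskStanza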

Earlier versions of the product allowed specifying disk information with colon-separated disk descriptors. Those disk descriptors are no longer supported.

Note: If mmadddisk fails with a NO_SPACE error, try one of the following actions (example commands follow the list):
  • Rebalance the file system.
  • Run the command mmfsck -y to deallocate unreferenced subblocks.
  • Create a pool with larger disks and move data from the old pool to the new one.
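
For example, assuming a file system named fs1 (an illustrative name), the first two recovery actions correspond to commands like the following. The mmrestripefs command with the -b option rebalances the file system, and the mmfsck command with the -y option repairs the file system and can free unreferenced subblocks:

  mmrestripefs fs1 -b
  mmfsck fs1 -y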

Parameters

Device
The device name of the file system to which the disks are added. File system names need not be fully qualified. fs0 is as acceptable as /dev/fs0.

This parameter must be first.

DiskDesc
A descriptor for each disk to be added. Each descriptor is delimited by a semicolon (;) and the entire list must be enclosed in quotation marks (' or "). The use of disk descriptors is discouraged.
-F StanzaFile
Specifies a file that contains the NSD stanzas and pool stanzas for the disks to be added to the file system.
NSD stanzas have the following format:
%nsd: 
  nsd=NsdName
  usage={dataOnly | metadataOnly | dataAndMetadata | descOnly}
  failureGroup=FailureGroup
  pool=StoragePool
  servers=ServerList
  device=DiskName
  thinDiskType={no | nvme | scsi | auto}

where:

nsd=NsdName
The name of an NSD previously created by the mmcrnsd command. For a list of available disks, run the mmlsnsd -F command. This clause is mandatory for the mmadddisk command.
usage={dataOnly | metadataOnly | dataAndMetadata | descOnly}
Specifies the type of data to be stored on the disk:
dataAndMetadata
Indicates that the disk contains both data and metadata. This value is the default for disks in the system pool.
dataOnly
Indicates that the disk contains data and does not contain metadata. This value is the default for disks in storage pools other than the system pool.
metadataOnly
Indicates that the disk contains metadata and does not contain data.
descOnly
Indicates that the disk contains no data and no file metadata. IBM Spectrum Scale uses this type of disk primarily to keep a copy of the file system descriptor. It can also be used as a third failure group in certain disaster recovery configurations. For more information, see Synchronous mirroring with GPFS replication.
failureGroup=FailureGroup
Identifies the failure group to which the disk belongs. A failure group identifier can be a simple integer or a topology vector that consists of up to three comma-separated integers. The default is -1, which indicates that the disk has no point of failure in common with any other disk.

GPFS uses this information during data and metadata placement to ensure that no two replicas of the same block can become unavailable due to a single failure. All disks that are attached to the same NSD server or adapter must be placed in the same failure group.

If the file system is configured with data replication, all storage pools must have two failure groups to maintain proper protection of the data. Similarly, if metadata replication is in effect, the system storage pool must have two failure groups.

Disks that belong to storage pools in which write affinity is enabled can use topology vectors to identify failure domains in a shared-nothing cluster. Disks that belong to traditional storage pools must use simple integers to specify the failure group.

pool=StoragePool
Specifies the storage pool to which the disk is to be assigned. If this name is not provided, the default is system.

Only the system storage pool can contain metadataOnly, dataAndMetadata, or descOnly disks. Disks in other storage pools must be dataOnly.

servers=ServerList
A comma-separated list of NSD server nodes. This clause is ignored by the mmadddisk command.
device=DiskName
The block device name of the underlying disk device. This clause is ignored by the mmadddisk command.
thinDiskType={no | nvme | scsi | auto}
Specifies the thin provisioned disk type:
Note: By default, the system pool cannot contain both regular disks and thin provisioned disks. If you want to include both types of disk in the system pool, contact IBM Service for more information.
no
The disk is not thin provisioned. This value is the default.
nvme
The disk is a thin provisioned NVMe device that supports the mmreclaimspace command.
scsi
The disk is a thin provisioned SCSI disk that supports the mmreclaimspace command.
auto
The type of the disk is either nvme or scsi. IBM Spectrum Scale will try to detect the actual disk type automatically. To avoid problems, you should replace auto with the correct disk type, nvme or scsi, as soon as you can.
For more information, see IBM Spectrum Scale with data reduction storage devices.
Note: An NSD that is used as a tiebreaker disk cannot be added to a file system if NSD format conversion is required.
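For example, a stanza file that adds two previously created NSDs to a user storage pool, placing them in different failure groups, might look like the following. The NSD names nsd21 and nsd22, the storage pool name datapool1, and the file name ./dataDisks are illustrative assumptions; only the nsd clause is required, because mmadddisk ignores the servers and device clauses:
%nsd:
  nsd=nsd21
  usage=dataOnly
  failureGroup=2
  pool=datapool1
%nsd:
  nsd=nsd22
  usage=dataOnly
  failureGroup=3
  pool=datapool1
You could then add both disks with a command such as mmadddisk fs1 -F ./dataDisks, where fs1 is also an assumed name.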
Pool stanzas have this format:
%pool: 
  pool=StoragePoolName
  blockSize=BlockSize
  usage={dataOnly | metadataOnly | dataAndMetadata}
  layoutMap={scatter | cluster}
  allowWriteAffinity={yes | no}
  writeAffinityDepth={0 | 1 | 2}
  blockGroupFactor=BlockGroupFactor

where:

pool=StoragePoolName
Is the name of a storage pool.
blockSize=BlockSize
Specifies the block size of the disks in the storage pool.
usage={dataOnly | metadataOnly | dataAndMetadata}
Specifies the type of data to be stored in the storage pool:
dataAndMetadata
Indicates that the disks in the storage pool contain both data and metadata. This is the default for disks in the system pool.
dataOnly
Indicates that the disks contain data and do not contain metadata. This is the default for disks in storage pools other than the system pool.
metadataOnly
Indicates that the disks contain metadata and do not contain data.
layoutMap={scatter | cluster}
Specifies the block allocation map type. When allocating blocks for a given file, GPFS first uses a round-robin algorithm to spread the data across all disks in the storage pool. After a disk is selected, the location of the data block on the disk is determined by the block allocation map type. If cluster is specified, GPFS attempts to allocate blocks in clusters. Blocks that belong to a particular file are kept adjacent to each other within each cluster. If scatter is specified, the location of the block is chosen randomly.

The cluster allocation method may provide better disk performance for some disk subsystems in relatively small installations. The benefits of clustered block allocation diminish when the number of nodes in the cluster or the number of disks in a file system increases, or when the file system's free space becomes fragmented. The cluster allocation method is the default for GPFS clusters with eight or fewer nodes and for file systems with eight or fewer disks.

The scatter allocation method provides more consistent file system performance by averaging out performance variations due to block location (for many disk subsystems, the location of the data relative to the disk edge has a substantial effect on performance). This allocation method is appropriate in most cases and is the default for GPFS clusters with more than eight nodes or file systems with more than eight disks.

The block allocation map type cannot be changed after the storage pool has been created.

allowWriteAffinity={yes | no}
Indicates whether the File Placement Optimizer (FPO) feature is to be enabled for the storage pool. For more information on FPO, see File Placement Optimizer.
writeAffinityDepth={0 | 1 | 2}
Specifies the allocation policy to be used by the node writing the data.

A write affinity depth of 0 indicates that each replica is to be striped across the disks in a cyclical fashion with the restriction that no two disks are in the same failure group. By default, the unit of striping is a block; however, if the block group factor is specified in order to exploit chunks, the unit of striping is a chunk.

A write affinity depth of 1 indicates that the first copy is written to the writer node. The second copy is written to a different rack. The third copy is written to the same rack as the second copy, but on a different half (which can be composed of several nodes).

A write affinity depth of 2 indicates that the first copy is written to the writer node. The second copy is written to the same rack as the first copy, but on a different half (which can be composed of several nodes). The target node is determined by a hash value on the fileset ID of the file, or it is chosen randomly if the file does not belong to any fileset. The third copy is striped across the disks in a cyclical fashion with the restriction that no two disks are in the same failure group. The following conditions must be met while using a write affinity depth of 2 to get evenly allocated space in all disks:
  1. The configuration in disk number, disk size, and node number for each rack must be similar.
  2. The number of nodes must be the same in the bottom half and the top half of each rack.

This behavior can be altered on an individual file basis by using the --write-affinity-failure-group option of the mmchattr command.

This parameter is ignored if write affinity is disabled for the storage pool.

blockGroupFactor=BlockGroupFactor
Specifies how many file system blocks are laid out sequentially on disk to behave like a single large block. This option works only if allowWriteAffinity is enabled for the data pool. This applies only to a new data block layout; it does not migrate previously existing data blocks.

See File Placement Optimizer.
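
For example, a pool stanza that defines an FPO-enabled data pool might look like the following. The pool name fpodata and the specific values are illustrative assumptions; choose a block size and block group factor that are valid for your file system. Such a pool stanza goes in the same stanza file as the NSD stanzas for the disks that are assigned to the pool:
%pool:
  pool=fpodata
  blockSize=1M
  usage=dataOnly
  layoutMap=cluster
  allowWriteAffinity=yes
  writeAffinityDepth=1
  blockGroupFactor=128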

-a
Specifies asynchronous processing. If this flag is specified, the mmadddisk command returns after the file system descriptor is updated and the rebalancing scan is started; it does not wait for rebalancing to finish. If no rebalancing is requested (that is, the -r flag is not specified), this option has no effect.
-r
Rebalances the file system to improve performance. Rebalancing attempts to distribute file blocks evenly across the disks of the file system. In IBM Spectrum Scale 5.0.0 and later, rebalancing is implemented by a lenient round-robin method that typically runs faster than the previous method of strict round robin. To rebalance the file system using the strict round-robin method, include the --strict option that is described in the following text.
--strict
Rebalances the specified files with a strict round-robin method. In IBM Spectrum Scale v4.2.3 and earlier, rebalancing always uses this method.
Note: Rebalancing of files is an I/O intensive and time-consuming operation and is important only for file systems with large files that are mostly invariant. In many cases, normal file update and creation rebalances a file system over time without the cost of a complete rebalancing.
Note: Rebalancing distributes file blocks across all the disks in the file system that are not suspended, including stopped disks. For stopped disks, rebalancing does not allow read operations and allocates data blocks without writing them to the disk. When the disk is restarted and replicated data is copied onto it, the file system completes the write operations.
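For example, to add disks and rebalance with the strict round-robin method, or to add disks and let the rebalancing scan run asynchronously, you could issue commands like the following (the file system and stanza file names are illustrative assumptions):
  mmadddisk fs1 -F ./newDiskStanza -r --strict
  mmadddisk fs1 -F ./newDiskStanza -r -a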
-v {yes | no}
Verify that specified disks do not belong to an existing file system. The default is -v yes. Specify -v no only when you want to reuse disks that are no longer needed for an existing file system. If the command is interrupted for any reason, use the -v no option on the next invocation of the command.
Important: Using -v no on a disk that already belongs to a file system corrupts that file system. This problem is not detected until the next time that file system is mounted.
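For example, to reuse disks that formerly belonged to a file system that has since been deleted, and that you have confirmed are no longer in use, you could issue a command like the following (names are illustrative):
  mmadddisk fs1 -F ./reusedDiskStanza -v no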
-N {Node[,Node...] | NodeFile | NodeClass}
Specifies the nodes that are to participate in the restriping of the file system after the specified disks are available for use by GPFS. This parameter must be used with the -r option. This command supports all defined node classes. The default is all or the current value of the defaultHelperNodes parameter of the mmchconfig command.

For general information on how to specify node names, see Specifying nodes as input to GPFS commands.
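
For example, to restrict the restripe work to two helper nodes, you could issue a command like the following. The node names c25m3n07 and c25m3n08 are illustrative assumptions:
  mmadddisk fs1 -F ./newDiskStanza -r -N c25m3n07,c25m3n08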

--qos QOSClass
Specifies the Quality of Service for I/O operations (QoS) class to which the instance of the command is assigned. If you do not specify this parameter, the instance of the command is assigned by default to the maintenance QoS class. This parameter has no effect unless the QoS service is enabled. For more information, see the topic mmchqos command. Specify one of the following QoS classes:
maintenance
This QoS class is typically configured to have a smaller share of file system IOPS. Use this class for I/O-intensive, potentially long-running GPFS commands, so that they contribute less to reducing overall file system performance.
other
This QoS class is typically configured to have a larger share of file system IOPS. Use this class for administration commands that are not I/O-intensive.
For more information, see the topic Setting the Quality of Service for I/O operations (QoS).
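
For example, if QoS is enabled and you want the rebalancing I/O to run in the other class, which typically has a larger share of file system IOPS, rather than in the default maintenance class, you could issue a command like the following (names are illustrative):
  mmadddisk fs1 -F ./newDiskStanza -r --qos other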

Exit status

0
Successful completion.
nonzero
A failure occurred.

Security

You must have root authority to run the mmadddisk command.

The node on which the command is issued must be able to execute remote shell commands on any other node in the cluster without the use of a password and without producing any extraneous messages. For more information, see Requirements for administering a GPFS file system.

Examples

  1. Assume that the file ./newNSDstanza contains the following NSD stanza:
    %nsd: nsd=gpfs10nsd
      servers=k148n07,k148n06
      usage=dataOnly
      failureGroup=5
      pool=pool2
      thinDiskType=nvme
    To add the disk that is defined in this stanza, run the following command:
    mmadddisk fs1 -F ./newNSDstanza -r
    The command displays information like the following example:
    GPFS: 6027-531 The following disks of fs1 will be formatted on node
     k148n07.kgn.ibm.com:
        gpfs10nsd: size 2202 MB
    Extending Allocation Map
    Creating Allocation Map for storage pool 'pool2'
      75 % complete on Thu Feb 16 13:57:52 2006
     100 % complete on Thu Feb 16 13:57:54 2006
    Flushing Allocation Map for storage pool 'pool2'
    GPFS: 6027-535 Disks up to size 24 GB can be added to storage
    pool pool2.
    Checking allocation map for storage pool system
      62 % complete on Thu Feb 16 13:58:03 2006
     100 % complete on Thu Feb 16 13:58:06 2006
    Checking allocation map for storage pool pool1
      62 % complete on Thu Feb 16 13:58:11 2006
     100 % complete on Thu Feb 16 13:58:14 2006
    Checking allocation map for storage pool pool2
      63 % complete on Thu Feb 16 13:58:19 2006
     100 % complete on Thu Feb 16 13:58:22 2006
    GPFS: 6027-1503 Completed adding disks to file system fs1.
    mmadddisk: 6027-1371 Propagating the cluster configuration data to all
      affected nodes.  This is an asynchronous process.
    Restriping fs1 ...
    GPFS: 6027-589 Scanning file system metadata, phase 1 ... 
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-589 Scanning file system metadata, phase 2 ... 
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-589 Scanning file system metadata, phase 3 ... 
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-589 Scanning file system metadata, phase 4 ... 
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-565 Scanning user file metadata ...
      68 %  complete on Thu Feb 16 13:59:06 2006
     100 %  complete on Thu Feb 16 13:59:07 2006
    GPFS: 6027-552 Scan completed successfully.
    Done

See also

Location

/usr/lpp/mmfs/bin