mmrestripefs command

Rebalances or restores the replication factor of all the files in a file system. Alternatively, this command performs any incomplete or deferred file compression or decompression of all the files in a file system.

Synopsis

mmrestripefs Device {-m | -r | -b [--strict] | -R | -c [--read-only]  | -p | -z}
                    [-N {Node[,Node...] | NodeFile | NodeClass}] [-o InodeResultFile]
                    [-P PoolName] [--inode-criteria CriteriaFile] [--qos QOSClass]
or
mmrestripefs Device {-r | -b [--strict] | -R | -c [--read-only]} --metadata-only
                    [-N {Node[,Node...] | NodeFile | NodeClass}] [-o InodeResultFile]
                    [--inode-criteria CriteriaFile] [--qos QOSClass]

Availability

Available on all IBM Spectrum Scale editions.

Description

Issue the mmrestripefs command to rebalance or restore the replication of all files in a file system. The command moves existing file system data between different disks in the file system based on changes to the disk state made by the mmchdisk, mmadddisk, and mmdeldisk commands. It also attempts to restore the metadata or data replication of all the files in the file system.
Tip: The mmrestripefs command can take a long time to run if there are many files or a large amount of data to rebalance or restore. If you are adding, deleting, or replacing multiple disks at the same time (mmadddisk, mmdeldisk, or mmrpldisk) you can run the mmrestripefs command after you have added, deleted, or replaced all the disks, rather than after each disk.
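
The batching advice in the tip can be sketched as a small planning helper. This is only an illustration: the function name plan_batched_restripe and the stanza file paths are hypothetical, the helper prints commands rather than running them, and it assumes disks are added with mmadddisk Device -F StanzaFile and then rebalanced once with -b.

```shell
#!/bin/sh
# Print (do not run) a plan: one mmadddisk per stanza file, then ONE restripe.
# Usage: plan_batched_restripe DEVICE STANZA_FILE...
# The function name and the stanza paths below are hypothetical.
plan_batched_restripe() {
    device=$1
    shift
    for stanza in "$@"; do
        echo "mmadddisk $device -F $stanza"
    done
    # A single rebalance at the end covers every disk added above.
    echo "mmrestripefs $device -b"
}

plan_batched_restripe fs1 /tmp/disks1.stanza /tmp/disks2.stanza
```

Reviewing the printed plan before running it makes it easy to confirm that the restripe appears only once, after the last disk change.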

Alternatively, you can issue the mmrestripefs command to perform any deferred or incomplete file compression or decompression in all the files of a file system.

You must specify one of the options (-m, -r, -b, -R, -c, -p, or -z) to indicate how much file system data to move or whether to perform file compression or decompression. You can issue this command against a mounted or unmounted file system.

If the file system uses replication, then restriping the file system also replicates it. Also, if the file system uses replication, the -r and -m options treat suspended disks differently. The -r option removes all data from a suspended disk, whereas the -m option leaves data on a suspended disk if at least one replica of the data remains on a disk that is not suspended.

The -b option performs all the operations of the -m and -r options.

Use the -z option to perform any deferred or incomplete file compression or decompression.

CAUTION:
Do not issue the mmrestripefs or mmrestripefile command while an mmrestorefs command is running.

Consider the necessity of restriping and the current demands on the system. New data that is added to the file system is correctly striped. Restriping a large file system requires many insert and delete operations and might affect system performance. Plan to perform this task when system demand is low.

Parameters

Device
The device name of the file system to be restriped. File system names need not be fully qualified.
Device must be the first parameter. It can take the following parameters:
-m
Migrates all critical data off any suspended disk in this file system. Critical data is all data that would be lost if currently suspended disks were removed.
-r
Migrates all data off suspended disks. It also restores all replicated files in the file system to their designated degree of replication when a previous disk failure or removal of a disk makes some replica data inaccessible. Use this parameter either immediately after a disk failure to protect replicated data against a subsequent failure, or before you take a disk offline for maintenance to protect replicated data against failure of another disk during the maintenance process.
Note: If the file system uses replication, run mmlsdisk Device -L before you run mmrestripefs Device -r to check the number of available failure groups. If the number of available failure groups is less than your default replication, do not run mmrestripefs Device -r, because it removes the data replica for files that have a replica on suspended or to be emptied disks.
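
The check in the preceding note can be expressed as a small guard. This is a sketch under stated assumptions: the function name enough_failure_groups is hypothetical, and both numbers are entered by the operator after reading the mmlsdisk Device -L output and the file system's default replication setting. The sketch does not parse any command output.

```shell
#!/bin/sh
# Return 0 when the number of available failure groups is at least the
# default replication, 1 otherwise. Both values are supplied by the operator;
# nothing here reads them from the cluster.
# Usage: enough_failure_groups AVAILABLE_FAILURE_GROUPS DEFAULT_REPLICATION
enough_failure_groups() {
    available=$1
    replication=$2
    [ "$available" -ge "$replication" ]
}

# Example: one failure group left, default replication of 2.
if enough_failure_groups 1 2; then
    echo "ok to run: mmrestripefs fs1 -r"
else
    echo "skip -r: fewer failure groups than the default replication"
fi
```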
-b [--strict]
Rebalances the file system to improve performance. Rebalancing attempts to distribute file blocks evenly across the disks of the file system. In IBM Spectrum Scale 5.0.0 and later, rebalancing is implemented by a lenient round-robin method that typically runs faster than the previous method of strict round robin. To rebalance the file system using the strict round-robin method, include the --strict option that is described in the following text.
--strict
Rebalances the file system with a strict round-robin method. In IBM Spectrum Scale 4.2.3 and earlier, rebalancing always uses this method.
Note: Rebalancing of files is an I/O intensive and time-consuming operation and is important only for file systems with large files that are mostly invariant. In many cases, normal file update and creation rebalances a file system over time without the cost of a complete rebalancing.
Note: Rebalancing distributes file blocks across all the disks in the cluster that are not suspended, including stopped disks. For stopped disks, rebalancing does not allow read operations and allocates data blocks without writing them to the disk. When the disk is restarted and replicated data is copied onto it, the file system completes the write operations.
-R
Changes the replication settings of each file, directory, and system metadata object so that they match the default file system settings (see the mmchfs command -m and -r options) as long as the maximum (-M and -R) settings for the object allow it. Next, it replicates or unreplicates the object as needed to match the new settings. This option can be used to replicate all of the existing files that were not previously replicated or to unreplicate the files if replication is no longer needed or wanted. All data is also migrated off disks that have either a suspended or to be emptied status.
-c
Scans the file system and compares replicas of metadata and data for conflicts. When conflicts are found, the -c option attempts to fix the replicas.
--read-only
Modifies the -c option so that it does not try to fix conflicting replicas. You can use this option only with the -c option.
--metadata-only
Limits the specified operation to metadata blocks. Data blocks are not affected. This option is valid only with the -r, -b, -R, or -c option.
The mmrestripefs command with this option completes its operation quicker than a full restripe, replication, or replica compare of data and metadata.
Use this option when you want to prioritize the mmrestripefs operation on the metadata. This option ensures that the mmrestripefs operation has a reduced impact on the file system performance when compared to running the mmrestripefs command on the metadata and data.
After you run the mmrestripefs command with the --metadata-only option, you can issue the mmrestripefs command without this option to restripe the data and any metadata that still needs to be restriped.
Note: This option is not available until all the nodes in the cluster are upgraded to IBM Spectrum Scale release 4.2.1. If any node is not upgraded, the system displays the following error message:
mmrestripefs: The --metadata-only option support has not been enabled yet.
Issue "mmchconfig release=LATEST" to activate the new function. 
mmrestripefs: Command failed. Examine previous error messages to determine cause.
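
The metadata-first approach described for --metadata-only can be sketched as a two-pass plan. The helper name two_phase_restripe is hypothetical, and it only prints the two commands. It rejects any option other than -r, -b, -R, or -c, because those are the only ones the text above lists as valid with --metadata-only.

```shell
#!/bin/sh
# Print a two-pass plan: metadata only first, then the full operation.
# Usage: two_phase_restripe DEVICE OPTION
# The function name is hypothetical; the commands are printed, not run.
two_phase_restripe() {
    device=$1
    option=$2
    case $option in
        -r|-b|-R|-c) ;;
        *)
            echo "option $option cannot be combined with --metadata-only" >&2
            return 1
            ;;
    esac
    # Pass 1: metadata blocks only, which completes sooner.
    echo "mmrestripefs $device $option --metadata-only"
    # Pass 2: data, plus any metadata that still needs work.
    echo "mmrestripefs $device $option"
}

two_phase_restripe fs1 -r
```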
-p
Directs mmrestripefs to repair the file placement within the storage pool.

Files that are assigned to one storage pool, but with data in a different pool, have their data migrated to the correct pool. Such files are referred to as ill-placed. Utilities, such as the mmchattr command, might change a file's storage pool assignment, but not move the data. The mmrestripefs command might then be invoked to migrate all of the data at once, rather than migrating each file individually. The placement option (-p) rebalances only the files that it moves. In contrast, the rebalance operation (-b) performs data placement on all files.

-z
Performs any deferred or incomplete file compression or decompression of files in the file system. For more information, see the topic File compression.
-P PoolName
Directs mmrestripefs to repair only files assigned to the specified storage pool. This option is convenient for migrating ill-placed data blocks between pools, for example after you change a file's storage pool assignment with mmchattr or mmapplypolicy with the -I defer flag.

Do not use this option for other tasks, in particular for any task that requires metadata processing, such as re-replication. By design, all GPFS metadata is kept in the system pool, even for files that have blocks in other storage pools. Therefore, a command that must process all metadata must not be restricted to a specific storage pool.

-N {Node[,Node...] | NodeFile | NodeClass}
Specify the nodes that participate in the restripe of the file system. This command supports all defined node classes. The default is all or the current value of the defaultHelperNodes parameter of the mmchconfig command.

For general information on how to specify node names, see Specifying nodes as input to GPFS commands.

-o InodeResultFile
Contains a list of the inodes that met the interesting inode flags that were specified on the --inode-criteria parameter. The output file contains the following:
INODE_NUMBER
This is the inode number.
DISKADDR
Specifies a dummy address for later tsfindinode use.
SNAPSHOT_ID
This is the snapshot ID.
ISGLOBAL_SNAPSHOT
Indicates whether or not the inode is in a global snapshot. Files in the live file system are considered to be in a global snapshot.
INDEPENDENT_FSETID
Indicates the independent fileset to which the inode belongs.
MEMO (INODE_FLAGS FILE_TYPE [ERROR])
Indicates the inode flag and file type that will be printed:
Inode flags:
BROKEN
exposed
dataUpdateMiss
illCompressed
illPlaced      
illReplicated
metaUpdateMiss  
unbalanced 
File types:
BLK_DEV 
CHAR_DEV 
DIRECTORY
FIFO
LINK
LOGFILE
REGULAR_FILE
RESERVED
SOCK
*UNLINKED*
*DELETED*
Notes:
  1. An error message will be printed in the output file if an error is encountered when repairing the inode.
  2. DISKADDR, ISGLOBAL_SNAPSHOT, and FSET_ID work with the tsfindinode tool (/usr/lpp/mmfs/bin/tsfindinode) to find the file name for each inode. tsfindinode uses the output file to retrieve the file name for each interesting inode.
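
As a rough illustration of working with the output file, the following sketch lists the inode numbers whose MEMO column carries a given flag. It assumes the whitespace-separated layout shown in example 4 of the Examples section (a banner line, a header line, then one row per inode with the MEMO text starting in column 6). The function name inodes_with_flag is hypothetical. The comparison ignores case because example 4 prints illreplicated in lowercase.

```shell
#!/bin/sh
# Print the inode number of every row whose MEMO column contains FLAG.
# Usage: inodes_with_flag FLAG RESULT_FILE
# Assumes the column layout shown in example 4; the function name is hypothetical.
inodes_with_flag() {
    awk -v flag="$1" '
        # Data rows start with a numeric inode number; this skips the
        # banner line and the column-header line.
        $1 ~ /^[0-9]+$/ {
            # Columns 6 and later hold the MEMO text (flags, file type, error).
            for (i = 6; i <= NF; i++) {
                if (tolower($i) == tolower(flag)) { print $1; break }
            }
        }
    ' "$2"
}

# Demonstration on two rows copied from example 4.
sample=$(mktemp)
cat > "$sample" <<'EOF'
This inode list was generated in the Parallel Inode Traverse on Wed Apr 15 10:15:14 2015
INODE_NUMBER DISKADDR SNAPSHOT_ID ISGLOBAL_SNAPSHOT FSET_ID MEMO(INODE_FLAGS FILE_TYPE [ERROR])
 24320        0:0      0           1                 0       illreplicated unbalanced REGULAR_FILE
 24322        0:0      0           1                 0       illreplicated unbalanced REGULAR_FILE
EOF
inodes_with_flag illReplicated "$sample"
rm -f "$sample"
```

The resulting inode numbers are only a convenience for inspection; mapping them to file names is the job of tsfindinode, as described in note 2.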
--inode-criteria CriteriaFile
Specifies the interesting inode criteria flags, where CriteriaFile is a file that contains one or more of the following flags:
BROKEN
Indicates that a file has a data block with all of its replicas on disks that have been removed.
Note: BROKEN is always included in the list of flags even if it is not specified.
dataUpdateMiss
Indicates that at least one data block was not updated successfully on all replicas.
exposed
Indicates an inode that is exposed to a risk of data loss; that is, the file has data where all replicas are on suspended disks. Data could be lost if the suspended disks fail or are removed.
illCompressed
Indicates an inode in which file compression or decompression is deferred, or in which a compressed file is partly decompressed to allow the file to be written into or memory-mapped.
illPlaced
Indicates an inode with some data blocks that might be stored in an incorrect storage pool.
illReplicated
Indicates that the file has a data block that does not meet the setting for the replica.
metaUpdateMiss
Indicates that there is at least one metadata block that has not been successfully updated to all replicas.
unbalanced
Indicates that the file has a data block that is not well balanced across all the disks in all failure groups.
Note: If a file matches any of the specified interesting flags, all of its interesting flags (even those not specified) will be displayed.
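
A criteria file can be prepared with a helper such as the one below. This is a sketch, not a documented procedure: the function name write_criteria_file is hypothetical, and writing one flag per line is an assumption, since example 4 shows only a file that contains the single flag illReplicated. The helper accepts only the flag names listed above.

```shell
#!/bin/sh
# Write the given inode flags to a criteria file, one per line (assumed format).
# Usage: write_criteria_file OUTPUT_FILE FLAG...
# Rejects names that are not in the documented flag list.
write_criteria_file() {
    out=$1
    shift
    : > "$out"    # create or truncate the file
    for flag in "$@"; do
        case $flag in
            BROKEN|dataUpdateMiss|exposed|illCompressed|illPlaced|illReplicated|metaUpdateMiss|unbalanced)
                echo "$flag" >> "$out"
                ;;
            *)
                echo "unknown inode flag: $flag" >&2
                return 1
                ;;
        esac
    done
}

crit=$(mktemp)
write_criteria_file "$crit" illReplicated exposed
cat "$crit"
# The file could then be passed to the command as in example 4:
#   mmrestripefs fs1 -p --inode-criteria "$crit" -o /tmp/inodeResultFile
rm -f "$crit"
```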
--qos QOSClass
Specifies the Quality of Service for I/O operations (QoS) class to which the instance of the command is assigned. If you do not specify this parameter, the instance of the command is assigned by default to the maintenance QoS class. This parameter has no effect unless the QoS service is enabled. For more information, see the topic mmchqos command. Specify one of the following QoS classes:
maintenance
This QoS class is typically configured to have a smaller share of file system IOPS. Use this class for I/O-intensive, potentially long-running GPFS commands, so that they have less impact on overall file system performance.
other
This QoS class is typically configured to have a larger share of file system IOPS. Use this class for administration commands that are not I/O-intensive.
For more information, see the topic Setting the Quality of Service for I/O operations (QoS).

Exit status

0
Successful completion.
nonzero
A failure has occurred.
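
In a script, the exit status can be checked with a small wrapper along these lines. The function name run_checked is hypothetical. The demonstration uses the standard true and false commands in place of mmrestripefs so that it can run on any system.

```shell
#!/bin/sh
# Run a command, report a nonzero exit status on stderr, and pass it on.
# Usage: run_checked COMMAND [ARG...]
# The function name is hypothetical.
run_checked() {
    status=0
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "command failed with exit status $status: $*" >&2
    fi
    return "$status"
}

# On a cluster node this might be: run_checked mmrestripefs fs1 -b
run_checked true && echo "step succeeded"
run_checked false || echo "step failed, stopping here"
```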

Security

You must have root authority to issue the mmrestripefs command.

The node on which you issue the command must be able to execute remote shell commands on any other node in the cluster without the use of a password and without producing any extraneous messages. For more information, see Requirements for administering a GPFS file system.

Examples

  1. To move all critical data from any suspended disk in file system fs1, issue the following command:
    mmrestripefs fs1 -m
    The system displays information similar to the following output:
    GPFS: 6027-589 Scanning file system metadata, phase 1 ...
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-589 Scanning file system metadata, phase 2 ...
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-589 Scanning file system metadata, phase 3 ...
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-589 Scanning file system metadata, phase 4 ...
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-565 Scanning user file metadata ...
    8.00 % complete on Tue Feb 24 16:56:55 2009 ( 708608 inodes 346 MB)
    100.00 % complete on Tue Feb 24 16:56:56 2009
    GPFS: 6027-552 Scan completed successfully.
  2. To rebalance all files in file system fs1 across all defined, accessible disks that are not stopped or suspended, issue the following command:
    mmrestripefs fs1 -b
    The system displays information similar to the following output:
    GPFS: 6027-589 Scanning file system metadata, phase 1 ...
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-589 Scanning file system metadata, phase 2 ...
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-589 Scanning file system metadata, phase 3 ...
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-589 Scanning file system metadata, phase 4 ...
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-565 Scanning user file metadata ...
    3.00 % complete on Tue Feb 24 16:56:39 2009 ( 180224 inodes 161 MB)
    100.00 % complete on Tue Feb 24 16:56:44 2009
    GPFS: 6027-552 Scan completed successfully.
  3. To compare and fix replica conflicts of metadata and data in file system gpfs1, issue the following command:
    mmrestripefs gpfs1 -c
    The system displays information similar to the following output:
    Scanning file system metadata, phase 1 ...
    Inode 0 in fileset 0 and snapshot 0 has mismatch in replicated disk address 2:104859136
    Scan completed successfully.
    Scanning file system metadata, phase 2 ...
    Scan completed successfully.
    Scanning file system metadata, phase 3 ...
    Scan completed successfully.
    Scanning file system metadata, phase 4 ...
    Scan completed successfully.
    Scanning user file metadata ...
     100.00 % complete on Tue Jul 30 03:32:44 2013
    Scan completed successfully.
  4. To fix the pool placement of files in file system fs1 and also determine which files are illReplicated (for example, as a result of a failed disk), issue the following command:
    mmrestripefs fs1 -p --inode-criteria /tmp/crit -o /tmp/inodeResultFile
    The system displays information similar to the following output:
    GPFS: 6027-589 Scanning file system metadata, phase 1 ... 
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-589 Scanning file system metadata, phase 2 ... 
    Scanning file system metadata for data storage pool
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-589 Scanning file system metadata, phase 3 ... 
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-589 Scanning file system metadata, phase 4 ... 
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-565 Scanning user file metadata ...
     100.00 % complete on Wed Apr 15 10:15:15 2015  (65792 inodes with total  400 MB data processed)
    GPFS: 6027-552 Scan completed successfully.
    GPFS: 6027-3902 Check file '/tmp/inodeResultFile' on vmip1 for inodes that were \
                                found matching the criteria.
    #10:15:15# vmip1:/fs1 # cat /tmp/crit
    illReplicated
    #10:15:19# vmip1:/fs1 # cat /tmp/inodeResultFile
    This inode list was generated in the Parallel Inode Traverse on Wed Apr 15 10:15:14 2015
    INODE_NUMBER DISKADDR SNAPSHOT_ID ISGLOBAL_SNAPSHOT FSET_ID MEMO(INODE_FLAGS FILE_TYPE [ERROR])
     24320        0:0      0           1                 0       illreplicated unbalanced REGULAR_FILE
     24322        0:0      0           1                 0       illreplicated unbalanced REGULAR_FILE
     24321        0:0      0           1                 0       illreplicated unbalanced REGULAR_FILE
     24324        0:0      0           1                 0       illreplicated unbalanced REGULAR_FILE
     24325        0:0      0           1                 0       illreplicated unbalanced REGULAR_FILE
     24323        0:0      0           1                 0       illreplicated unbalanced REGULAR_FILE
     24326        0:0      0           1                 0       illreplicated unbalanced REGULAR_FILE
     24327        0:0      0           1                 0       illreplicated unbalanced REGULAR_FILE
     24328        0:0      0           1                 0       illreplicated unbalanced REGULAR_FILE
     24329        0:0      0           1                 0       illreplicated unbalanced REGULAR_FILE

Location

/usr/lpp/mmfs/bin