Determining how long mmrestripefs takes to complete

Several factors determine how long the mmrestripefs command takes to complete.

To determine how long the mmrestripefs command takes to complete, consider these points:
  1. The amount of data that potentially needs to be moved. You can estimate this value by issuing the df command.
  2. The number of IBM Spectrum Scale™ client nodes that are available to do the work.
  3. The amount of Network Shared Disk (NSD) server bandwidth that is available for I/O operations.
  4. The quality of service (QoS) settings for I/O operations on each node. For more information, see the mmchqos command.
  5. The maximum number of PIT threads on each node. For more information, see the description of the pitWorkerThreadsPerNode attribute in the mmchconfig command.
  6. The amount of free space that is available from new disks. If you added new disks, issue the mmdf command to determine the amount of additional free space that is available.
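
As a rough illustration of point 1, the following Python sketch reads the space in use on a mounted file system, a figure comparable to the "Used" column that the df command reports. The function name used_bytes and the mount point in the usage note are hypothetical and chosen only for illustration. The result is an upper estimate of the data that might need to be moved, not a measurement of what mmrestripefs actually moves.

```python
import shutil


def used_bytes(mount_point: str) -> int:
    """Return the number of bytes in use on the file system at mount_point.

    This mirrors what df reports as used space; it does not query
    IBM Spectrum Scale directly.
    """
    return shutil.disk_usage(mount_point).used
```

For example, used_bytes("/gpfs/fs1") would return the used space for a file system mounted at the hypothetical path /gpfs/fs1.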

A file system is restriped by having multiple threads on each node in the cluster work on a subset of the files. If a file is large, multiple nodes can participate in restriping it in parallel. Therefore, the more client nodes that perform work for the restripe operation, the faster the mmrestripefs command completes. Use the -N parameter to specify the nodes that participate in the restripe operation. You can estimate the length of time for the restripe operation from raw I/O rates. However, because the metadata must also be scanned, double that estimate.

Assuming that enough nodes are available to saturate the disk servers and assuming that all the data must be moved, the time to read and write every block of data is roughly:
  2 * fileSystemSize / averageDiskserverDataRate

As an upper bound, because of the need to scan all of the metadata, double this time. If other jobs are loading the NSD servers heavily, this time might increase even more.
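
The arithmetic above can be sketched in Python as follows. This is a minimal illustration, not an IBM tool: the function name and the sample figures (a 100 TB file system and an aggregate disk server rate of 2 GB per second) are assumptions chosen only to show the calculation. It assumes, as the text does, that all data must be moved and that enough nodes are available to saturate the disk servers.

```python
def estimate_restripe_seconds(file_system_size, average_disk_server_data_rate):
    """Return (read_write_seconds, upper_bound_seconds).

    file_system_size: bytes of data that must be moved.
    average_disk_server_data_rate: aggregate bytes per second.
    """
    if average_disk_server_data_rate <= 0:
        raise ValueError("data rate must be positive")
    # Every block is read once and written once, hence the factor of 2.
    read_write = 2 * file_system_size / average_disk_server_data_rate
    # Double again as an upper bound to allow for the metadata scan.
    upper_bound = 2 * read_write
    return read_write, upper_bound


# Hypothetical sample values: 100 TB of data, 2 GB/s aggregate rate.
base, upper = estimate_restripe_seconds(100e12, 2e9)
print(f"read and write: {base / 3600:.1f} hours")  # 27.8 hours
print(f"upper bound:    {upper / 3600:.1f} hours")  # 55.6 hours
```

The result does not account for QoS throttling or for competing load on the NSD servers, either of which can lengthen the operation.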

Note: You do not need to stop all other jobs while the mmrestripefs command is running. The CPU load of the command is minimal on each node and only the files that are being restriped at any moment are locked to maintain data integrity.