Recovery group server maintenance

It is sometimes necessary to temporarily take an IBM Storage Scale RAID recovery group server off-line for software or hardware maintenance, while at the same time preserving recovery group access.

For the two servers of paired recovery groups, this is accomplished using the mmvdisk recoverygroup change --active command to make one server the active server for both recovery groups of the pair. Maintenance can then be performed on the non-active server. Since the active server has access to all file system data in the two paired recovery groups, the paired recovery group building block can run indefinitely on the active server, albeit with lower performance and no standby server should the active server fail.

When maintenance is finished on a paired recovery group server and it is restarted, the maintenance procedure can be repeated for the other server by making the restarted server the active server for both recovery groups.
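The paired procedure above can be sketched as a dry-run planner that only builds command lines and never runs them. This is a minimal sketch under assumptions: the recovery group names (rg01, rg02) and server names (server01, server02) are hypothetical, and the recovery group is assumed to be passed with the --recovery-group option, which should be confirmed against the mmvdisk recoverygroup man page.

```python
import shlex


def active_server_command(recovery_group, node):
    """Build the command that makes `node` the active server for one recovery group.

    Assumption: the recovery group is named with --recovery-group.
    """
    return ["mmvdisk", "recoverygroup", "change",
            "--recovery-group", recovery_group, "--active", node]


def paired_maintenance_plan(recovery_groups, servers):
    """Return the ordered steps for maintaining both servers of a paired building block."""
    if len(recovery_groups) != 2 or len(servers) != 2:
        raise ValueError("a paired building block has two recovery groups and two servers")
    steps = []
    # First pass: servers[0] stays active while servers[1] is maintained.
    # Second pass: the roles are swapped.
    for keep_active, maintain in (servers, servers[::-1]):
        for recovery_group in recovery_groups:
            steps.append(shlex.join(active_server_command(recovery_group, keep_active)))
        steps.append(f"# perform maintenance on {maintain}, then restart it")
    return steps


if __name__ == "__main__":
    for step in paired_maintenance_plan(["rg01", "rg02"], ["server01", "server02"]):
        print(step)
```

Printing the plan rather than executing it lets the sequence be reviewed before any server is taken off-line.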

For the 4 to 32 servers of scale-out recovery groups, the pdisks and file system data on each server are only available when the server is up. When a server is off-line, IBM Storage Scale RAID must use RAID fault tolerance to reconstruct any file system data stored on that server. If a server stays off-line for longer than 20 minutes (the default), IBM Storage Scale RAID begins rebuilding the server's data onto spare space. Furthermore, if multiple servers are off-line at the same time, fault tolerance can be exceeded and file system data can become temporarily unavailable until the servers are brought back on-line. This makes maintenance considerations very different for scale-out servers.

Maintenance for scale-out recovery group servers must be performed using the mmvdisk recoverygroup change --suspend and mmvdisk recoverygroup change --resume commands. The suspend command safely stops IBM Storage Scale RAID on a scale-out server by checking that no other servers are suspended and that enough fault tolerance exists in the recovery group to permit taking the server off-line. A server is suspended by suspending all of its pdisks and then making it ineligible to serve log groups. The resume command brings a scale-out server back on-line by enabling it to serve log groups and resuming all of its pdisks.

Scale-out server software or hardware maintenance can be performed safely by carrying out the following steps in order:
  • Suspending one scale-out server
  • Performing the desired maintenance on that server
  • Resuming that server when maintenance is completed
  • Repeating the procedure for the next server
Note: Scale-out server maintenance should be performed quickly to avoid unnecessary RAID rebuild overhead. By default, there is a window of 20 minutes before the data on a suspended server's pdisks is rebuilt onto other nodes. The mmvdisk recoverygroup change --suspend command has a --window N option to adjust the number of minutes before the suspended server's pdisks are rebuilt. The value of N must be a number between 10 and 60 inclusive.
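The suspend, maintain, resume cycle and the --window limits can be sketched as a dry-run planner that builds command lines without running them. This is a minimal sketch under assumptions: the recovery group and server names are hypothetical, the --recovery-group option is assumed from the general mmvdisk syntax, and the range check only mirrors the 10 to 60 minute rule stated in the note above.

```python
import shlex


def suspend_command(recovery_group, node, window=None):
    """Build the suspend command, optionally with a rebuild window in minutes."""
    cmd = ["mmvdisk", "recoverygroup", "change",
           "--recovery-group", recovery_group, "--suspend", node]
    if window is not None:
        # The window must be a whole number of minutes from 10 to 60 inclusive.
        if not isinstance(window, int) or not 10 <= window <= 60:
            raise ValueError("--window must be between 10 and 60 minutes inclusive")
        cmd += ["--window", str(window)]
    return cmd


def resume_command(recovery_group, node):
    """Build the command that brings a scale-out server back on-line."""
    return ["mmvdisk", "recoverygroup", "change",
            "--recovery-group", recovery_group, "--resume", node]


def rolling_maintenance_plan(recovery_group, servers, window=None):
    """Return ordered steps that maintain one server at a time."""
    steps = []
    for node in servers:
        steps.append(shlex.join(suspend_command(recovery_group, node, window)))
        steps.append(f"# perform maintenance on {node}")
        steps.append(shlex.join(resume_command(recovery_group, node)))
    return steps


if __name__ == "__main__":
    for step in rolling_maintenance_plan("rg01", ["server01", "server02"], window=30):
        print(step)
```

Each server is resumed before the next one is suspended, so the plan never asks for more than one suspended server at a time.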

The mmvdisk command considers a scale-out server to be suspended if it is not eligible to serve log groups or if all of its pdisks are suspended. The mmvdisk recoverygroup list --server command will show any suspended servers. The mmvdisk recoverygroup change --suspend command will not allow more than one server to be suspended. The mmvdisk recoverygroup change --resume command can be used to resume any scale-out server, even one that is not suspended; this has the effect of starting IBM Storage Scale RAID on the server, making sure it is eligible to serve log groups, and resuming any of its pdisks that are suspended.
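The suspended-server rule described above can be modeled as a small predicate. This is an illustrative sketch only: the ServerState type and its fields are hypothetical, it does not show how mmvdisk obtains or stores this state, and the fault tolerance check that the suspend command also performs is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServerState:
    """Hypothetical snapshot of one scale-out server."""
    name: str
    serves_log_groups: bool       # eligible to serve log groups
    pdisk_suspended: List[bool]   # one flag per pdisk on this server


def is_suspended(server):
    """Suspended if not eligible to serve log groups or if all pdisks are suspended."""
    # Assumption: a server with no pdisks listed does not count as "all suspended".
    all_pdisks_suspended = bool(server.pdisk_suspended) and all(server.pdisk_suspended)
    return (not server.serves_log_groups) or all_pdisks_suspended


def suspended_servers(servers):
    """Names of the servers that count as suspended."""
    return [s.name for s in servers if is_suspended(s)]


def may_suspend(servers, candidate):
    """Mirror the one-suspended-server rule: no other server may already be suspended."""
    return not [name for name in suspended_servers(servers) if name != candidate]


if __name__ == "__main__":
    group = [
        ServerState("server01", True, [False, False]),
        ServerState("server02", True, [True, True]),
    ]
    print(suspended_servers(group))
    print(may_suspend(group, "server01"))
```

Because either condition alone marks a server as suspended, a server with only some pdisks suspended still counts as active as long as it remains eligible to serve log groups.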

For more information on suspending and resuming scale-out recovery group servers, see the mmvdisk recoverygroup man page.