Reduced recovery time by using Persistent Reserve
Persistent Reserve (PR) provides a mechanism for reducing recovery times from node failures. To use PR, the cluster must meet the following requirements:
- All disks must be PR-capable. For a list of devices that support PR, see the IBM Storage Scale FAQ in IBM® Documentation.
- On AIX®, all disks must be hdisks. Starting with 3.5.0.16, it is also possible to use a logical volume as a descOnly disk without disabling the use of Persistent Reserve. For more information, see the IBM Storage Scale FAQ in IBM Documentation.
- On Linux®, typically all disks must be generic (/dev/sd*) or DM-MP (/dev/dm-*) disks. However, for Linux on Z, multipath device names are required for SCSI disks, and the names depend on the distribution and are configurable. For more information, see Guarding against loss of data availability due to path failure.
- If NSD servers are defined for the disks, all NSD server nodes must be running the same operating system (AIX or Linux).
- If the disks are SAN-attached to all nodes, all nodes in the cluster must be running the same operating system (AIX or Linux).
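The two operating-system requirements above can be pre-checked before PR is enabled. The following is a minimal sketch, not an IBM-provided tool: check_same_os is a hypothetical helper name, and the input format (one "node os" pair per line, gathered for example by running uname -s on each node) is an assumption made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of IBM Storage Scale).
# Reads "node os" pairs on stdin, one per line, and succeeds only when
# every listed node reports the same operating system.
check_same_os() {
    awk '
        NF >= 2 { seen[$2] = 1 }        # record each distinct OS name
        END {
            n = 0
            for (os in seen) n++        # count distinct OS names
            if (n == 1) { print "ok"; exit 0 }
            print "mismatch"            # zero or several OS names seen
            exit 1
        }
    '
}
```

For example, feeding it the lines "nsd1 Linux" and "nsd2 AIX" prints mismatch and returns a nonzero exit status, which signals that the NSD server or SAN-attached node set does not satisfy the requirement.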
For quicker recovery times when you use PR, set the failureDetectionTime configuration parameter on the mmchconfig command. For example, a recommended value for quick recovery is 10:

    mmchconfig failureDetectionTime=10
Explicitly enable PR by specifying the usePersistentReserve parameter on the mmchconfig command. If you set usePersistentReserve=yes, GPFS attempts to set up PR on all of the PR-capable disks. All subsequent NSDs are created with PR enabled if they are PR-capable. However, PR is only supported in the home cluster. Therefore, access to PR-enabled disks from another cluster must be through an NSD server that is in the home cluster and not directly to the disk (for example, through a SAN).
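
The two settings described above can be applied together. The following is a minimal sketch under stated assumptions: enable_pr and run are hypothetical wrapper names, and the wrapper defaults to a dry run that only prints the commands. The mmchconfig parameters come from the text above. The final mmlsnsd -X call is an assumed way to review the NSDs afterward and should be checked against the command reference for your release, as should any requirement to stop GPFS on all nodes before changing these attributes.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of IBM Storage Scale).
# DRY_RUN defaults to 1, so commands are printed rather than executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"       # show the command only
    else
        "$@"            # execute the command on a real cluster
    fi
}

# enable_pr [seconds]: set failureDetectionTime (default 10), then enable PR.
enable_pr() {
    run mmchconfig failureDetectionTime="${1:-10}"
    run mmchconfig usePersistentReserve=yes
    # Assumption: extended NSD listing used to review the result.
    run mmlsnsd -X
}
```

Calling enable_pr 10 with DRY_RUN unset prints the three commands in order. Setting DRY_RUN=0 runs them, which should be done only on a node in the home cluster because PR is supported only there.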