Repairing data block replica mismatches with the global replica selection rule

Follow this procedure if the data block replica mismatches are due to one or more bad disks.

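You can confirm whether one or more disks are bad by checking how often each disk appears among the mismatched replicas in the output of the online replica compare operation. A disk that holds a disproportionate share of the mismatched replicas is a likely bad disk. Follow these steps to exclude the bad disks and repair the data block replica mismatches.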
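
The frequency check can be sketched as follows. This is only an illustration: it assumes that the NSD names of the disks holding mismatched replicas were already extracted from the replica compare output into a text file with one name per line. The function name count_disks and the input file are hypothetical, and the format of the compare output itself is not reproduced here.

```shell
#!/bin/sh
# Print each disk name with the number of times it occurs,
# most frequent first.
# Assumption: the file named by $1 holds one NSD name per line,
# extracted beforehand from the replica compare output.
count_disks() {
    sort "$1" | uniq -c | sort -rn
}

# Hypothetical usage:
#   count_disks mismatched_disks.txt
```

The disks at the top of the resulting list are the candidates for the diskReadExclusionList used in the following steps.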

  1. To exclude the bad disks from being read, run the following command:
    mmchconfig diskReadExclusionList=<nsd1;nsd2;...> -i
    Setting this configuration option prevents data blocks from being read from the specified disks when the disks have one of the following statuses: ready, suspended, or replacement. If all of the replicas of a data block are on read-excluded disks, then the data block is fetched from whichever of those disks is specified earlier in the diskReadExclusionList.
    Note: Setting this configuration option does not invalidate existing caches. If a block from a read-excluded disk is already cached, then the cached version is returned when the block is read. Writes to the excluded disks are not blocked while the disks are available.

    This configuration option works by marking the in-memory disk data structure with a flag. The status and availability of such disks are preserved without any disk configuration changes.

    This configuration option can be enabled and disabled dynamically, so you do not need to restart the GPFS daemon.

  2. Validate the files that are reported by the online replica compare operation by processing them through their associated application. If the files can be read correctly, then the replica mismatches can be repaired. Otherwise, adjust the diskReadExclusionList and validate the files again.
  3. To repair the replica mismatches, run the following command:
    mmrestripefile -c --inode-number <SnapPath/InodeNumber>
    Where SnapPath is the path to the root directory of the snapshot that contains the inode with replica mismatches, and InodeNumber is the number of that inode. If the replica mismatch is for a file in the active file system, then SnapPath is the path to the root directory of the active file system. For example:
    mmrestripefile -c --inode-number /gpfs/fs1/.snapshots/snap2/11138
    mmrestripefile -c --inode-number /gpfs/fs1/11138
    Run this command on each of the inodes that were reported by the earlier online replica compare operation.
  4. To disable the diskReadExclusionList configuration option, run the following command:
    mmchconfig diskReadExclusionList=DEFAULT -i
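
When many disks or inodes are involved, steps 1 and 3 can be scripted. The following is a minimal dry-run sketch, not a supported tool: the helper names exclusion_cmd and repair_cmds are hypothetical, and the functions only print the commands so that they can be reviewed before they are run. The exclusion list is quoted in the printed command because an unquoted semicolon ends a command in the shell.

```shell
#!/bin/sh
# Dry-run helpers: each function only prints the command it would run.
# Assumption: the operator supplies the NSD names and inode numbers.

# Join the NSD names given as arguments with ';' and print the
# step 1 command. The value is quoted because an unquoted ';'
# would terminate the shell command.
exclusion_cmd() {
    list=$(printf '%s;' "$@")
    list=${list%;}                      # drop the trailing ';'
    printf 'mmchconfig diskReadExclusionList="%s" -i\n' "$list"
}

# Print one step 3 command per inode number read from standard input.
# $1 is SnapPath: the snapshot root directory, or the root directory
# of the active file system.
repair_cmds() {
    snap_path=$1
    while IFS= read -r inode; do
        [ -n "$inode" ] || continue     # skip blank lines
        printf 'mmrestripefile -c --inode-number %s/%s\n' "$snap_path" "$inode"
    done
}

# Hypothetical usage:
#   exclusion_cmd nsd1 nsd2
#   repair_cmds /gpfs/fs1 < inodes.txt
```

After the printed commands are reviewed and run, and the repair is complete, disable the exclusion list as described in step 4.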

This method provides a fast way to exclude data block reads from disks with stale data. To exercise more granular control over which data block replicas are read per file, see Repairing data block replica mismatches with the file level replica selection rule.