Changing home of AFM cache

AFM filesets continue to function in the event of home failures.

AFM filesets serve the cache applications with cached data. If the home is permanently lost, you can create a new home. An existing SW/IW AFM cache can make the following changes to the home. This is not supported on RO/LU filesets.
  1. Replace the home with a completely new empty home where the new target is created using NFS/NSD mapping:

    The administrator must run the mmafmctl failover command without the --target-only option to point to the new home. The new home is expected to be empty. To ensure that extended attributes are synchronized, run the mmafmconfig command on the new home before running the failover command. If the new target is a mapping, failover does not split data transfers and queues them as normal requests without parallel data transfer.

  2. Replace any of the following on an existing home:
    1. Replace the communication protocol (NSD, NFS): The administrator must run the mmafmctl failover command without the --target-only option, using the new target protocol.
      Note: Only the protocol changes. The home path does not change.
    2. Enable or disable parallel data transfer by shifting between NFS and a mapping: The administrator must run the mmafmctl failover command without the --target-only option, using the new target protocol or mapping.
      Note: Only the protocol changes. The home path does not change.
    3. Replace either the IP address or the NFS server using the same communication protocol and home path: The administrator must run the mmafmctl failover command with the --target-only option. The new IP address or NFS server must be on the same home cluster and must be of the same architecture as the old NFS server.
      Note: Only the IP or NFS server changes.
During failover, ensure that the cache file system is mounted on all gateway nodes and that the new home file system is mounted on all the nodes in the home cluster.
Note: Failover does not use parallel data transfers.
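
The choice between the two forms of the failover command can be sketched as a small helper. This is only an illustration: the function name afm_failover_cmd, its "server-only" keyword, and the host node5 are hypothetical and not part of the product. The helper prints the command instead of running it, so that it can be reviewed first.

```shell
# Sketch only: prints the mmafmctl failover command for review, does not run it.
# Usage: afm_failover_cmd DEVICE FILESET NEW_TARGET [server-only]
afm_failover_cmd() {
    device=$1
    fileset=$2
    target=$3
    mode=${4:-full}
    cmd="mmafmctl $device failover -j $fileset --new-target $target"
    # --target-only is used only when just the IP address or NFS server changes
    # (same protocol, same home path). All other cases omit it.
    if [ "$mode" = "server-only" ]; then
        cmd="$cmd --target-only"
    fi
    printf '%s\n' "$cmd"
}

# New empty home, or a protocol or mapping change:
afm_failover_cmd fs1 fileset_SW nfs://node4/gpfs/fshome/fset002new
# Same home path served by a different NFS server (node5 is a made-up host):
afm_failover_cmd fs1 fileset_SW nfs://node5/gpfs/fshome/fset001 server-only
```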

When creating a new home, all cached data and metadata available in the cache are queued in the priority queue to the new home during the failover process. The failover process is not synchronous and completes in the background. The afmManualResyncComplete callback event is triggered when failover is complete. Resync does not split data transfers, even if parallel data transfer is configured and the target is a mapping.
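
Because failover completes in the background, one way to learn that it has finished is to register a callback for the afmManualResyncComplete event. The following is a minimal sketch, not a supported procedure: the function name, the AFM_FAILOVER_LOG variable, the callback identifier afmFailoverDone, and the script path are all assumptions, and the %fsName and %filesetName parameter variables should be checked against the mmaddcallback command description for your release.

```shell
# Hypothetical body of a callback script: appends one line per completed failover.
# AFM_FAILOVER_LOG is an assumed variable; the default path is only an example.
log_failover_complete() {
    fs=$1
    fileset=$2
    log=${AFM_FAILOVER_LOG:-/var/log/afm_failover.log}
    printf '%s failover complete: %s/%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$fs" "$fileset" >> "$log"
}

# Registration, run once on the cache cluster (identifier and path are assumptions):
#   mmaddcallback afmFailoverDone \
#       --command /usr/local/bin/afm_failover_done.sh \
#       --event afmManualResyncComplete \
#       --parms "%fsName %filesetName"
# The script would then call: log_failover_complete "$1" "$2"
```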

If the failover is interrupted due to a gateway node failure or quorum loss, failover is restarted automatically when the cache fileset attempts to go to Active state.

When there are multiple IW caches, the administrator must choose a primary IW cache and fail this over to a new empty home. All of the other IW cache filesets to the old home must be deleted and recreated. If another IW cache is failed over to the same home after failing over another IW cache, all the data in that cache overwrites existing objects at home. For this reason, failover to a non-empty home is discouraged.

When failover is likely to be needed because the old home has failed, the administrator must disable automatic eviction. This ensures that eviction does not free cached data on the cache, because evicted data cannot be recovered if the old home is lost.

Each AFM fileset is independently managed and has a one-to-one relationship with a target, thus allowing different protocol backends to coexist on separate filesets in the same file system. However, AFM does not validate the target for correctness when a fileset is created. The user must specify a valid target. Do not use a target that belongs to the same file system as the AFM fileset. For more information, see mmafmctl command.

The following example shows changing the target for an SW fileset. Consider a SW fileset fileset_SW of file system fs1 that uses home target nfs://node4/gpfs/fshome/fset001. A failover is performed to a new target nfs://node4/gpfs/fshome/fset002new.

# mmlsfileset fs1 fileset_SW --afm

The system displays output similar to -

Filesets in file system 'fs1':
Name        Status  Path                    afmTarget
fileset_SW  Linked  /gpfs/cache/fileset_SW  nfs://node4/gpfs/fshome/fset001


# mmafmctl fs1 getstate -j fileset_SW

The system displays output similar to -

Fileset Name  Fileset Target                      Cache State  Gateway Node  Queue Length  Queue numExec
------------  --------------                      -----------  ------------  ------------  -------------
fileset_SW    nfs://node4/gpfs/fshome/fset001     Active       GatewayNode1  0             6


# mmafmctl fs1 failover -j fileset_SW --new-target nfs://node4/gpfs/fshome/fset002new

The system displays output similar to -

mmafmctl:Performing failover to nfs://node4/gpfs/fshome/fset002new
Fileset fileset_SW changed.
mmafmctl: Failover in progress. This may take while...
Check fileset state or register for callback to know the completion status.


# mmafmctl fs1 getstate -j fileset_SW

The system displays output similar to -

Fileset Name  Fileset Target                      Cache State  Gateway Node  Queue Length  Queue numExec
------------  --------------                      -----------  ------------  ------------  -------------
fileset_SW    nfs://node4/gpfs/fshome/fset002new  NeedsResync  GatewayNode1  6             0


# mmafmctl fs1 getstate -j fileset_SW

The system displays output similar to -

Fileset Name  Fileset Target                      Cache State  Gateway Node  Queue Length  Queue numExec
------------  --------------                      -----------  ------------  ------------  -------------
fileset_SW    nfs://node4/gpfs/fshome/fset002new  Recovery     GatewayNode1  6             0


# mmafmctl fs1 getstate -j fileset_SW

The system displays output similar to -

Fileset Name  Fileset Target                      Cache State  Gateway Node  Queue Length  Queue numExec
------------  --------------                      -----------  ------------  ------------  -------------
fileset_SW    nfs://node4/gpfs/fshome/fset002new  Active       GatewayNode1  0             6


Note: After failover, the fileset changes to states such as NeedsResync or Recovery. Depending on the size or the elapsed time, the fileset remains in these transition states before turning into the Active state. The failover process is complete after the fileset is in Active state.
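
Instead of rerunning getstate by hand, the state check can be scripted. The sketch below is illustrative only: both function names are hypothetical, the parser assumes that the fileset name is the first column and that the target contains no white space, and the 30-second polling interval is an arbitrary default. The loop has no timeout, so a production script would need one.

```shell
# Reads `mmafmctl DEVICE getstate -j FILESET` output on stdin and prints the
# cache state (third column) of the row whose first column is FILESET.
afm_state_from_output() {
    awk -v fset="$1" '$1 == fset { print $3; exit }'
}

# Hypothetical helper: polls until the fileset reports the Active state.
# Usage: wait_for_active DEVICE FILESET [INTERVAL_SECONDS]
wait_for_active() {
    device=$1
    fileset=$2
    interval=${3:-30}
    while :; do
        state=$(mmafmctl "$device" getstate -j "$fileset" | afm_state_from_output "$fileset")
        if [ "$state" = "Active" ]; then
            return 0
        fi
        sleep "$interval"
    done
}
```

For example, wait_for_active fs1 fileset_SW returns after the fileset leaves the NeedsResync and Recovery states.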


# mmlsfileset fs1 --afm

The system displays output similar to -

Filesets in file system 'fs1':
Name        Status  Path                    afmTarget
root        Linked  /gpfs/cache             --
fileset_SW  Linked  /gpfs/cache/fileset_SW  nfs://node4/gpfs/fshome/fset002new

Out-band failover: You can copy all cached data offline from the AFM cache to the new home with any tool that preserves modification time (mtime) with nanosecond granularity, for example rsync version 3.1.0 or later with protocol version 31. After the data is copied, run mmafmctl failover, which compares mtime and file size at home and avoids queuing unnecessary data to home.
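
Before an out-band copy, it can be worth confirming that the installed rsync reports protocol version 31 or later. The following sketch parses the first line of rsync --version; the function name is hypothetical, and the destination path /mnt/newhome/fset002new in the commented usage is a made-up mount point for the new home, not a value from this example configuration.

```shell
# Reads `rsync --version` output on stdin and prints the protocol version
# found on the first line, for example "31".
rsync_protocol_version() {
    awk 'NR == 1 {
        for (i = 1; i + 2 <= NF; i++)
            if ($i == "protocol" && $(i + 1) == "version") print $(i + 2)
    }'
}

# Possible usage (commented out so that nothing is copied by accident):
#   proto=$(rsync --version | rsync_protocol_version)
#   if [ "${proto:-0}" -ge 31 ]; then
#       rsync -a /gpfs/cache/fileset_SW/ /mnt/newhome/fset002new/
#   fi
# Afterward, run mmafmctl failover as in the preceding example.
```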