Changing home of AFM cache

AFM filesets continue to function even when failures occur on the home.

AFM filesets serve the cache applications with cached data. If the home is permanently lost, you can create a new home. You can make the following changes to the home of an existing SW or IW AFM cache. These changes are not supported on RO or LU filesets.
  1. Replace the home with a new empty home where the new target is created by using the NFS or NSD mapping:

    The administrator must run the mmafmctl failover command without the --target-only option to point to the new home. The new home is expected to be empty. To ensure that extended attributes are synchronized, run the mmafmconfig command on the new home before you run the failover command. If the new target is a mapping, failover does not split data transfers and queues them as normal requests without parallel data transfer.

  2. Replace any of the following on an existing home:
    1. Replace the communication protocol (NSD, NFS): The administrator must run the mmafmctl failover command without the --target-only option, by using the new target protocol.
      Note: Only the protocol changes. The home path does not change.
    2. Enable or disable parallel data transfer by shifting between NFS and a mapping: The administrator must run the mmafmctl failover command without the --target-only option, by using the new target protocol or mapping.
      Note: Only the protocol changes. The home path does not change.
    3. Replace either the IP address or the NFS server by using the same communication protocol and home path: The administrator must run the mmafmctl failover command with the --target-only option. The new IP address or NFS server must belong to the same home cluster and must be of the same architecture as the old NFS server.
      Note: Only the IP or NFS server changes.
During failover, ensure that the cache file system is mounted on all gateway nodes, and the new home file system is mounted on all the nodes in the home cluster.
Note: Failover does not use parallel data transfers.
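
The choice between running failover with or without the --target-only option can be summarized in a small dry-run helper. This is only a sketch: the function name failover_cmd and the change-kind labels (new-home, protocol, server) are hypothetical, node5 is an assumed replacement NFS server, and the helper prints the command instead of running it.

```shell
# Print (do not run) the mmafmctl failover command for one kind of home change.
#   new-home | protocol : new empty home, or a new protocol/mapping -> no --target-only
#   server              : only the IP address or NFS server changes -> --target-only
failover_cmd() {
    fs=$1
    fileset=$2
    target=$3
    kind=$4
    case "$kind" in
        new-home|protocol)
            echo "mmafmctl $fs failover -j $fileset --new-target $target" ;;
        server)
            echo "mmafmctl $fs failover -j $fileset --new-target $target --target-only" ;;
        *)
            echo "unknown change kind: $kind" >&2
            return 1 ;;
    esac
}

# Example: same protocol and home path, hypothetical new NFS server node5.
failover_cmd fs1 fileset_SW nfs://node5/gpfs/fshome/fset001 server
```

Review the printed command before you run it on a gateway node.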

When you create a new home, all cached data and metadata available in the cache are queued in the priority queue to the new home during the failover process. The failover process is not synchronous and completes in the background. The afmManualResyncComplete callback event is triggered when failover is complete. Resync does not split data transfers even if parallel data transfer is configured and the target is a mapping.
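
Because failover completes in the background, one option is to register a callback for the afmManualResyncComplete event. The sketch below only prints a candidate mmaddcallback command. The callback identifier afmFailoverDone and the script path are hypothetical, and the --parms variables %fsName and %filesetName are assumptions to verify against the mmaddcallback command documentation for your release.

```shell
# Print (do not run) an mmaddcallback command that registers a script
# for the afmManualResyncComplete event.
#   $1 : callback identifier (hypothetical name chosen by the administrator)
#   $2 : path to the script to run when the event fires (hypothetical)
resync_callback_cmd() {
    id=$1
    script=$2
    # %% in the printf format yields a literal % in the output.
    printf 'mmaddcallback %s --command %s --event afmManualResyncComplete --parms "%%fsName %%filesetName"\n' \
        "$id" "$script"
}

resync_callback_cmd afmFailoverDone /usr/local/bin/failover_done.sh
```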

If the failover is interrupted due to a gateway node failure or quorum loss, failover is restarted automatically when the cache fileset attempts to go to Active state.

When a home has multiple IW cache filesets, the administrator must choose a primary IW cache and fail this cache over to a new empty home. All of the other IW cache filesets that point to the old home must be deleted and re-created. If another IW cache is failed over to a home that already received a failover from a different IW cache, all the data in that cache overwrites existing objects at home. For this reason, failing over to a non-empty home is discouraged.

When failover is likely to be needed because the old home might fail, the administrator must be cautious and disable automatic eviction so that eviction does not free cached data on the cache. Evicted data cannot be recovered if the old home is lost.
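
As an illustration of disabling automatic eviction, the following sketch uses the afmEnableAutoEviction fileset parameter of mmchfileset. The run wrapper and the DRY_RUN variable are hypothetical conveniences: by default the command is printed, not executed. The fs1 and fileset_SW names are taken from the example later in this topic.

```shell
# Echo a command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Turn off automatic eviction so that cached data is not freed
# while the old home is at risk.
run mmchfileset fs1 fileset_SW -p afmEnableAutoEviction=no
```

Set DRY_RUN=0 only after you confirm the printed command is what you intend to run.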

Each AFM fileset is independently managed and has a one-to-one relationship with a target, which allows different protocol backends to coexist on separate filesets in the same file system. However, AFM does not validate the target for correctness when a fileset is created. The user must specify a valid target. Do not use a target that belongs to the same file system as the AFM fileset. For more information, see the mmafmctl command.

The following example shows how to change the target for an SW fileset. Consider an SW fileset named fileset_SW in the fs1 file system that uses the nfs://node4/gpfs/fshome/fset001 home target. A failover is performed to a new target, nfs://node4/gpfs/fshome/fset002new.

# mmlsfileset fs1 fileset_SW --afm
A sample output is as follows:
Filesets in file system 'fs1':
Name         Status    Path                    afmTarget
-----------  --------  ---------------------   ------------------------
fileset_SW   Linked    /gpfs/cache/fileset_SW  nfs://node4/gpfs/fshome/fset001 
# mmafmctl fs1 getstate -j fileset_SW
A sample output is as follows:
Fileset Name Fileset Target                  Cache State   Gateway Node  Queue Length   Queue numExec
------------ --------------                  ------------- ------------  -------------  ------------
fileset_SW   nfs://node4/gpfs/fshome/fset001 Active        GatewayNode1   0             6 
# mmafmctl fs1 failover -j fileset_SW --new-target nfs://node4/gpfs/fshome/fset002new
A sample output is as follows:
mmafmctl:Performing failover to nfs://node4/gpfs/fshome/fset002new
Fileset fileset_SW changed.
mmafmctl: Failover in progress. This may take while...
Check fileset state or register for callback to know the completion status. 
# mmafmctl fs1 getstate -j fileset_SW
A sample output is as follows:
Fileset Name Fileset Target                     Cache State    Gateway Node  Queue Length   Queue numExec
------------ --------------                     -------------  ------------  -------------  -------------
fileset_SW   nfs://node4/gpfs/fshome/fset002new NeedsResync    GatewayNode1  6              0 
# mmafmctl fs1 getstate -j fileset_SW
A sample output is as follows:
Fileset Name Fileset Target                     Cache State   Gateway Node Queue Length  Queue numExec
------------ --------------                     ------------- ------------ ------------- ------------
fileset_SW   nfs://node4/gpfs/fshome/fset002new Recovery      GatewayNode1  6            0 
# mmafmctl fs1 getstate -j fileset_SW
A sample output is as follows:
Fileset Name Fileset Target                     Cache State   Gateway Node  Queue Length  Queue numExec 
------------ --------------                     ------------- ------------  ------------- ------------
fileset_SW   nfs://node4/gpfs/fshome/fset002new Active        GatewayNode1  0             6
Note: After failover, the fileset passes through transitional states such as NeedsResync or Recovery. How long the fileset remains in these states depends on the amount of queued data. The failover process is complete when the fileset is in the Active state.
# mmlsfileset fs1 fileset_SW --afm
A sample output is as follows:
Filesets in file system 'fs1': 
Fileset Name Status    Path                    afmTarget
------------ --------  ----------------------  ----------------------------------
fileset_SW   Linked    /gpfs/cache/fileset_SW  nfs://node4/gpfs/fshome/fset002new
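
The repeated getstate checks in this example can be scripted. The following is a minimal sketch that assumes the column layout shown in the sample output above (fileset name first, cache state third). The function names cache_state and wait_for_active and the 30-second default interval are hypothetical.

```shell
# Read `mmafmctl <fs> getstate` output on stdin and print the
# Cache State column (third field) for the named fileset.
cache_state() {
    awk -v name="$1" '$1 == name { print $3 }'
}

# Poll until the fileset reports Active, printing each observed state.
# The loop has no timeout; interrupt it if the fileset stays in a
# state other than NeedsResync or Recovery.
wait_for_active() {
    fs=$1
    fileset=$2
    interval=${3:-30}
    while :; do
        state=$(mmafmctl "$fs" getstate -j "$fileset" | cache_state "$fileset")
        echo "$fileset: ${state:-unknown}"
        [ "$state" = "Active" ] && return 0
        sleep "$interval"
    done
}
```

For example, wait_for_active fs1 fileset_SW 60 checks the state once a minute on a node where mmafmctl is available.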

Out-band failover: You can copy all cached data offline from the AFM cache to the new home with any tool that preserves modification time (mtime) with nanosecond granularity. An example of such a tool is rsync version 3.1.0 or later with protocol version 31. After the data is copied, run the mmafmctl failover command, which compares mtime and file size at home and avoids queuing unnecessary data to home.
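
Before an out-band copy, it can help to confirm that the installed rsync meets the 3.1.0 minimum. The sketch below parses the first line of rsync --version output. The function names are hypothetical, and the destination path in the commented usage is an assumed mount point of the new home, not a value from this topic.

```shell
# Print the version number from the first line of `rsync --version`
# output on stdin, for example "rsync  version 3.1.0  protocol version 31".
rsync_version_from_banner() {
    awk 'NR == 1 { print $3 }'
}

# Succeed if a dotted version string such as "3.1.0" is 3.1.0 or later.
rsync_version_ok() {
    ver=$1
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%.*}
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 1 ]; }
}

# Usage (not executed here; /mnt/newhome/fset002new is a hypothetical path):
#   ver=$(rsync --version | rsync_version_from_banner)
#   rsync_version_ok "$ver" && rsync -a /gpfs/cache/fileset_SW/ /mnt/newhome/fset002new/
```

The -a option of rsync includes preservation of modification times, which the subsequent failover relies on when it compares mtime and file size.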