Monitoring fileset states for AFM DR

An AFM DR fileset can be in different states depending on the fileset mode and the queue state.

Run the mmafmctl getstate command to view the current cache state.
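
For example, the following commands are a minimal sketch of checking the fileset state; the file system name fs1 and the fileset name drPrimary are placeholders for your own names:

   # Show the state of all AFM and AFM DR filesets in file system fs1
   mmafmctl fs1 getstate

   # Show the state of a single primary fileset
   mmafmctl fs1 getstate -j drPrimary

The fileset state that is shown in the command output corresponds to the states that are described in Table 1.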

See the following table:
Table 1. AFM DR states and their descriptions
Each entry lists the fileset state, the condition under which it occurs, a description, whether the state is healthy or unhealthy, and the administrator's action.

State: Inactive
Condition: AFM primary is created.
Description: Operations have not been initiated on the primary since the last daemon restart.
Health: Healthy
Administrator's action: None.

State: FlushOnly
Condition: Operations are queued.
Description: Operations have not started to flush. This is a temporary state that moves to Active when a write is initiated.
Health: Healthy
Administrator's action: None.

State: Active
Condition: AFM primary is active.
Description: The primary is ready for operations.
Health: Healthy
Administrator's action: None.

State: Dirty
Condition: AFM primary is active.
Description: There are pending changes on the primary that are not yet played at the secondary. This does not hamper normal activity.
Health: Healthy
Administrator's action: None.

State: Recovery
Condition: The primary is accessed after an MDS failure.
Description: Can occur when a new gateway node takes over a fileset as MDS after the old MDS failed.
Health: Healthy
Administrator's action: None.

State: QueueOnly
Condition: The primary is running an operation.
Description: Can occur when operations such as recovery are running and new operations are being queued but are not yet flushed. This is a temporary state.
Health: Healthy
Administrator's action: None.

State: Disconnected
Condition: The MDS cannot connect to the NFS server at the secondary.
Description: Occurs only in a cache cluster that is created over an NFS export. When parallel I/O is configured, this state shows the connectivity between the MDS and the mapped home server, irrespective of other gateway nodes.
Health: Unhealthy
Administrator's action: Correct the errant NFS servers on the secondary cluster.

State: Unmounted
Condition: A primary that uses NFS detects a change in the secondary, either during creation or in the middle of an operation if the secondary exports are interfered with.
Description: This can occur if:
  • The secondary NFS is not accessible.
  • The secondary exports are not exported properly.
  • The secondary export does not exist.
Health: Unhealthy
Administrator's action:
  1. Rectify the NFS export issue as described in the secondary setup section and retry access.
  2. Relink the primary if it does not recover.
After the mountRetryInterval of the MDS elapses, the primary retries connecting to the secondary.

State: Unmounted
Condition: A primary that uses the GPFS™ protocol detects a change in the secondary cluster, either during creation or in the middle of an operation.
Description: Occurs when there are problems accessing the local mount of the remote file system.
Health: Unhealthy
Administrator's action: Check the remote file system mount on the primary cluster and remount it if necessary.

State: Dropped
Condition: Recovery failed.
Description: Occurs when the local file system is full, space is not available on the primary, or a policy fails during recovery.
Health: Unhealthy
Administrator's action: Fix the issue and access the fileset to retry recovery.

State: Dropped
Condition: A primary with active queue operations is forcibly unlinked.
Description: All queued operations are de-queued; the fileset remains in the Dropped state and moves to the Inactive state when the unlinking is complete. This is a temporary state.
Health: Healthy
Administrator's action: None.

State: Dropped
Condition: The old gateway node starts functioning properly after a failure.
Description: AFM internally transfers queues from one gateway node to another to handle gateway node failures.
Health: Healthy
Administrator's action: None. The system resolves this state on the next access.

State: Dropped
Condition: Primary creation, or the middle of an operation, when the home exports changed.
Description: Export problems at the secondary, such as:
  • The home path is not exported on all NFS server nodes that interact with the cache clusters. Even if the home cluster is exported after operations have started on the fileset, problems might persist.
  • The fsid on the home cluster was changed after fileset operations began.
  • Not all home NFS servers have the same fsid for the same export path.
Health: Unhealthy
Administrator's action:
  1. Fix the NFS export issue as described in the secondary setup section and retry access.
  2. Relink the primary if the cache cluster does not recover.
After the mountRetryInterval elapses, the MDS retries connecting to the secondary.

State: Dropped
Condition: During recovery or normal operation.
Description: If the gateway queue memory is exceeded, the queue can be dropped. The memory must be increased to accommodate all requests and bring the queue back to the Active state.
Health: Unhealthy
Administrator's action: Increase afmHardMemThreshold; see the example after this table.

State: NeedsResync
Condition: Recovery on the primary.
Description: This is a rare state that is possible only under error conditions during recovery.
Health: Unhealthy
Administrator's action: None. The problem is fixed automatically in a subsequent recovery.

State: NeedsResync
Condition: Failback on the primary, or conversion of a GPFS or SW fileset to a primary.
Description: This is a rare state that is possible only under error conditions during failback or conversion.
Health: Unhealthy
Administrator's action: Rerun the failback or conversion.

State: PrimInitProg
Condition: The primary and secondary relationship is being set up during:
  • creation of a primary fileset.
  • conversion of a gpfs, sw, or iw fileset to a primary fileset.
  • changing the secondary of a primary fileset.
Description: This state is used while the primary and secondary are establishing their relationship and the psnap0 snapshot is in progress. All operations are disallowed until psnap0 is taken locally. The state moves to Active when psnap0 is queued and played on the secondary side.
Health: Healthy
Administrator's action: Review errors from the psnap0 failure if the fileset state does not become Active.

State: PrimInitFail
Condition: Setting up the primary and secondary relationship failed during:
  • creation of a primary fileset.
  • conversion of a gpfs, sw, or iw fileset to a primary fileset.
  • changing the secondary of a primary fileset.
Description: This is a rare failure state in which psnap0 was not created at the primary. In this state, no data is moved from the primary to the secondary. Check that the gateway nodes are up and that the file system is mounted on them on the primary cluster. The secondary fileset must also be set up correctly and available for use.
Health: Unhealthy
Administrator's action:
  • Review errors after the psnap0 failure.
  • Rerun the mmafmctl convertToPrimary command without any additional parameters to end this state; see the example after this table.

State: FailbackInProgress
Condition: Primary failback started.
Description: This is the state when failback is initiated on the primary.
Health: Healthy
Administrator's action: None.
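
Two of the administrator actions in Table 1 refer to specific commands. The following commands are a minimal sketch of those actions; the file system name fs1, the fileset name drPrimary, and the threshold value 8G are placeholders that you must adapt to your environment:

   # Increase the gateway queue memory limit to address a Dropped state caused by memory pressure
   mmchconfig afmHardMemThreshold=8G

   # Rerun the conversion without additional options to clear the PrimInitFail state
   mmafmctl fs1 convertToPrimary -j drPrimary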