AFM events
The following table lists the events that are created for the AFM component.
Event | Event Type | Severity | Call Home | Details |
---|---|---|---|---|
afm_cache_disconnected | STATE_CHANGE | WARNING | no | Message: Fileset {0} is disconnected. |
Description: The AFM cache fileset is not connected to its home server. | ||||
Cause: Shows that the connectivity between the AFM Gateway and the mapped home server is lost. | ||||
User Action: The required action depends on the kind of disconnection. Check the settings on both the home and cache sites and correct the connectivity issues. The state automatically changes to ACTIVE after the issues are resolved. | ||||
afm_cache_dropped | STATE_CHANGE | ERROR | no | Message: Fileset {0} is in the DROPPED state. |
Description: The AFM cache fileset state moves to the DROPPED state. | ||||
Cause: An AFM cache fileset can move to the DROPPED state for several reasons, such as recovery failures or failback failures. | ||||
User Action: There are many reasons that can cause the cache to go to DROPPED state. For more information, see the Monitoring fileset states for AFM (DR) section in the IBM Storage Scale: Problem Determination Guide. | ||||
afm_cache_expired | INFO | ERROR | no | Message: Fileset {0} in {1} mode is now in the EXPIRED state. |
Description: Cache contents are no longer accessible due to expiration of time. | ||||
Cause: Cache contents are no longer accessible due to expiration of time. | ||||
User Action: Check the network connectivity to the home server as well as the home server availability. | ||||
afm_cache_inactive | STATE_CHANGE | INFO | no | Message: The AFM cache fileset {0} is in the INACTIVE state. |
Description: The AFM fileset is in the INACTIVE state until initial operations on the fileset are triggered by the user. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
afm_cache_recovery | STATE_CHANGE | WARNING | no | Message: The AFM cache fileset {0} in {1} mode is in the RECOVERY state. |
Description: In this state, the AFM cache fileset recovers from a previous failure and identifies changes that need to be synchronized to its home server. | ||||
Cause: A previous failure triggered a cache recovery. | ||||
User Action: This state automatically changes back to ACTIVE when the recovery is finished. | ||||
afm_cache_stopped | STATE_CHANGE | WARNING | no | Message: The AFM fileset {0} is stopped. |
Description: The AFM cache fileset is stopped. | ||||
Cause: The AFM cache fileset is in the Stopped state. | ||||
User Action: Run the mmafmctl restart command to continue operations on the fileset. | ||||
afm_cache_suspended | STATE_CHANGE | WARNING | no | Message: AFM fileset {0} is suspended. |
Description: The AFM cache fileset is suspended. | ||||
Cause: The AFM cache fileset is in the Suspended state. | ||||
User Action: Run the mmafmctl resume command to resume operations on the fileset. | ||||
afm_cache_unmounted | STATE_CHANGE | ERROR | no | Message: The AFM cache fileset {0} is in unmounted state. |
Description: The AFM cache fileset is in an Unmounted state because of issues on the home site. | ||||
Cause: The AFM cache fileset is in this state when the home server's NFS mount is not accessible, the home server's exports are not exported properly, or the home server's export does not exist. | ||||
User Action: Resolve issues on the home server's site. Afterwards, this state changes automatically. | ||||
afm_cache_up | STATE_CHANGE | INFO | no | Message: An 'Active' or 'Dirty' status is expected in the mmdiag --afm command output, and the output shows that the cache is in a HEALTHY state. |
Description: The AFM cache is up and ready for operations. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
afm_cmd_requeued | STATE_CHANGE | WARNING | no | Message: Messages are requeued on the AFM fileset {0}. Details: {1}. |
Description: Triggered during replication when messages are queued up again because of errors. These messages are retried after 15 minutes. | ||||
Cause: Callback afmCmdRequeued is being processed. | ||||
User Action: It is usually a transient state. Track this event. If the problem remains, then contact IBM Support. | ||||
afm_event_connected | STATE_CHANGE | INFO | no | Message: The AFM node {0} has regained connection to the home site. Details: {1}. |
Description: Triggered when a gateway node connects to the afmTarget of the fileset that it is serving. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
afm_event_disconnected | STATE_CHANGE | ERROR | no | Message: The AFM node {0} has lost connection to the home site. Fileset {1}. Details: {2}. |
Description: Triggered when a gateway node gets disconnected from the afmTarget of the fileset that it is serving. | ||||
Cause: Callback afmHomeDisconnected is being processed. | ||||
User Action: Check the network connectivity to the home server as well as the home server availability. | ||||
afm_failback_complete | STATE_CHANGE | WARNING | no | Message: The AFM cache fileset {0} in {1} mode is in the 'FailbackCompleted' state. |
Description: The independent writer failback is finished. | ||||
Cause: The independent writer failback is finished and needs further user actions. | ||||
User Action: The administrator must run the mmafmctl failback --stop command to move the IW cache to the ACTIVE state. | ||||
afm_failback_needed | STATE_CHANGE | ERROR | no | Message: The AFM cache fileset {0} in {1} mode is in the NeedFailback state. |
Description: A previous failback operation could not be completed and needs to be re-run. | ||||
Cause: This state is reached when a previously initialized failback was interrupted and not completed. | ||||
User Action: Failback automatically gets triggered on the fileset. The administrator can manually re-run a failback by using the mmafmctl failback command. | ||||
afm_failback_running | STATE_CHANGE | WARNING | no | Message: The AFM cache fileset {0} in {1} mode is in the FailbackInProgress state. |
Description: A failback process on the independent writer cache is in progress. | ||||
Cause: A failback process has been initiated on the independent writer cache and is in progress. | ||||
User Action: No user action is needed at this point. After completion, the state automatically changes to the FailbackCompleted state. | ||||
afm_failover_running | STATE_CHANGE | WARNING | no | Message: The AFM cache fileset {0} is in FailoverInProgress state. |
Description: The AFM cache fileset is in the middle of a failover process. | ||||
Cause: The AFM cache fileset is in the middle of a failover process. | ||||
User Action: No user action is needed at this point. The cache state is moved automatically to the ACTIVE state when the failover is completed. | ||||
afm_fileset_changed | INFO | INFO | no | Message: AFM fileset {0} is changed. |
Description: An AFM fileset is changed. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
afm_fileset_created | INFO | INFO | no | Message: AFM fileset {0} is created. |
Description: An AFM fileset is created. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
afm_fileset_deleted | INFO | INFO | no | Message: AFM fileset {0} is deleted. |
Description: An AFM fileset is deleted. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
afm_fileset_expired | INFO | WARNING | no | Message: The contents of the AFM cache fileset {0} are expired. |
Description: The AFM cache fileset contents are expired. | ||||
Cause: The contents of a fileset expire either as a result of the fileset being disconnected for the expiration timeout value or when the fileset is marked as expired using the AFM administration commands. This event is triggered through an AFM callback. | ||||
User Action: Check why the fileset is disconnected to refresh the contents. | ||||
afm_fileset_found | INFO_ADD_ENTITY | INFO | no | Message: The AFM fileset {0} was found. |
Description: An AFM fileset was detected. | ||||
Cause: An AFM fileset was detected through the appearance of the fileset in the mmdiag --afm command output. | ||||
User Action: N/A | ||||
afm_fileset_linked | INFO | INFO | no | Message: AFM fileset {0} is linked. |
Description: An AFM fileset is linked. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
afm_fileset_unexpired | INFO | INFO | no | Message: The contents of the AFM cache fileset {0} are unexpired. |
Description: The contents of the AFM cache fileset are no longer expired and are available for operations. This event is triggered when the home is reconnected and the cache contents are available, or when the administrator runs the mmafmctl unexpire command on the cache fileset. This event is triggered through an AFM callback. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
afm_fileset_unlinked | INFO | INFO | no | Message: AFM fileset {0} is unlinked. |
Description: An AFM fileset is unlinked. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
afm_fileset_unmounted | STATE_CHANGE | ERROR | no | Message: The AFM fileset {0} was unmounted because the remote side is not reachable. Details: {1}. |
Description: Triggered when the fileset is moved to an Unmounted state because the NFS server is not reachable or the remote cluster mount is not available for the GPFS native protocol. | ||||
Cause: After 300 seconds, the cache retries connecting to home, and it moves to the Active state. If AFM uses the native GPFS protocol as the target, the cache state is moved to the Unmounted state because the local mount of the remote file system is not accessible. | ||||
User Action: Remount the remote file system on the local cache cluster. | ||||
afm_fileset_vanished | INFO_DELETE_ENTITY | INFO | no | Message: The AFM fileset {0} has vanished. |
Description: An AFM fileset is not in use anymore. | ||||
Cause: The AFM fileset is not in use anymore. This is detected through the absence of the fileset in the mmdiag --afm command output. | ||||
User Action: N/A | ||||
afm_flush_only | STATE_CHANGE | INFO | no | Message: The AFM cache fileset {0} is in the FlushOnly state. |
Description: Indicates that operations are queued, but have not started to flush to the home server. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
afm_home_connected | STATE_CHANGE | INFO | no | Message: The AFM fileset {0} has regained connection to the home site. Details: {1}. |
Description: Callback afmHomeConnected is being processed. This is a healthy state. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
afm_home_disconnected | STATE_CHANGE | ERROR | no | Message: The AFM fileset {0} has lost connection to the home site. Details: {1}. |
Description: Triggered when a gateway node gets disconnected from the afmTarget of the fileset that it is serving. | ||||
Cause: Callback afmHomeDisconnected is being processed. | ||||
User Action: Check the network connectivity to the home server as well as the home server availability. | ||||
afm_pconflicts_empty | STATE_CHANGE | INFO | no | Message: .pconflicts is healthy. |
Description: Clear TIPS events from AFM. The .pconflicts directory is clean. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
afm_pconflicts_storage | TIP | TIP | no | Message: The fileset {0} .pconflicts directory contains user data. Examine and remove unused files to free storage. |
Description: AFM detected conflicting file changes and stored the conflicting files in the .pconflicts directory. | ||||
Cause: .pconflicts directory is not empty. | ||||
User Action: Analyze contents of .pconflicts directories to free up storage. | ||||
afm_prim_init_fail | STATE_CHANGE | ERROR | no | Message: The AFM cache fileset {0} is in the PrimInitFail state. |
Description: The AFM cache fileset is in the PrimInitFail state. No data is moved from the primary to the secondary fileset. | ||||
Cause: This rare state appears if the initial creation of psnap0 on the primary cache fileset failed. | ||||
User Action: Check whether the fileset is available and exported to be used as primary. The gateway node should be able to access this mount, and the primary ID should be set up on the secondary gateway. You can try running the mmafmctl convertToPrimary command on the primary fileset again. | ||||
afm_prim_init_running | STATE_CHANGE | WARNING | no | Message: The AFM primary cache fileset {0} is in the PrimInitProg state. |
Description: The AFM cache fileset is synchronizing psnap0 with its secondary AFM cache fileset. | ||||
Cause: This AFM cache fileset is a primary fileset and synchronizing the content of psnap0 to the secondary AFM cache fileset. | ||||
User Action: This state changes back to 'Active' automatically when the synchronization is finished. | ||||
afm_ptrash_empty | STATE_CHANGE | INFO | no | Message: .ptrash is healthy. |
Description: Clear TIPS events from AFM. The .ptrash directory is clean. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
afm_ptrash_storage | TIP | TIP | no | Message: The fileset {0} .ptrash directory contains user data. Examine and remove unused files to free storage. |
Description: .ptrash directory is not empty. | ||||
Cause: .ptrash directory is not empty. | ||||
User Action: Analyze contents of .ptrash directories to free up storage. | ||||
afm_queue_dropped | STATE_CHANGE | ERROR | no | Message: The AFM cache fileset {0} encountered an error synchronizing with its remote cluster. Details: {1}. |
Description: The AFM cache fileset encountered an error synchronizing with its remote cluster. It cannot synchronize with the remote cluster until AFM recovery is executed. | ||||
Cause: This event occurs when a queue is dropped on the gateway node. | ||||
User Action: Initiate I/O on this fileset to trigger recovery. | ||||
afm_queue_only | STATE_CHANGE | INFO | no | Message: The changes of AFM cache fileset {0} in {1} mode are not flushed yet to home. |
Description: This state applies to SW and IW caches. A cache fileset is moved to QueueOnly when operations at the cache are queued but not yet flushed. This can happen in states such as recovery, resync, or failover, when the queue is in the process of being flushed to home. This state is temporary, and the user can continue normal activity. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
afm_recovery_failed | STATE_CHANGE | ERROR | no | Message: AFM recovery on fileset {0} failed with error {1}. |
Description: AFM recovery has failed. | ||||
Cause: AFM recovery has failed. | ||||
User Action: Recovery is retried on next access after the recovery retry interval. Alternatively, you can manually resolve known problems and recover the fileset. | ||||
afm_recovery_finished | STATE_CHANGE | INFO | no | Message: A recovery process ended for the AFM cache fileset {0}. |
Description: A recovery process has ended on this AFM fileset. | ||||
Cause: N/A | ||||
User Action: N/A | ||||
afm_recovery_running | STATE_CHANGE | WARNING | no | Message: AFM fileset {0} is triggered for recovery start. |
Description: A recovery process was started on this AFM cache fileset. | ||||
Cause: A recovery process was started on this AFM cache fileset. | ||||
User Action: The cache fileset state moves to the healthy state when recovery is complete. Monitor this event. | ||||
afm_resync_needed | STATE_CHANGE | WARNING | no | Message: The AFM cache fileset {0} in {1} mode is in the NeedsResync state. |
Description: The AFM cache fileset detects some accidental corruption of data on the home server. | ||||
Cause: The AFM cache fileset detects some accidental corruption of data on the home server. | ||||
User Action: Run the mmafmctl resync command to trigger a resync. The fileset moves automatically to the ACTIVE state afterward. | ||||
afm_rpo_miss | STATE_CHANGE_EXTERNAL | WARNING | no | Message: The AFM recovery point objective (RPO) is missed for {id}. |
Description: The primary fileset is triggering an RPO snapshot which is expected to complete within a specified interval (RPO). This time interval is exceeded. | ||||
Cause: The callback afmRPOMiss was triggered due to the network delay, too much data to replicate, or an error during snapshot creation. | ||||
User Action: Ensure that the network connectivity is sufficient to transfer all changes within the RPO interval and check that the Home and Cache sites are operating correctly. If there are no issues, use the mmhealth event resolve <fileset identifier> command to manually clear this health event. | ||||
afm_rpo_sync | STATE_CHANGE_EXTERNAL | INFO | no | Message: An AFM RPO snapshot is completed {id}. |
Description: The primary fileset is triggering an RPO snapshot at a specified interval. The snapshot is completed successfully. | ||||
Cause: The callback afmDRRPOSync was triggered. | ||||
User Action: N/A | ||||
afm_sensors_active | TIP | INFO | no | Message: The AFM perfmon sensor {0} is active. |
Description: The AFM perfmon sensors are active. This event's monitor is running only once an hour. | ||||
Cause: The value of the AFM perfmon sensors' period attribute is greater than 0. | ||||
User Action: N/A | ||||
afm_sensors_inactive | TIP | TIP | no | Message: The following AFM perfmon sensor {0} is inactive. |
Description: The AFM perfmon sensors are inactive. This event's monitor is running only once an hour. | ||||
Cause: The value of the AFM perfmon sensors' period attribute is 0. | ||||
User Action: Set the period attribute of the AFM sensors to a value greater than 0 by running the mmperfmon config update SensorName.period=N command, where SensorName is the name of one of the AFM sensors and N is a natural number greater than 0. Alternatively, you can hide this event by using the mmhealth event hide afm_sensors_inactive command. | ||||
afm_sensors_not_configured | TIP | TIP | no | Message: The AFM perfmon sensor {0} is not configured. |
Description: The AFM perfmon sensor does not exist in the mmperfmon config show command output. | ||||
Cause: The AFM perfmon sensor is not configured in the sensors' configuration file. | ||||
User Action: Add the sensors to the perfmon configuration by using the mmperfmon config add --sensors SensorFile command. An example configuration file can be found on the mmperfmon command page. |
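
Several User Action entries in the table name a specific command. As a rough illustration of how those entries might be collected for a runbook or alert handler, the following Python sketch maps event names to the command text the table gives. The names `SUGGESTED_COMMANDS` and `suggested_command` are hypothetical and not part of IBM Storage Scale, and the command strings are copied in the abbreviated form used in the table, without any additional arguments that the full command syntax may require.

```python
from typing import Optional

# Hypothetical lookup built from the User Action entries in the table.
# Command strings are abbreviated exactly as the table writes them;
# consult the command reference for the complete syntax.
SUGGESTED_COMMANDS = {
    "afm_cache_stopped": "mmafmctl restart",
    "afm_cache_suspended": "mmafmctl resume",
    "afm_failback_complete": "mmafmctl failback --stop",
    "afm_failback_needed": "mmafmctl failback",
    "afm_resync_needed": "mmafmctl resync",
    "afm_sensors_inactive": "mmperfmon config update SensorName.period=N",
    "afm_sensors_not_configured": "mmperfmon config add --sensors SensorFile",
}


def suggested_command(event_name: str) -> Optional[str]:
    """Return the command named in the table for an event, or None."""
    return SUGGESTED_COMMANDS.get(event_name)


if __name__ == "__main__":
    for name in ("afm_cache_stopped", "afm_cache_up"):
        print(name, "->", suggested_command(name))
```

Events whose User Action is N/A, or is descriptive rather than a single command, have no entry, so the function returns `None` for them.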