Monitoring AFM and AFM DR using the GUI

The Files > Active File Management page in the IBM Storage Scale GUI provides an easy way to monitor the performance, health status, and configuration of the AFM and AFM DR relationships in the IBM Storage Scale cluster. It also provides details of the gateway nodes that are part of the AFM or AFM DR relationships.

The following options are available to monitor AFM and AFM DR relationships and gateway nodes:
  1. A quick view that gives the details of the top relationships between cache and home sites in an AFM or AFM DR relationship. It also shows the performance of gateway nodes by used memory and number of queued messages. The graphs that are displayed in the quick view are refreshed regularly. The refresh interval depends on the selected time frame. The following list shows the refresh interval that corresponds to each time frame:
    • Every minute for the 5 minutes time frame
    • Every 15 minutes for the 1 hour time frame
    • Every 6 hours for the 24 hours time frame
    • Every two days for the 7 days time frame
    • Every seven days for the 30 days time frame
    • Every four months for the 365 days time frame
  2. Different performance metrics and configuration details in tabular format. The following tables are available:
    Cache
    Provides information about configuration, health, and performance of the AFM feature that is configured for data caching and replication. It also provides information on the estimated time that is needed to refresh or flush queues during recovery or resync operations.
    Disaster Recovery
    Provides information about configuration, health, and performance of AFM DR configuration in the cluster.
    Gateway Nodes
    Provides details of the nodes that are designated as gateway nodes in the AFM or AFM DR configuration.

    To find an AFM or AFM DR relationship or a gateway node with extreme values, you can sort the values that are displayed in the table by different attributes. Click a performance metric in the table header to sort the data by that metric. Use the time range selector in the upper right corner to select the time range over which the values in the table are averaged and which the charts in the overview cover. The metrics in the table do not update automatically. Use the refresh button at the top of the table to refresh the table with more recent data.

  3. A detailed view of the performance and health aspects of an individual AFM or AFM DR relationship or gateway node. To see the detailed view, either double-click the row that lists the relationship or gateway node, or select the item in the table and click View Details. The following details are available for each item:
    Cache
    • Overview: Provides the number of available cache inodes and displays charts that show the amount of data that is transferred, data backlog, and memory used for the queue.
    • Events: Provides details of the system health events reported for the AFM component.
    • Snapshots: Provides details of the snapshots that are available for the AFM fileset.
    • Gateway Nodes: Provides details of the nodes that are configured as gateway nodes in the AFM configuration.
    Disaster Recovery
    • Overview: Provides the number of available primary inodes and displays charts that show the amount of data that is transferred, data backlog, and memory used for the queue.
    • Events: Provides details of the system health events reported for the AFM component.
    • Snapshots: Provides details of the snapshots that are available for the AFM fileset.
    • Gateway Nodes: Provides details of the nodes that are configured as gateway nodes in the AFM configuration.
    Gateway Nodes
    The details of gateway nodes are available under the following tabs:
    • The Overview tab provides performance charts for the following:
      • Client IOPS
      • Client data rate
      • Server data rate
      • Server IOPS
      • Network
      • CPU
      • Load
      • Memory
    • The Events tab helps you monitor the events that are reported on the node. As on the Events page, you can mark events as read and run fix procedures from this view. Only current issues are shown in this view. The Monitoring > Events page displays the entire set of events that are reported in the system.
    • The File Systems tab provides performance details of the file systems that are mounted on the node. It shows the read and write throughput, average read and write transaction sizes, and read and write latency of each file system.

      Use the Mount File System or Unmount File System options to mount or unmount one or more file systems on the selected node. The nodes on which the file system needs to be mounted or unmounted can be selected individually from the list of nodes or based on node classes.

    • The NSDs tab gives the status of the disks that are attached to the node. The NSDs tab appears only if the node is configured as an NSD server.
    • The SMB and NFS tabs provide the performance details of the SMB and NFS services that are hosted on the node. These tabs appear only if the node is configured as a protocol node.
    • The AFM tab provides details of the configuration and status of the AFM and AFM DR relationships for which the node is configured as the gateway node.

      It also displays the number of AFM filesets and the corresponding export server maps. Each export map establishes a mapping between the gateway node and the NFS host name to allow parallel data transfers from cache to home. One gateway node can be mapped only to a single NFS server and one NFS server can be mapped to multiple gateway nodes.

    • The Network tab displays the network performance details.
    • The Properties tab displays the basic attributes of the node. Use the Prevent file system mounts option to specify whether file systems are prevented from mounting on the node.
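
The export map rule described for the AFM tab (a gateway node maps to only one NFS server, while an NFS server can be mapped to several gateway nodes) can be illustrated with a small consistency check. This is a minimal sketch and not part of the product: the function name `check_export_map` and the node names are hypothetical, and the input is assumed to be a plain list of (gateway node, NFS server) pairs that you have collected yourself.

```python
from collections import defaultdict


def check_export_map(pairs):
    """Check (gateway_node, nfs_server) pairs against the export map rule.

    Hypothetical helper for illustration only.
    Returns a tuple (server_to_gateways, conflicts):
      - server_to_gateways: NFS server -> list of gateway nodes mapped to it
      - conflicts: gateway node -> set of NFS servers, for every gateway
        that appears with more than one NFS server (a rule violation)
    """
    gateway_to_server = {}
    conflicts = {}
    for gateway, server in pairs:
        # Keep the first server seen for each gateway.
        previous = gateway_to_server.setdefault(gateway, server)
        if previous != server:
            # The same gateway is listed with a second NFS server.
            conflicts.setdefault(gateway, {previous}).add(server)

    # Invert the mapping: one NFS server may serve many gateway nodes.
    server_to_gateways = defaultdict(list)
    for gateway, server in gateway_to_server.items():
        server_to_gateways[server].append(gateway)
    return dict(server_to_gateways), conflicts


# Example with made-up node names.
valid_pairs = [("gw1", "nfs1"), ("gw2", "nfs1"), ("gw3", "nfs2")]
servers, conflicts = check_export_map(valid_pairs)
print(servers)    # two gateways share nfs1, which the rule allows
print(conflicts)  # empty: no gateway is mapped to more than one server
```

Adding a pair such as `("gw1", "nfs2")` to the input would report `gw1` as a conflict, because that gateway would then be mapped to two NFS servers.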

Monitoring AFM and AFM DR configuration and performance in the remote cluster

The IBM Storage Scale GUI can monitor only a single cluster. If you want to monitor the AFM and AFM DR configuration, health, and performance across clusters, the GUI node of the local cluster must establish a connection with the GUI node of the remote cluster. By establishing a connection between GUI nodes, both the clusters can monitor each other. To enable remote monitoring among clusters, the GUI nodes that communicate with each other must be at the same software level.

To establish a connection with the remote cluster, perform the following steps:
  1. Perform the following steps on the local cluster to raise the access request:
    1. Select the Request Access option that is available under the Outgoing Requests tab to raise the request for access.
    2. In the Request Remote Cluster Access dialog, enter an alias for the remote cluster name and specify the GUI nodes to which the local GUI node must establish the connection.
    3. If you know the credentials of the security administrator of the remote cluster, you can also add the user name and password of the remote cluster administrator and skip step 2 of this procedure, in which access is granted on the remote cluster.
    4. Click Send to submit the request.
  2. Perform the following steps on the remote cluster to grant access:
    1. When the connection request is received, the GUI displays the details of the request in the Access > Remote Connections > Incoming Requests page.
    2. Select Grant Access to grant the permission and establish the connection.
Now, the requesting cluster GUI can monitor the remote cluster. To enable both clusters to monitor each other, repeat the procedure with reversed roles through the respective GUIs.
Note: Only a GUI user with the Security Administrator role can grant access to remote connection requests.
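
The request and grant procedure above can be summarized as a small model. The following is only a sketch of the rules stated in this topic, not of how the GUI is implemented: the class `AccessRequest`, the function `can_monitor`, the role label, and the cluster names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical label for the role that is allowed to grant requests.
SECURITY_ADMIN = "SecurityAdministrator"


@dataclass
class AccessRequest:
    """An access request raised by one cluster toward another (illustrative)."""
    requester: str         # cluster that raised the request
    target: str            # cluster that receives the request
    granted: bool = False

    def grant(self, approver_role):
        # Only a user with the Security Administrator role can grant access.
        if approver_role != SECURITY_ADMIN:
            raise PermissionError("only a Security Administrator can grant access")
        self.granted = True


def can_monitor(requests, observer, observed):
    """Return True if 'observer' holds a granted request toward 'observed'."""
    return any(
        r.granted and r.requester == observer and r.target == observed
        for r in requests
    )


# Example with made-up cluster names.
requests = [AccessRequest(requester="clusterA", target="clusterB")]
requests[0].grant(SECURITY_ADMIN)
print(can_monitor(requests, "clusterA", "clusterB"))  # access was granted
print(can_monitor(requests, "clusterB", "clusterA"))  # reverse direction not requested
```

In this model a granted request gives monitoring access in one direction only, which is why the procedure is repeated with reversed roles when both clusters need to monitor each other.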

When the remote cluster monitoring capabilities are enabled, you can view the following remote cluster details in the local AFM GUI:

  • On home and secondary, you can see the AFM relationship configuration, health status, and performance values in the Cache and Disaster Recovery grids.
  • On the Overview tab of the detailed view, the number of available home and secondary inodes is shown.
  • On the Overview tab of the detailed view, details such as NFS throughput, IOPS, and latency are available if the protocol is NFS.

The performance and status information of gateway nodes is not transferred to home.

Creating and deleting peer and RPO snapshots through the GUI

When a peer snapshot is taken, it creates a snapshot of the cache fileset and then queues a snapshot creation at the home site. This ensures application consistency at both cache and home sites. The recovery point objective (RPO) snapshot is a type of peer snapshot that is used in the AFM DR setup. It is used to maintain consistency between the primary and secondary sites in an AFM DR configuration.
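
The ordering behind a peer snapshot can be illustrated with a toy simulation. It is a sketch under stated assumptions and not the AFM implementation: it assumes that updates made at the cache are replayed at home strictly in the order in which they were queued, that home starts empty, and that the snapshot creation request travels through the same queue. The function `simulate_peer_snapshot` and the sample data are hypothetical.

```python
from collections import deque


def simulate_peer_snapshot(writes_before, writes_after):
    """Toy model of a peer snapshot (illustrative, not the AFM implementation).

    writes_before / writes_after: lists of (key, value) updates applied at the
    cache before and after the peer snapshot is taken.
    Returns the contents captured by the cache and home snapshots.
    """
    cache, home = {}, {}
    queue = deque()   # operations waiting to be replayed at home (assumed FIFO)
    snapshots = {}

    def write(key, value):
        cache[key] = value
        queue.append(("write", key, value))

    for key, value in writes_before:
        write(key, value)

    # Peer snapshot: snapshot the cache now, then queue a snapshot
    # creation request for the home site behind the pending updates.
    snapshots["cache"] = dict(cache)
    queue.append(("snapshot",))

    for key, value in writes_after:
        write(key, value)

    # Home replays the queued operations in order.
    while queue:
        op = queue.popleft()
        if op[0] == "write":
            home[op[1]] = op[2]
        else:
            snapshots["home"] = dict(home)
    return snapshots


snaps = simulate_peer_snapshot([("a", 1), ("b", 2)], [("a", 9)])
print(snaps["cache"] == snaps["home"])  # both snapshots capture the same state
```

Because the snapshot request is queued behind the updates that preceded it, the home snapshot in this model captures the same state as the cache snapshot, even though later updates are already waiting in the queue.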

Use the Create Peer Snapshot option in the Files > Snapshots page to create peer snapshots. You can view and delete these peer snapshots from the Snapshots page and also from the detailed view of the Files > Active File Management page.