AFM resync version 2
AFM resync version 2 improves replication performance. This feature is suitable for a cluster that runs under heavy stress and heavy workloads. It resolves dependencies between queued messages on demand while a resync or recovery is running. Message queuing performance improves because runtime filtering or dependency resolution is used. As a result, the gateway nodes use less memory and queued messages are replicated quickly. This feature also helps in the role reversal of AFM-DR filesets.
An AFM fileset or an AFM-DR primary fileset creates a queue of file operations on the designated gateway node for replication. These operations are run asynchronously on the target or home cluster, based on the setting of the afmAsyncDelay parameter. For dependent operations in the queue, this asynchronous delay helps to filter out operations that need not be sent to the target over the underlying protocol. This filtering saves bandwidth. For more information, see Asynchronous delay.
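The filtering idea can be illustrated with a small shell sketch. It is a hypothetical model only, not the AFM implementation: the function name filter_queue, the "operation path" input format, and the rule that a file created and removed within one delay window never needs to reach the target are all assumptions made for illustration.

```shell
# Hypothetical illustration only: this is NOT how AFM implements filtering.
# Input: one "operation path" pair per line on standard input. All lines are
# assumed to be queued within a single afmAsyncDelay window.
filter_queue() {
  awk '
    { op[NR] = $1; path[NR] = $2 }        # remember every queued operation
    $1 == "create" { created[$2] = 1 }    # paths created inside the window
    $1 == "remove" { removed[$2] = 1 }    # paths removed inside the window
    END {
      for (i = 1; i <= NR; i++) {
        p = path[i]
        # Assumed rule: a file that is created and removed inside the same
        # window has no net effect, so none of its operations are sent.
        if ((p in created) && (p in removed)) continue
        print op[i], p
      }
    }
  '
}
```

For example, feeding the function the operations create, write, and remove for one temporary file together with create and write for a second file prints only the two operations on the second file.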
When heavy workloads generate many incoming operations, the queue length increases. On the gateway node, messages or operations are stored in memory and are executed, filtered, or queued based on the operation type. Under heavy workloads, the memory usage on the gateway nodes affects system performance and the network throughput between cache and home filesets or between primary and secondary filesets. When a gateway node fails, demanding recovery or resync operations must run on a new gateway node. When the afmResyncVer2 parameter is enabled, queued messages are replicated to the home or secondary, based on on-demand dependency resolution, as soon as they are ready. Execution of the queue to the home or secondary does not pause on certain dependent operations, because memory is available for more operations after the queues are replicated.
A single-writer (SW) cache fileset or an AFM-DR primary fileset can be configured for the AFM resync version 2 by setting the afmResyncVer2 parameter in the mmchfileset command.
To set or unset the afmResyncVer2 parameter on an AFM or AFM-DR fileset, you must first stop or unlink the fileset. For more information, see Stop and start replication on a fileset.
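The stop, change, and start sequence can be wrapped in a small shell function. This is a minimal sketch: the function name enable_resync_v2 and its error handling are assumptions, and it uses only the mmafmctl and mmchfileset invocations that appear in this topic.

```shell
# Sketch: stop replication, enable AFM resync version 2, restart replication.
# Usage: enable_resync_v2 <filesystem> <fileset>
enable_resync_v2() {
  fs=$1
  fileset=$2
  if [ -z "$fs" ] || [ -z "$fileset" ]; then
    echo "usage: enable_resync_v2 <filesystem> <fileset>" >&2
    return 2
  fi
  # The parameter can be changed only while the fileset is stopped or unlinked.
  mmafmctl "$fs" stop -j "$fileset" || return 1
  if ! mmchfileset "$fs" "$fileset" -p afmResyncVer2=yes; then
    # Assumption: restarting replication after a failed change is the
    # desired fallback, so the fileset is not left stopped.
    mmafmctl "$fs" start -j "$fileset"
    return 1
  fi
  mmafmctl "$fs" start -j "$fileset"
}
```

With the names used in this topic, the call would be enable_resync_v2 fs1 pri.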
AFM internally stores all the information necessary to replay the updates that are made in the cache to the home cluster. When a gateway node fails, the in-memory queue for every fileset that it hosted is lost. Those filesets are transferred to other gateway nodes, and each new gateway node rebuilds the queue in memory. This process is called recovery.
During recovery, outstanding cache updates are placed in the in-memory queue and the gateway node processes the queue. AFM collects the pending operations by running a policy scan on the fileset. AFM uses the policy infrastructure in IBM Storage Scale to engage all the nodes that mount the file system in the scan process. Pending requests that the recovery process discovers are queued in a special queue called the priority queue. At the same time, a normal queue is created for new incoming operations to the fileset.
- Create a file.
- Change the file name.
- Add some data to the file.
- Verify the primary fileset information.
# mmlsfileset fs1 pri --afm -L
The sample output is as follows:
Filesets in file system 'fs1':
Attributes for fileset pri:
============================
Status                            Linked
Path                              /gpfs/fs1/pri
Id                                1
Root inode                        524291
Parent Id                         0
Created                           Thu Feb 11 15:05:23 2021
Comment
Inode space                       1
Maximum number of inodes          100352
Allocated inodes                  100352
Permission change flag            chmodAndSetacl
afm-associated                    Yes
Target                            nfs://c7f2n06/gpfs/fs1/sec
Mode                              primary
Async Delay                       15 (default)
Recovery Point Objective          disable (default)
Last pSnapId                      1
Number of Gateway Flush Threads   4
Primary Id                        2836795238842262449-C0A8693E60255310-1
IO Flags                          0x0 (default)
- Stop the primary fileset.
# mmafmctl fs1 stop -j pri
- Enable the AFM resync version 2 feature.
# mmchfileset fs1 pri -p afmResyncVer2=yes
- Start the primary fileset.
# mmafmctl fs1 start -j pri
- Verify the primary fileset information again.
# mmlsfileset fs1 pri --afm -L
The sample output is as follows:
Filesets in file system 'fs1':
Attributes for fileset pri:
============================
Status                            Linked
Path                              /gpfs/fs1/pri
Id                                1
Root inode                        524291
Parent Id                         0
Created                           Thu Feb 11 15:05:23 2021
Comment
Inode space                       1
Maximum number of inodes          100352
Allocated inodes                  100352
Permission change flag            chmodAndSetacl
afm-associated                    Yes
Target                            nfs://c7f2n06/gpfs/fs1/sec
Mode                              primary
Async Delay                       15 (default)
Recovery Point Objective          disable (default)
Last pSnapId                      1
Number of Gateway Flush Threads   4
Primary Id                        2836795238842262449-C0A8693E60255310-1
IO Flags                          0x10000 (afmResyncVer2)
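The two sample outputs differ only in the IO Flags line, so the verification can be scripted. The following is a minimal sketch. The helper name has_resync_v2 is hypothetical, and it assumes that the flag always appears by name on the IO Flags line, as it does in the sample output.

```shell
# Sketch: read `mmlsfileset <fs> <fileset> --afm -L` output on standard input
# and succeed only if the "IO Flags" line lists afmResyncVer2.
has_resync_v2() {
  awk '/IO Flags/ && /afmResyncVer2/ { found = 1 } END { exit !found }'
}

# Example use on a cluster node (names taken from the steps in this topic):
#   mmlsfileset fs1 pri --afm -L | has_resync_v2 && echo "afmResyncVer2 is enabled"
```

The check matches on the flag name rather than on the hexadecimal value, because other flags might be set in the same field.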