Introduction

Active File Management-based asynchronous disaster recovery (AFM DR) is a fileset-level replication disaster-recovery capability.

The disaster recovery solution addresses the following goals:
  • Providing business stability during a disaster
  • Restoring business stability after the disaster is repaired
  • Enduring multiple disasters
  • Minimizing data loss because of a disaster

AFM DR augments the overall business recovery solution. It uses a one-to-one active-passive model that is represented by two sites: primary and secondary.

The primary site is a read/write fileset where the applications currently run and have read/write access to the data. The secondary site is read-only. All data from the primary site is asynchronously replicated to the secondary site. The primary and secondary sites can be created independently of each other with respect to storage and network configuration. After the sites are created, you can establish a relationship between the two filesets. The primary site remains available to the applications even when the communication link or the secondary site fails. When the connection with the secondary site is restored, the primary site detects the restored connection and asynchronously updates the secondary site.
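The behavior described above can be illustrated with a small toy model. This is only a sketch of the general idea (local writes always succeed, changes are queued, and the queue drains when the link returns). The class `ToyPrimary` and its attributes are hypothetical names invented for this illustration and do not correspond to any AFM interface or to how AFM is implemented.

```python
from collections import deque


class ToyPrimary:
    """Toy model of a primary fileset that queues changes for a secondary."""

    def __init__(self):
        self.data = {}          # primary contents: path -> bytes
        self.secondary = {}     # stand-in for the read-only secondary copy
        self.pending = deque()  # changes not yet applied on the secondary
        self.link_up = True     # hypothetical flag for the connection state

    def write(self, path, content):
        # Writes always succeed locally, even when the link is down.
        self.data[path] = content
        self.pending.append((path, content))

    def flush(self):
        # Apply queued changes in order; do nothing while disconnected.
        if not self.link_up:
            return 0
        applied = 0
        while self.pending:
            path, content = self.pending.popleft()
            self.secondary[path] = content
            applied += 1
        return applied


primary = ToyPrimary()
primary.write("/a", b"v1")
primary.flush()

primary.link_up = False            # the connection or the secondary fails
primary.write("/a", b"v2")         # applications keep working on the primary
primary.write("/b", b"v1")
primary.flush()                    # nothing is sent while disconnected
lag_during_outage = len(primary.pending)

primary.link_up = True             # the connection is restored
primary.flush()                    # queued changes reach the secondary
in_sync = primary.secondary == primary.data
```

In this model the secondary lags by the queued changes during the outage and catches up after the link is restored, which is the property that the paragraph describes.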

The following data is replicated from the primary site to the secondary site:
  • File-user data
  • Metadata including the user-extended attributes except the inode number and atime
  • Hard links
  • Renames

The following file system and fileset-related attributes from the primary site are not replicated to the secondary site:
  • User, group, and fileset quotas
  • Replication factors
  • Dependent filesets

AFM DR can be enabled only on GPFS independent filesets.
Note: An independent fileset that has dependent filesets cannot be converted into an AFM DR fileset.

A consistent view of the data in the primary fileset can be propagated to the secondary fileset by using fileset-based snapshots (psnaps). The Recovery Point Objective (RPO) setting defines how often these snapshots are taken, and alerts can be sent through events when the set RPO cannot be achieved. RPO is disabled by default. The minimum time that you can set as RPO is 60 minutes. AFM-based asynchronous DR can reconfigure the old primary site or establish a new primary site and synchronize it with the current primary site.
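The RPO rules stated above (disabled by default, minimum of 60 minutes) can be sketched as a small validation helper. The function names `validate_rpo` and `rpo_missed` are hypothetical, and the definition of a missed RPO that is used here (more time has passed since the last snapshot than the RPO interval) is an assumption made for illustration. The text does not describe how the product decides when to raise an event.

```python
from datetime import datetime, timedelta

MIN_RPO_MINUTES = 60  # minimum RPO stated in the text


def validate_rpo(minutes):
    """Return None for a disabled RPO, or the interval as a timedelta."""
    if minutes is None:
        return None  # RPO is disabled by default
    if minutes < MIN_RPO_MINUTES:
        raise ValueError(
            f"RPO must be at least {MIN_RPO_MINUTES} minutes, got {minutes}"
        )
    return timedelta(minutes=minutes)


def rpo_missed(last_snapshot, now, rpo):
    """Assumed rule: the RPO is missed when the last snapshot is too old."""
    if rpo is None:
        return False  # nothing to miss while RPO is disabled
    return now - last_snapshot > rpo


rpo = validate_rpo(60)
last = datetime(2024, 1, 1, 0, 0)
on_time = rpo_missed(last, datetime(2024, 1, 1, 0, 45), rpo)  # within interval
late = rpo_missed(last, datetime(2024, 1, 1, 1, 30), rpo)     # past interval
```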

Individual files in the AFM DR filesets can be compressed. Compressing files saves disk space. For more information, see File compression.

Snapshot data migration is also supported. For more information, see ILM for snapshots.

When a disaster occurs on the primary site, the secondary site can be failed over to become the acting primary site. When required, the filesets of the secondary site can be restored to the state of the last consistent RPO snapshot. Applications can then be moved or failed over to the acting primary site, which helps to ensure stability with minimal downtime and minimal data loss. Applications can later be failed back to the primary site once the (new) primary site is at the same level as the acting primary site.
Note: AFM DR does not offer any feature to check consistency of files across primary and secondary sites. However, you can use any third-party utility to check that consistency after files are replicated.
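As one illustration of such an external check, the following sketch compares two directory trees by file content. It is not part of AFM DR. The function names are hypothetical, and it assumes that both the primary and the secondary fileset are reachable as ordinary directory paths from the machine that runs the script.

```python
import hashlib
import os


def tree_digests(root):
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            h = hashlib.sha256()
            with open(full, "rb") as fh:
                # Read in 1 MiB chunks so large files are not loaded at once.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    h.update(chunk)
            digests[rel] = h.hexdigest()
    return digests


def compare_trees(primary_root, secondary_root):
    """Return (missing_on_secondary, extra_on_secondary, content_mismatch)."""
    p = tree_digests(primary_root)
    s = tree_digests(secondary_root)
    missing = sorted(set(p) - set(s))
    extra = sorted(set(s) - set(p))
    mismatch = sorted(k for k in set(p) & set(s) if p[k] != s[k])
    return missing, extra, mismatch
```

Because replication is asynchronous, changes that are still queued on the primary site appear as missing or mismatched files, so a comparison like this is meaningful only after replication has caught up. The sketch reads every file on both sides and compares content only, not metadata.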

You can configure a site for continuous replication of IBM Storage Scale data and, at the same time, configure an AFM DR site. In this setup, IBM Storage Scale continuous replication provides near disaster recovery, and the AFM DR site provides far disaster recovery.