Introduction

AFM-based asynchronous disaster recovery (AFM DR) is a fileset-level replication disaster-recovery capability.

Important: Initial feedback from the field suggests that the success of a disaster recovery solution depends on administrative discipline, including careful design, configuration, and testing. For this reason, IBM® has disabled the Active File Management-based Asynchronous Disaster Recovery (AFM DR) feature by default and requires that customers deploying AFM DR first review their deployments with IBM Spectrum Scale™ development. Contact IBM Spectrum Scale Support at scale@us.ibm.com to have your use case reviewed. IBM will help optimize your tuning parameters and enable the feature. Include this message when you contact IBM Support.

For more information, see Flash (Alert): IBM Spectrum Scale (GPFS™) V4.2 and V4.1.1 AFM Async DR requirement for planning.

The disaster recovery solution includes:
  • providing business stability during a disaster
  • restoring business stability after the disaster has been repaired
  • enduring multiple disasters
  • minimizing data loss in the event of a disaster

AFM DR augments the overall business recovery solution. It follows a one-to-one active-passive model with two sites: primary and secondary.

The primary site is a read-write fileset where applications currently run and have read-write access to the data. The secondary site is read-only. All data from the primary site is replicated asynchronously to the secondary site. The primary and secondary sites can be created independently of each other, each with its own storage and network configuration. After the sites are created, you can establish a relationship between the two filesets. The primary site remains available to applications even if communication with the secondary site fails or the secondary site itself fails. When the connection with the secondary site is restored, the primary site detects the restored connection and asynchronously updates the secondary site.
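The behavior described above (local writes always succeed, and updates reach the secondary later) can be illustrated with a toy in-memory model. This is only a sketch of the general asynchronous active-passive idea; the class names `Primary` and `Secondary`, the `pending` queue, and the `flush` method are hypothetical and do not reflect how AFM DR is implemented internally.

```python
from collections import deque


class Secondary:
    """Read-only replica: it changes only when the primary replays updates."""

    def __init__(self):
        self.files = {}

    def apply(self, op, path, data=None):
        if op == "write":
            self.files[path] = data
        elif op == "remove":
            self.files.pop(path, None)


class Primary:
    """Read-write site: applies changes locally, then queues them for replication."""

    def __init__(self, secondary):
        self.files = {}
        self.secondary = secondary
        self.connected = True
        self.pending = deque()  # updates not yet applied on the secondary

    def write(self, path, data):
        self.files[path] = data  # the local write never waits for the secondary
        self.pending.append(("write", path, data))

    def remove(self, path):
        self.files.pop(path, None)
        self.pending.append(("remove", path, None))

    def flush(self):
        """Replay queued updates in order; do nothing while the link is down."""
        if not self.connected:
            return 0
        sent = 0
        while self.pending:
            op, path, data = self.pending.popleft()
            self.secondary.apply(op, path, data)
            sent += 1
        return sent


secondary = Secondary()
primary = Primary(secondary)

primary.connected = False            # simulate a communication failure
primary.write("/data/a.txt", b"v1")  # applications keep working on the primary
primary.write("/data/a.txt", b"v2")
primary.flush()                      # nothing is sent while disconnected
lag_during_outage = len(primary.pending)

primary.connected = True             # link restored
replayed = primary.flush()           # the secondary catches up
```

In this model, the length of `pending` corresponds to the data that would be lost if the primary failed before the next flush, which is why the replication lag matters for recovery planning.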

The following data is replicated from the primary site to the secondary site:
  • File-user data
  • Metadata, including user-extended attributes, except the inode number and atime
  • Hard links
  • Renames

The following file system and fileset-related attributes from the primary site are not replicated to the secondary:
  • User, group, and fileset quotas
  • Replication factors
  • Dependent filesets

A consistent view of the data in the primary fileset can be propagated to the secondary fileset by using fileset-based snapshots (psnaps). The Recovery Point Objective (RPO) defines how often these snapshots are taken, and alerts can be sent through events when the set RPO cannot be achieved. RPO is disabled by default. The minimum RPO that you can set is 720 minutes. AFM DR can reconfigure the old primary site or establish a new primary site and synchronize it with the current primary site.
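The RPO rules stated above (disabled by default, minimum of 720 minutes, alert when the interval is exceeded) can be expressed as a small check. The following is a minimal sketch for planning or monitoring scripts; the function names `validate_rpo` and `rpo_missed` are hypothetical, and the product's own logic for deciding when to raise an event is not described in this topic.

```python
from datetime import datetime, timedelta

MIN_RPO_MINUTES = 720  # minimum RPO stated in the text (12 hours)


def validate_rpo(minutes):
    """Return the RPO as a timedelta, or None when RPO is disabled."""
    if minutes is None:
        return None  # RPO is disabled by default
    if minutes < MIN_RPO_MINUTES:
        raise ValueError(
            f"RPO must be at least {MIN_RPO_MINUTES} minutes, got {minutes}"
        )
    return timedelta(minutes=minutes)


def rpo_missed(last_snapshot, now, rpo):
    """True when the newest snapshot is older than the RPO interval."""
    if rpo is None:
        return False  # nothing to miss while RPO is disabled
    return now - last_snapshot > rpo


rpo = validate_rpo(720)
last = datetime(2024, 1, 1, 0, 0)  # illustrative timestamp of the last psnap
missed_after_11h = rpo_missed(last, datetime(2024, 1, 1, 11, 0), rpo)
missed_after_13h = rpo_missed(last, datetime(2024, 1, 1, 13, 0), rpo)
```

With a 720-minute RPO, a snapshot that is 11 hours old is still within the objective, while one that is 13 hours old is not.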

Individual files in the AFM DR filesets can be compressed. Compressing files saves disk space. For more information, see File compression.

Snapshot data migration is also supported. For more information, see ILM for snapshots.

In the event of a disaster at the primary site, the secondary site can be failed over to become the primary site. When required, the filesets of the secondary site can be restored to the state of the last consistent RPO snapshot. Applications can be moved or failed over to the acting primary site, which helps to ensure stability with minimal downtime and minimal data loss. Applications can eventually be failed back to the primary site as soon as the (new) primary is on the same level as the acting primary.
Note: AFM DR does not offer any feature to check consistency of files across primary and secondary. However, you can use any third-party utility to check consistency after files are replicated.
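As an illustration of the kind of external check the note refers to, the following sketch compares two directory trees by SHA-256 digest. It is not an IBM tool. It assumes that both filesets are reachable as local paths from the host that runs it and that replication has finished, since files still in flight would be reported as differences. The function names and the temporary directories in the demonstration are hypothetical.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each regular file's path (relative to root) to its digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full):
                digests[os.path.relpath(full, root)] = file_digest(full)
    return digests


def compare_trees(primary_root, secondary_root):
    """Report files that are missing, extra, or different on the secondary."""
    p = tree_digests(primary_root)
    s = tree_digests(secondary_root)
    return {
        "missing_on_secondary": sorted(p.keys() - s.keys()),
        "extra_on_secondary": sorted(s.keys() - p.keys()),
        "content_mismatch": sorted(k for k in p.keys() & s.keys() if p[k] != s[k]),
    }


# Demonstration with two temporary directories standing in for the filesets.
with tempfile.TemporaryDirectory() as primary_dir, \
        tempfile.TemporaryDirectory() as secondary_dir:
    for root in (primary_dir, secondary_dir):
        with open(os.path.join(root, "a.txt"), "wb") as f:
            f.write(b"same")
    with open(os.path.join(primary_dir, "b.txt"), "wb") as f:
        f.write(b"only on primary")
    report = compare_trees(primary_dir, secondary_dir)
```

The comparison covers file contents only. Attributes that are not replicated, such as quotas and replication factors, are outside its scope.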