Role reversal

When a planned or unplanned failover occurs, the role reversal process converts the secondary site into the acting primary site and the old primary site into the secondary site. Role reversal is the recommended way to handle a failover.

When a primary site fails, applications can move to the secondary site after it is promoted to the primary role. The secondary site then acts as the primary site for the applications and can continue in that role until the old primary site is back. When the failed site (the old primary site) is restored, it takes over the role of the secondary site.

In the traditional “Failback to primary” method, when the failed primary site is restored, workloads are transferred from the acting primary site (old secondary site) to the restored primary site. This method is recommended when the rate of change at the primary is not high, and workloads can be easily transferred between two sites.

In the role reversal method, the secondary site permanently acts as the primary site. After the old primary site is restored, it is designated as the secondary site. The role reversal method is recommended when the rate of change at the primary is high and the filesets are large. If you adopt the role reversal method, ensure that you remove extra files and old snapshots from the new secondary site.
Important: For a planned or unplanned failure, it is recommended to use the role reversal method instead of failover and failback methods. The failover and failback methods will be deprecated soon.
To reverse the roles, complete the following steps:
  1. When you plan the role reversal, ensure that the primary and secondary sites are in synchronization.
  2. Issue the following command on the secondary site after the primary site failure:
     mmafmctl Device failoverToSecondary -j FilesetName [--norestore | --restore]
  3. After the secondary site is promoted to the primary site, move the applications to the acting primary site.
  4. After the old primary site is restored, prepare it to take over the secondary role.
  5. If the role reversal is unplanned, check whether the old primary site has dirty files. Dirty files are files that were created or modified but were not replicated to the secondary site because of the failure.
    • Do not delete the dirty files that were available at both sites but were not in sync because of the primary site failure. After the role reversal, the acting primary site overwrites these files on the secondary site (the old primary site) when the site is restored and configured.
    • Dirty files that were newly created at the old primary site but not replicated to the secondary site because of the primary site failure remain as extra files at the old primary (now the secondary) after the role reversal.
      • These files are overwritten if files with the same names are created at the acting primary site and replicated to the secondary (old primary) site.
      • If you later make the old primary site the primary site again, these files are replicated to the secondary site.
  6. To prepare the old primary site, do the following steps:
    1. Unlink the fileset by issuing the following command:
      # mmunlinkfileset fs fileset -f
    2. Disable the role of the fileset at the old primary site by running the following command:
      # mmchfileset fs fileset -p afmTarget=disable
    3. Link back the fileset by issuing the following command:
      # mmlinkfileset fs fileset -J path_to_fs/fileset
    4. In an unplanned role reversal, the old primary might have failed while its queue was still active, so some changes were not yet replicated from the old primary to the new primary site. In this case, run the mmafmctl command with failoverToSecondary as described in step 2. If you want to keep the old primary, which is now the reversed secondary site, at the same RPO snapshot level, run the mmrestorefs command to revert the fileset to the RPO snapshot level for application consistency.
      # mmrestorefs fs latestRPO -j fileset
      Note: If the old primary is not restored, no data inconsistency results at the file system level or at the fileset level, but the data might be inconsistent for the application. For example, a file on the old primary has new data changes that are not yet replicated to the old secondary. If the old primary fails before this replication completes, and the old primary is not reverted to the RPO snapshot level during the role reversal, both sites are still synchronized eventually, as described in the following scenarios:
      • Some data of the file is replicated to the old secondary, but some data is still being replicated. In this case, establishing the reverse relationship overwrites the file from the reversed primary to the reversed secondary, and both sites are synchronized.
      • The file does not exist on the old secondary. It was still in a queue on the old primary when the old primary failed. In this case, one of the following things might happen:
        • In the reverse relationship, the reversed primary has a file with the same name. Because the names match, the file from the reversed primary overwrites the file on the reversed secondary, and both sites are synchronized.
        • The reverse relationship does not affect the new file, so the reversed primary does not know that the new file is on the reversed secondary. When the roles are reversed a second time and the original primary becomes the primary again, establishing the relationship synchronizes the new file to the original secondary, and the data of both filesets becomes consistent at the fileset level and on a snapshot.
  7. Get the primary ID of the fileset at the acting primary site by issuing the following command:
    # mmafmctl fs getPrimaryId -j fileset
  8. Convert the fileset of the old primary site to the fileset of the secondary site by issuing the following command at the old primary:
    # mmafmctl fs convertToSecondary -j fileset --primaryid PrimaryID
  9. Prepare to export this secondary fileset path for use by the acting primary fileset.
  10. Issue the following command at the acting primary to build the relationship:
    # mmafmctl fs changeSecondary -j fileset --new-target oldPrimary:filesetpath --inband
  11. Remove old snapshots from the old primary because the old snapshots contain old data.
The old secondary site is now the primary site and the old primary site is the secondary site. The roles of the two sites are interchanged and the relationship is restored.
Note: Use the role reversal method if filesets are large and the rate of change at the primary site is high.
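The order of the commands in the preceding steps, and the site where each one runs, can be summarized with a small dry-run sketch. The script only prints the commands and does not run them. The file system name fs1, the fileset name drFileset, the paths, and the function names are hypothetical placeholders, and the choice of --restore in the sample call is illustrative only.

```shell
#!/bin/sh
# Dry-run sketch: prints the role reversal commands from the steps above,
# grouped by the site where each command is issued. Nothing is executed.
# All values below are hypothetical placeholders, not product defaults.
FS="fs1"                                   # file system device (assumed)
FILESET="drFileset"                        # fileset name (assumed)
LINK_PATH="/gpfs/fs1/drFileset"            # junction path (assumed)
NEW_TARGET="oldPrimary:/gpfs/fs1/drFileset" # oldPrimary:filesetpath (assumed)

# Step 2: issued on the secondary site after the primary site fails.
# $1 must be one of the two options shown in the command syntax.
promote_secondary() {
    case "$1" in
        --restore|--norestore)
            echo "mmafmctl $FS failoverToSecondary -j $FILESET $1" ;;
        *)
            echo "usage: promote_secondary --restore|--norestore" >&2
            return 1 ;;
    esac
}

# Step 6, substeps 1 to 3: issued on the old primary after it is restored.
prepare_old_primary() {
    echo "mmunlinkfileset $FS $FILESET -f"
    echo "mmchfileset $FS $FILESET -p afmTarget=disable"
    echo "mmlinkfileset $FS $FILESET -J $LINK_PATH"
}

# Step 8: issued on the old primary. $1 is the ID reported in step 7.
convert_old_primary() {
    echo "mmafmctl $FS convertToSecondary -j $FILESET --primaryid $1"
}

# Prints the whole sequence in order, with a label for each site.
print_plan() {
    echo "[secondary site]"
    promote_secondary "$1" || return 1
    echo "[old primary, after it is restored]"
    prepare_old_primary
    echo "[old primary, optional revert to the RPO snapshot level]"
    echo "mmrestorefs $FS latestRPO -j $FILESET"
    echo "[acting primary]"
    echo "mmafmctl $FS getPrimaryId -j $FILESET"
    echo "[old primary]"
    convert_old_primary "PrimaryID"
    echo "[acting primary]"
    echo "mmafmctl $FS changeSecondary -j $FILESET --new-target $NEW_TARGET --inband"
}

print_plan --restore
```

Printing the plan first makes it easier to review which commands belong to which site before anything is issued. Replace the literal PrimaryID in the printed convertToSecondary line with the value that getPrimaryId reports at the acting primary.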