Role reversal
When a planned or unplanned failover occurs, the role reversal process makes the secondary site the acting primary site and the old primary site the secondary site. Role reversal is the recommended way to handle a failover.
When a primary site fails, applications can move to the secondary site after it is promoted to the primary role. The secondary site then acts as the primary site for the applications and can continue in that role until the old primary site is back. When the failed site (the old primary site) is restored, it takes over the role of the secondary site.
In the traditional “Failback to primary” method, when the failed primary site is restored, workloads are transferred from the acting primary site (old secondary site) to the restored primary site. This method is recommended when the rate of change at the primary is not high, and workloads can be easily transferred between two sites.
- When you plan the role reversal, ensure that the primary and secondary sites are synchronized.
- Issue the following command on the secondary site after the primary site failure:
mmafmctl Device failoverToSecondary -j FilesetName [--norestore | --restore]
- After the secondary site is promoted to the primary role, move the applications to the acting primary site.
- After the old primary site is restored, prepare it to become the secondary site.
- If the role reversal is unplanned, check whether the old primary site has dirty files. These are files that were created or modified but were not replicated to the secondary site because of failures.
- Do not delete the dirty files that were available at both sites but were not in sync because of the primary site failure. After the role reversal, the acting primary site overwrites these files on the secondary site (the old primary site) when the site is restored and configured.
- Some dirty files were newly created at the old primary site but were not replicated to the secondary site because of the primary site failure. These files are extra files at the old primary (now the secondary) after the role reversal.
- These files are overwritten if files with the same names are created at the acting primary site and replicated to the secondary (old primary) site.
- If you make the old primary site the primary site again, these newly created files are replicated to the secondary site.
- To prepare the old primary site, do the following steps:
- Unlink the fileset by issuing the following command:
# mmunlinkfileset fs fileset -f
- Disable the role of the fileset at the primary site by running the following command:
# mmchfileset fs fileset -p afmTarget=disable
- Link the fileset back by issuing the following command:
# mmlinkfileset fs fileset -J path_to_fs/fileset
- In an unplanned role reversal, the old primary might have failed while it still had an active queue, so some changes were not replicated from the old primary to the new primary site. In this case, run mmafmctl with the failoverToSecondary parameter, as in the second step of this procedure. If you want to keep the old primary, which is now the reversed secondary site, at the same RPO snapshot level, run the mmrestorefs command to revert the fileset to the RPO snapshot level for application consistency.
# mmrestorefs fs latestRPO -j fileset
Note: If the old primary is not restored to the RPO snapshot level, no data inconsistency results at the file system level or at the fileset level. However, the data might be inconsistent for the application. For example, a file on the old primary has new data changes that are not yet replicated to the old secondary. If the old primary fails before this replication completes, and the old primary is not reverted to the RPO snapshot level during the role reversal, both sites are still synchronized in the following scenarios:
- Some data of the file is replicated to the old secondary, but some data is still being replicated. In this case, establishing the reverse relationship overwrites the file from the reversed primary to the reversed secondary, and both sites are synchronized.
- The file does not exist on the old secondary because it was still in the queue on the old primary when the old primary failed. In this case, one of the following things might happen:
- In the reverse relationship, the reversed primary has a file with the same name. Because the names match, the file from the reversed primary overwrites the file on the reversed secondary, and both sites are synchronized.
- The reverse relationship does not affect the new file, so the reversed primary is not aware that the new file exists on the reversed secondary. When the roles are reversed a second time, the original primary becomes the primary again. Establishing the relationship then synchronizes the new file to the original secondary, and the data in both filesets becomes consistent at the fileset level and in a snapshot.
- Get the primary ID of the fileset at the acting primary site by issuing the following command:
# mmafmctl fs getPrimaryId -j fileset
- Convert the fileset of the old primary site to the fileset of the secondary site by issuing the following command at the old primary:
# mmafmctl fs convertToSecondary -j fileset --primaryid PrimaryID
- Prepare to export this secondary fileset path for use by the acting primary fileset.
- Issue the following command at the acting primary to build the relationship:
# mmafmctl fs changeSecondary -j fileset --new-target oldPrimary:filesetpath --inband
- Remove old snapshots from the old primary because the old snapshots contain old data.
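The command sequence in this procedure can be sketched as a dry-run shell script that only prints each command, so that the order and the site on which each command runs can be reviewed before anything is executed. This is a minimal sketch, not a supported tool. The file system name, fileset name, junction path, host name, and export path are hypothetical placeholders, and PrimaryID and latestRPO stand in for the real getPrimaryId output and RPO snapshot name. Exporting the fileset path and removing old snapshots are left out because this procedure gives no commands for them.

```shell
#!/bin/sh
# Dry-run sketch of the role reversal command sequence.
# Every value below is a hypothetical placeholder.
set -eu

FS="fs1"                           # file system device (hypothetical)
FILESET="drFileset"                # AFM DR fileset name (hypothetical)
JUNCTION="/gpfs/fs1/drFileset"     # junction path of the fileset (hypothetical)
OLD_PRIMARY="oldPrimaryHost"       # host that exports the old primary (hypothetical)
EXPORT_PATH="/gpfs/fs1/drFileset"  # exported fileset path (hypothetical)
RPO_SNAPSHOT="latestRPO"           # stands in for the RPO snapshot name
PRIMARY_ID="PrimaryID"             # stands in for the getPrimaryId output
RESTORE_OPT="--norestore"          # or --restore
UNPLANNED="yes"                    # "yes" adds the optional mmrestorefs step

# Print a command instead of running it.
# Change the body to "$@" to execute the command for real.
run() {
    echo "$*"
}

# Run on the secondary site after the primary site fails.
failover_on_secondary() {
    run mmafmctl "$FS" failoverToSecondary -j "$FILESET" "$RESTORE_OPT"
}

# Run on the old primary site after it is restored.
prepare_old_primary() {
    run mmunlinkfileset "$FS" "$FILESET" -f
    run mmchfileset "$FS" "$FILESET" -p afmTarget=disable
    run mmlinkfileset "$FS" "$FILESET" -J "$JUNCTION"
    # Optional: revert to the RPO snapshot level after an unplanned reversal.
    if [ "$UNPLANNED" = "yes" ]; then
        run mmrestorefs "$FS" "$RPO_SNAPSHOT" -j "$FILESET"
    fi
}

# Run on the acting primary site.
get_primary_id() {
    run mmafmctl "$FS" getPrimaryId -j "$FILESET"
}

# Run on the old primary site, with the ID from get_primary_id.
convert_old_primary() {
    run mmafmctl "$FS" convertToSecondary -j "$FILESET" --primaryid "$PRIMARY_ID"
}

# Run on the acting primary site to build the reverse relationship.
rebuild_relationship() {
    run mmafmctl "$FS" changeSecondary -j "$FILESET" \
        --new-target "${OLD_PRIMARY}:${EXPORT_PATH}" --inband
}

failover_on_secondary
prepare_old_primary
get_primary_id
convert_old_primary
rebuild_relationship
```

Keeping one function per site makes it harder to run a command on the wrong cluster by mistake. The `run` wrapper is the only place that needs to change to switch from printing to executing.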