Returning production to Site A after planned and unplanned outages (failback)

Returning production to a site is called a recovery failback. After Site A is restored, you can schedule a failback operation to synchronize data and resume production.

Before you begin

Before you run a failback operation, you must create paths from Site B to Site A between the specific LSSs.

About this task

If Site A is operational and connectivity from Site B to Site A is available, you can use this procedure to restart production with Site B volumes. See Table 1 for an example of failover and failback operations.
Note: The procedure to move production back to the local site (Site A) is identical for planned and unplanned outages.
The failback operation synchronizes the volumes in the following manner, depending on the volume state:
  • If a volume at Site A is in simplex state, all of the data for that volume is sent from Site B to Site A.
  • If a volume at Site A is in full-duplex or suspended state and has no changed tracks, only the modified data on the volume at Site B is sent to the volume at Site A.
  • If a volume at Site A is in suspended state and has tracks that were modified, the volume at Site B discovers which tracks were modified at either site and sends the following data from Site B to Site A:
    • The tracks that were changed on Site A.
    • The tracks that were marked at Site B.
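The three cases above can be sketched as a small decision function. This is only an illustrative model, not the storage system's implementation: the names `SiteAState` and `tracks_to_send` and the representation of tracks as sets of integers are assumptions made for the example.

```python
from enum import Enum


class SiteAState(Enum):
    """State of the Site A volume when the failback starts (hypothetical names)."""
    SIMPLEX = "simplex"
    FULL_DUPLEX = "full-duplex"
    SUSPENDED = "suspended"


def tracks_to_send(site_a_state, all_tracks, changed_at_a, marked_at_b):
    """Return the set of tracks copied from Site B to Site A during failback."""
    if site_a_state is SiteAState.SIMPLEX:
        # Simplex volume at Site A: all of the data is sent.
        return set(all_tracks)
    if not changed_at_a:
        # Full-duplex or suspended, no changed tracks at Site A:
        # only the data modified at Site B is sent.
        return set(marked_at_b)
    if site_a_state is SiteAState.SUSPENDED:
        # Suspended with changed tracks: tracks changed at Site A
        # plus tracks marked at Site B.
        return set(changed_at_a) | set(marked_at_b)
    # The cases listed above do not cover a full-duplex volume with changed tracks.
    raise ValueError("combination not covered by the described cases")
```

For example, with a suspended Site A volume, `changed_at_a={5}` and `marked_at_b={1, 2}`, the function returns the union of both sets, which mirrors the third case in the list.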
The following scenario typically applies:
  • Paths from Site B to Site A are created.
  • Remote mirror and copy volume pairs are created. Site B volume is the source volume of the failback operation. This volume was initially the target volume of the relationship.
Note: Recovery failback operations are initiated after a failover operation. A recovery failback can operate on any remote mirror and copy volume that is in a primary suspended state. The operation copies the required data from the source volume to the target volume to resume mirroring, either from the local site to the remote site or in the reverse direction. The target volume can be in simplex state.

Procedure

Complete the following steps to switch back to a restored site:

  1. Run a recovery failback operation with Site B volumes.
    This process copies all changed tracks from the target volumes back to the source volumes and copies over any tracks that were modified on the original source volumes.
  2. Before operations are normalized, quiesce the applications that are still updating volumes at Site B so that all write I/O to the source volumes stops.
    Note: On some host systems, such as AIX® and Linux®, applications that access FlashCopy source volumes must be quiesced before FlashCopy operations are run. The source volumes must then be unmounted while FlashCopy is established. This action ensures that no data that could corrupt the target volumes remains in the buffers. Whether the source volumes must be unmounted depends on the host operating system.
  3. From Site A, run a recovery failover with the source volumes.
    This process converts the full-duplex target volumes at Site A to suspended source volumes. The volumes at Site A start the change recording process while in failover mode.
  4. Depending on your operating system, it might be necessary to rescan Fibre Channel devices and mount the new source volumes at Site A.
  5. From Site A, run another recovery failback with the source volumes. This process synchronizes the volumes at Site A with volumes at Site B.
  6. Mount your volumes at Site A and start your applications on your primary server.
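
The order of the steps above can be traced with a toy state model of a single volume pair. This is a sketch for following the role changes only: it does not call any storage system interface, and the `Pair` class and the `failback`, `failover`, and `return_to_site_a` functions are hypothetical names introduced for the illustration. It assumes that the pair starts with suspended source volumes at Site B, as left by an earlier failover to Site B.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pair:
    source: str                 # site whose volumes act as the source
    target: str                 # site whose volumes act as the target
    state: str                  # "suspended" or "full-duplex"
    production: Optional[str]   # site running applications, None if quiesced


def failback(pair: Pair, from_site: str) -> None:
    """Resynchronize from suspended source volumes; the pair becomes full duplex."""
    if pair.source != from_site or pair.state != "suspended":
        raise ValueError("failback expects suspended source volumes at the site")
    pair.state = "full-duplex"


def failover(pair: Pair, at_site: str) -> None:
    """Convert full-duplex target volumes at the site to suspended source volumes."""
    if pair.target != at_site or pair.state != "full-duplex":
        raise ValueError("failover expects full-duplex target volumes at the site")
    pair.source, pair.target = at_site, pair.source
    pair.state = "suspended"


def return_to_site_a() -> Pair:
    # Assumed starting point: production runs at Site B after an earlier failover.
    pair = Pair(source="B", target="A", state="suspended", production="B")
    failback(pair, "B")      # step 1: failback with Site B volumes
    pair.production = None   # step 2: quiesce applications at Site B
    failover(pair, "A")      # step 3: failover from Site A
    # step 4 (rescan and mount at Site A) is host-specific and not modeled
    failback(pair, "A")      # step 5: failback from Site A
    pair.production = "A"    # step 6: start applications at Site A
    return pair
```

Calling `return_to_site_a()` in this model ends with Site A as the source, Site B as the target, the pair in full-duplex state, and production running at Site A. The guards in `failback` and `failover` reject out-of-order calls, which reflects why the steps are run in the listed sequence.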