Failback procedure if new backups were taken after failover to DR site

Procedure

  1. Stop the DR system by running:
    apstop
    Watch ap state -d until the system is stopped.
  2. Unmount the ext_mnt file system on the DR system by running:
    mmunmount ext_mnt -a
    Monitor mmlsmount all -L until ext_mnt is no longer shown as mounted on any node.
  3. Export the GPFS file system to a metadata file by running:
    mmexportfs ext_mnt -o /home/ext_san.config
  4. Use scp to copy the metadata file to the prod system at e1n1:/home/ext_san.config. Then, on the prod system, scp the file to the same directory on all connector nodes so that copies exist on multiple nodes.
  5. Stop the prod system by running:
    apstop
    Watch ap state -d until the system is stopped.
  6. Unmount the ext_mnt file system on the prod system by running:
    mmunmount ext_mnt -a
    Monitor mmlsmount all -L until ext_mnt is no longer shown as mounted on any node.
  7. Delete the ext_mnt file system on the prod system by running:
    mmdelfs ext_mnt
    Watch mmlsfs ext_mnt until the command indicates that ext_mnt is deleted.
  8. Import the file system from the metadata file on the prod system by running:
    mmimportfs ext_mnt -i /home/ext_san.config
  9. Mount the ext_mnt file system on the prod system by running:
    mmmount ext_mnt -a
  10. After mounting the ext_mnt file system on the prod system, run:
    mmlsmount all -L
    Watch until ext_mnt is mounted on all expected nodes (typically five nodes when two connector nodes exist).
  11. Get the prod system online and ready for use by running:
    apstart
    1. Watch ap state -d until the system is online, then verify that ap apps shows VDB as ENABLED. If VDB is not enabled, run:
      ap apps enable vdb
      Watch ap apps until VDB is shown as ENABLED.
    2. Verify that ap node -d shows one of the connector nodes as VDB_MASTER.
    3. ssh to the connector node that is VDB_MASTER. For example, if enclosure7.node1 is VDB_MASTER, then ssh to e7n1.
    4. Enter the NPS® container: docker exec -it ipshost1 bash.
    5. Monitor nzstate -local until it is online.
      Note: The estimated time for nzstate -local to be online is 10 to 45 minutes.
    6. Run nzrestore to restore from the data that was backed up and replicated to the prod site SAN. While nzrestore is in progress, you can proceed with the next step.
  12. Start the DR system by running:
    apstart
    Watch ap state -d until the system is online. After that, it can be monitored for readiness and health.

    With this step, the failback is complete. You can now activate replication from prod to DR and resume production at the production site.
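
Several steps in this procedure ask you to watch a command until the system reaches a given state. The following is a minimal sketch of how that waiting could be scripted instead of done by eye. It is not part of the product: wait_until and ext_mnt_unmounted are hypothetical helper names, the timeout and interval values are arbitrary, and the text pattern 'is not mounted' is an assumption about the wording of the mmlsmount output that you should confirm on your own system before relying on it.

```shell
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND every INTERVAL_SECONDS until it
# exits with status 0. Returns 0 on success, or 1 if TIMEOUT_SECONDS
# elapse before COMMAND succeeds.
wait_until() {
    wu_timeout=$1
    wu_interval=$2
    shift 2
    wu_deadline=$(( $(date +%s) + wu_timeout ))
    until "$@"; do
        if [ "$(date +%s)" -ge "$wu_deadline" ]; then
            echo "timed out after ${wu_timeout}s waiting for: $*" >&2
            return 1
        fi
        sleep "$wu_interval"
    done
    return 0
}

# Hypothetical check for steps 2 and 6: succeeds once mmlsmount reports
# that ext_mnt is not mounted. ASSUMPTION: the output contains the text
# "is not mounted" in that case; verify the wording on your system.
ext_mnt_unmounted() {
    mmlsmount ext_mnt -L 2>/dev/null | grep -q 'is not mounted'
}

# Example use (commented out so that loading this file runs nothing):
#   mmunmount ext_mnt -a
#   wait_until 600 10 ext_mnt_unmounted || exit 1
#   mmexportfs ext_mnt -o /home/ext_san.config
```

Because the check looks for a positive "not mounted" message, a failing or missing mmlsmount command causes the wait to time out rather than report success, so the next step is not started on a false result.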