Recovering from a failover of a two data center deployment
What to do after completing a 2DCDR failover.
Ensure that you have read and understand the concepts of 2DCDR. See Two data center deployment strategy on Kubernetes and OpenShift and Key concepts of 2DCDR and failure scenarios.
This topic describes what to do after you have completed a failover of your 2DCDR deployment, as described in How to failover API Connect from the active to the warm-standby data center.
If, after the failover operation, the failed data center was successfully updated to warm-standby, then verify that replication is working: Verifying replication between data centers. If replication is working, you can either:
- Revert your 2DCDR deployment to the original active and warm-standby data center designations. To revert your deployment, follow the same failover steps: How to failover API Connect from the active to the warm-standby data center.
- Do nothing, and continue with your current active and warm-standby data center designations.
If your failed data center could not be updated to warm-standby, then ensure that the network links between the data centers are disabled. If the network links remain enabled, a split-brain could occur if your failed data center recovers before you are able to set it to warm-standby.
When you are able to recover the failed data center, ensure that API Connect is set to warm-standby before restoring the network connectivity to the active data center.
If you expect your failed data center to be down for a long time, then convert your active data center to a stand-alone deployment:
- Remove the multiSiteHA section from the spec sections of your management and portal CRs.
- If API Connect is still working on the recovered data center, re-enable 2DCDR as follows:
  - Add the multiSiteHA section back to your working stand-alone deployment. Ensure that it is set to active.
  - Ensure that the multiSiteHA section is set to warm-standby (passive) on the recovered data center.
  - Restore the network links between data centers.
- If the API Connect installation on the failed data center is not recoverable, then reinstall API Connect on this data center, and re-enable 2DCDR as follows:
  - Add the multiSiteHA section back to your working stand-alone deployment. Ensure that it is set to active.
  - Ensure that the multiSiteHA section is set to warm-standby (passive) on the recovered data center.
  - Copy the ingress-ca X.509 certificate from the active data center and apply it on your reinstalled warm-standby data center. For more information, see Check the ingress-ca X.509 certificates match.
  - Restore the network links between data centers.
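As a rough illustration of the re-enable steps above, the following fragments sketch how the multiSiteHA section might look in the management or portal CR on each data center. The property name mode is an assumption that is not stated in this topic; the values active and passive come from the steps above. The other multiSiteHA properties are deliberately left out, so restore them from the section that you removed when you converted to stand-alone rather than copying this sketch.

```yaml
# Working data center (previously converted to stand-alone).
# "mode" is an assumed property name; confirm it against your own CR.
spec:
  multiSiteHA:
    mode: active
    # Restore the remaining multiSiteHA properties that you removed earlier.
---
# Recovered or reinstalled data center (warm-standby).
spec:
  multiSiteHA:
    mode: passive
    # Remaining multiSiteHA properties for this site go here.
```

Apply the warm-standby setting on the recovered data center before you restore the network links, so that both data centers are never active at the same time.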
Note: If your original active data center was in a failed state for some time, then when it is recovered to warm-standby state, it will take time for the data from your active data center to replicate across. The time that is taken depends on the size of your management and portal databases and how many changes were made to them while your original active data center was in a failed state.