Failover steps when active site is inaccessible

Disable network access to your failed active data center, and promote your warm-standby data center to active.

About this task

Follow the steps in this procedure when your active site is inaccessible, or the API Connect subsystems in your active site are not responding to apicup operations.

Procedure

  1. Update the network configuration in your failed data center to isolate the API Connect subsystems in this data center, so that they cannot communicate with each other, nor with the warm-standby data center. Network isolation is necessary to prevent a split-brain situation, which occurs if your active site recovers unexpectedly and starts communicating with your other subsystems.

  2. Promote your warm-standby management subsystem to active:
    apicup subsys set <mgmt_subsystem> multi-site-ha-mode=active
    Apply the change with:
    apicup subsys install <mgmt_subsystem> --force-promotion --skip-health-check
    Monitor the progress of the promotion with:
    apicup subsys health-check <mgmt_subsystem>
    When the command returns no output, the promotion to active is complete.
    Use the -v flag to see more information:
    apicup subsys health-check <mgmt_subsystem> -v
  3. Promote your warm-standby portal subsystem to active:
    apicup subsys set <portal_subsystem> multi-site-ha-mode=active
    Apply the change with:
    apicup subsys install <portal_subsystem>
    Monitor the progress of the promotion with:
    apicup subsys health-check <portal_subsystem>
    When the command returns no output, the promotion to active is complete.
    Use the -v flag to see more information:
    apicup subsys health-check <portal_subsystem> -v
  4. Update your dynamic router to redirect all traffic to DC2 instead of DC1.
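
The monitoring in steps 2 and 3 amounts to repeating the health-check until it prints nothing. The following is a minimal sketch of that loop, not part of apicup: the helper name wait_until_quiet, the attempt limit, and the polling interval are all assumptions chosen for illustration.

```shell
# wait_until_quiet MAX_ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
# Hypothetical helper: reruns COMMAND until it produces no output, which
# the procedure above treats as "promotion complete".
# Returns 0 once the output is empty, 1 if MAX_ATTEMPTS is exhausted.
wait_until_quiet() {
  wuq_max=$1
  wuq_interval=$2
  shift 2
  wuq_attempt=1
  while [ "$wuq_attempt" -le "$wuq_max" ]; do
    # Capture stdout and stderr; the exit status is deliberately ignored
    # because the documented success signal is "no output".
    wuq_output=$("$@" 2>&1) || true
    if [ -z "$wuq_output" ]; then
      return 0
    fi
    echo "attempt $wuq_attempt of $wuq_max: not ready yet" >&2
    wuq_attempt=$((wuq_attempt + 1))
    sleep "$wuq_interval"
  done
  return 1
}

# Example call (placeholders as in the steps above; 40 attempts and a
# 30-second interval are arbitrary values):
#   wait_until_quiet 40 30 apicup subsys health-check <mgmt_subsystem>
```

If the helper returns 1, rerun the health-check with the -v flag to see what is still being reported.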

What to do next

If API Connect in your failed data center cannot be recovered, do not leave your remaining data center as a 2DCDR active with no functioning warm-standby. You have the following options:

If you are able to recover your failed data center, before you re-enable network access, set the management and portal subsystems to warm-standby:

  1. Set the multi-site-ha-mode property to passive for the management subsystem in DC1:
    apicup subsys set <DC1 management> multi-site-ha-mode=passive
  2. Apply the update to DC1:
    apicup subsys install <DC1 management> --accept-dr-data-deletion
    Note: When an active management subsystem is converted to warm-standby, the entire contents of its management database are deleted and then replaced with the data from the other data center, which is now the active. The --accept-dr-data-deletion flag acknowledges that you accept this temporary loss of data.
  3. Monitor the progress of the conversion to warm-standby:
    apicup subsys health-check <DC1 management> -v
  4. Set the multi-site-ha-mode property to passive for the portal subsystem in DC1:
    apicup subsys set <DC1 portal> multi-site-ha-mode=passive
  5. Apply the update to DC1:
    apicup subsys install <DC1 portal> --skip-health-check
  6. Monitor the progress of the conversion to warm-standby:
    apicup subsys health-check <DC1 portal> -v
  7. When you have confirmed that both the management and portal subsystems in DC1 are set to warm-standby, re-enable the network to the DC1 API Connect subsystems. The two data centers should then synchronize their API Connect data.
    Monitor the health at both data centers with:
    apicup subsys health-check <DC1 management> -v
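
Because the order of the DC1 commands matters, it can help to review them as a single sequence before running anything. The following is a rough sketch under stated assumptions: the function names run and demote_dc1 and the DRY_RUN variable are hypothetical, and only the apicup commands themselves come from the steps above. By default it prints the commands instead of executing them.

```shell
# Hypothetical wrapper: prints the command when DRY_RUN is 1 (the default
# assumed here), and executes it only when DRY_RUN is set to another value.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# demote_dc1 DC1_MANAGEMENT DC1_PORTAL
# Walks through steps 1 to 6 in order and stops at the first command that
# exits non-zero. Each health-check runs once only; repeat it by hand
# until the conversion to warm-standby is complete.
demote_dc1() {
  dc1_mgmt=$1
  dc1_portal=$2
  run apicup subsys set "$dc1_mgmt" multi-site-ha-mode=passive &&
  run apicup subsys install "$dc1_mgmt" --accept-dr-data-deletion &&
  run apicup subsys health-check "$dc1_mgmt" -v &&
  run apicup subsys set "$dc1_portal" multi-site-ha-mode=passive &&
  run apicup subsys install "$dc1_portal" --skip-health-check &&
  run apicup subsys health-check "$dc1_portal" -v
}

# Example call (subsystem names are placeholders):
#   demote_dc1 <DC1 management> <DC1 portal>
```

Keeping dry-run as the default is a deliberate choice in this sketch, because the management install deletes the DC1 management database and should not run by accident. Re-enable the network only after both subsystems report warm-standby, as described in step 7.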