Preparing the management subsystem for disaster recovery on VMware using SFTP backups (10.0.1.1-eus)

Prepare a Management subsystem for disaster recovery from SFTP backups by taking specific steps before and after the installation of the subsystem.

About this task

  • Perform this task before any disaster event occurs, and before you install a replacement Management subsystem during the recovery process. Best practice is to complete these steps immediately after the initial configuration of the Management subsystem during your original version 10 deployment.
  • This procedure is designed to work with SFTP backup storage only. For S3 backup storage, see Preparing the management subsystem for disaster recovery on VMware using S3 backups (10.0.1.1-eus).
  • Any local backups are presumed to have been lost in the disaster scenario and are non-recoverable in this procedure.
Important: Successful disaster recovery depends on recovering both the Management subsystem and the Developer Portal subsystem, so you must complete the preparation steps for both subsystems. If you have to perform a restore, restore the Management subsystem first, and then immediately restore the Developer Portal. For this reason, the Management and Portal backups must be taken at the same time, to ensure that the Portal sites are consistent with the Management database.

Procedure

  1. Confirm that you have access to the Kubernetes cluster on your Management Appliance:
    1. Log in to a Management Appliance node:
      
      ssh -i ~/.ssh/id_rsa -o CheckHostIP=no apicadm@example.hostname.com
      sudo -i
      
    2. Confirm that you are able to run kubectl commands against your Kubernetes cluster:
      $ kubectl get nodes
      NAME           STATUS   ROLES    AGE     VERSION
      apimdevr0050   Ready    master   6h17m   v1.16.8
      apimdevr0068   Ready    master   6h13m   v1.16.8
      apimdevr0089   Ready    master   6h10m   v1.16.8
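The readiness check above can also be scripted. The following is a minimal sketch, not part of the product tooling: the helper name not_ready_nodes is hypothetical, and it assumes the column layout shown in the sample output (node name first, STATUS second). It reads the node list on standard input and prints the name of every node whose STATUS is not exactly Ready, so a status such as Ready,SchedulingDisabled is also reported.

```shell
#!/bin/sh
# Hypothetical helper: print the names of nodes whose STATUS column is not
# exactly "Ready". Expects `kubectl get nodes --no-headers` output on stdin.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Usage on the appliance (empty output means every node is Ready):
#   kubectl get nodes --no-headers | not_ready_nodes
```

If the command prints nothing, all nodes reported Ready at the time of the check.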
  2. After installation, you must back up essential Kubernetes secrets that are used by the Management subsystem.

    The Kubernetes secrets are created during installation. You must save these in case a restore is needed after a disaster has occurred. At that time, you will use them in the setup of the restored Management subsystem:

    1. Obtain and save the Management database encryption secret.

      Use the following command to find the name of the secret in the status of the Management Subsystem. Replace <namespace> with the namespace used for the subsystem installation:

      kubectl get mgmt -n <namespace> -o yaml | grep encryption

      The output of this command shows the name of the Kubernetes secret storing the database encryption key, for example:

      encryptionSecret: management-enc-key

      In this case, the name of the Management database encryption secret is management-enc-key. Use the following command to back up this secret locally:

      kubectl get secret management-enc-key -n <namespace> -o yaml > management_enc_key.yaml

      The secret is stored in a file called management_enc_key.yaml, located in the present working directory.
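The two commands in this step can be chained so that the secret name does not have to be copied by hand. This is a sketch under assumptions: the helper name encryption_secret_name is hypothetical, and it assumes that the status contains a single-line encryptionSecret: <name> entry like the one in the example output above.

```shell
#!/bin/sh
# Hypothetical helper: read the Management subsystem YAML on stdin and print
# the value of the first single-line "encryptionSecret: <name>" entry.
encryption_secret_name() {
  awk '$1 == "encryptionSecret:" && NF >= 2 { print $2; exit }'
}

# Usage (replace <namespace> as in the commands above):
#   name=$(kubectl get mgmt -n <namespace> -o yaml | encryption_secret_name)
#   kubectl get secret "$name" -n <namespace> -o yaml > management_enc_key.yaml
```

Check that the resulting file is not empty before you rely on it.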

    2. Obtain and save the Management client application credential secrets.

      As part of Installing the Management subsystem, you must store a set of credentials for Disaster Recovery. These credentials are required to ensure that users can log in to API Connect after recovering the database, as these credentials cannot be reconstituted by a new Management subsystem.

      Important: If you do not have these credentials, Disaster Recovery is not possible.
      1. Use the following command to find the names of the secrets in the status of the Management subsystem:
        kubectl get mgmt -n <namespace> -o yaml | grep CredentialSecret

        The output of this command shows the names of the Kubernetes Secrets used to store the various client application credentials. For example:

        atmCredentialSecret: management-atm-cred
        consumerToolkitCredentialSecret: management-ccli-cred
        consumerUICredentialSecret: management-cui-cred
        designerCredentialSecret: management-dsgr-cred
        juhuCredentialSecret: management-juhu-cred
        toolkitCredentialSecret: management-cli-cred
        uiCredentialSecret: management-ui-cred

        Here, for example, the name of the ATM Client Application Credential Secret is management-atm-cred.

      2. Next, back up all of the secrets locally. Use the following command for each listed Credential Secret, replacing <secret_name> with the secret name listed each time:
        kubectl get secret <secret_name> -n <namespace> -o yaml > <secret_name>.yaml

        For example, use the following command to back up the ATM Client Credential Secret:

        kubectl get secret management-atm-cred -n <namespace> -o yaml > management-atm-cred.yaml
      3. After you save each of the client application Credential Secrets locally, open each saved YAML file in turn, remove both the ownerReferences subsection and the selfLink property, and save the file again.

        For example, the ownerReferences and selfLink properties to be removed appear in the YAML files similar to the following:

        ownerReferences:
          - apiVersion: management.apiconnect.ibm.com/v1beta1
            blockOwnerDeletion: true
            controller: true
            kind: ManagementCluster
            name: management
            uid: 623e6b20-7eb8-46ce-94ac-6b64cd71afc4
        selfLink: /api/v1/namespaces/default/secrets/management-atm-cred
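Backing up and editing seven files by hand is error-prone, so the loop can be scripted. The sketch below is illustrative only: the helper names credential_secret_names and strip_owner_metadata are hypothetical, and the awk filter assumes the block-style YAML that kubectl normally emits, where the ownerReferences entries are either list items at the same indentation as the key or lines indented more deeply. Review each resulting file before you rely on it.

```shell
#!/bin/sh
# Hypothetical helper: print the secret names from lines such as
# "atmCredentialSecret: management-atm-cred" read on stdin.
credential_secret_names() {
  awk '$1 ~ /CredentialSecret:$/ && NF >= 2 { print $2 }'
}

# Hypothetical helper: copy secret YAML from stdin to stdout, dropping the
# ownerReferences subsection and the selfLink property.
strip_owner_metadata() {
  awk '
    skipping {
      ind = match($0, /[^ ]/) - 1          # indentation of the current line
      # Still inside ownerReferences: deeper lines, or "-" items at key level.
      if (ind > owner_indent || (ind == owner_indent && $1 == "-")) next
      skipping = 0
    }
    /^ *ownerReferences:/ { owner_indent = match($0, /[^ ]/) - 1; skipping = 1; next }
    /^ *selfLink:/        { next }
    { print }
  '
}

# Usage (replace <namespace> as in the commands above):
#   for s in $(kubectl get mgmt -n <namespace> -o yaml | credential_secret_names); do
#     kubectl get secret "$s" -n <namespace> -o yaml | strip_owner_metadata > "$s.yaml"
#   done
```

The filter leaves every other field, including the metadata uid and the secret data, unchanged.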
    3. Ensure that all of these YAML files that contain the various backed-up Secrets are stored persistently and safely.
      Important: If the files are lost, you cannot restore after a disaster event.
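Before you copy the files to safe storage, it can help to confirm that each one was actually written. The following sketch uses a hypothetical helper, check_secret_backups, which only verifies that each named file exists, is not empty, and contains a top-level kind: Secret line. It does not validate the secret data itself.

```shell
#!/bin/sh
# Hypothetical helper: return non-zero if any named backup file is missing,
# empty, or does not look like a Kubernetes Secret manifest.
check_secret_backups() {
  status=0
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "MISSING OR EMPTY: $f" >&2
      status=1
    elif ! grep -q '^kind: Secret$' "$f"; then
      echo "NOT A SECRET MANIFEST: $f" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage, with the file names from the earlier examples:
#   check_secret_backups management_enc_key.yaml management-atm-cred.yaml
```

A zero exit status means that every listed file passed these basic checks.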
    4. Obtain the endpoints of your Management subsystem.

      In Disaster Recovery, the endpoints of the new Management installation must match those of the old Management installation. Use the following commands to display the current endpoints:

      kubectl get managementcluster management -o jsonpath="{..apiManagerEndpoint.hosts[0].name}"
      
      kubectl get managementcluster management -o jsonpath="{..cloudManagerEndpoint.hosts[0].name}"
      
      kubectl get managementcluster management -o jsonpath="{..consumerAPIEndpoint.hosts[0].name}"

      Save this output to a safe place.
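The three endpoint queries can be collected into one file. This is a sketch only: the helper names endpoint_jsonpath and save_endpoints and the key=value output format are assumptions made for illustration, while the kubectl command and JSONPath expressions are the ones shown above.

```shell
#!/bin/sh
# Hypothetical helper: build the JSONPath expression used above for one
# endpoint key, for example apiManagerEndpoint.
endpoint_jsonpath() {
  printf '{..%s.hosts[0].name}' "$1"
}

# Hypothetical helper: write "<endpoint key>=<host name>" lines to the file
# named by the first argument, one line per Management endpoint.
save_endpoints() {
  out_file=$1
  : > "$out_file"
  for ep in apiManagerEndpoint cloudManagerEndpoint consumerAPIEndpoint; do
    host=$(kubectl get managementcluster management \
      -o jsonpath="$(endpoint_jsonpath "$ep")") || return 1
    printf '%s=%s\n' "$ep" "$host" >> "$out_file"
  done
}

# Usage:
#   save_endpoints management_endpoints.txt
```

Store the resulting file together with the backed-up secrets.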

    5. Take note of the following values in the Management subsystem CR. You will need them to restore the Management subsystem.
      Table 1. Management subsystem CR settings

      Setting                      Example value
      name                         management
      siteName                     82b290a2
      databaseBackup.protocol      sftp
      databaseBackup.host          <SFTP-host-name>
      databaseBackup.path          <SFTP-path>
      databaseBackup.schedule      0 3 * * *
      databaseBackup.credentials   secret-containing-SFTP-credentials

      • The name and siteName values together form the name of the database cluster, such as management-82b290a2-postgres. This name is needed to synchronize the backups from your old Management subsystem to your new one.
      • The databaseBackup.host and databaseBackup.path settings must be identical to those configured on your old Management subsystem. This is the location of the old Management subsystem backups from which you will recover.
      • Take note of databaseBackup.schedule. The schedule will be included during the restoration.
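To illustrate how the database cluster name follows from the recorded values, here is a small sketch. The helper name db_cluster_name is hypothetical, and the <name>-<siteName>-postgres pattern is inferred from the single example above, so treat it as an assumption rather than a documented rule.

```shell
#!/bin/sh
# Hypothetical helper: compose the database cluster name from the CR name and
# siteName, following the pattern of the example management-82b290a2-postgres.
db_cluster_name() {
  printf '%s-%s-postgres\n' "$1" "$2"
}

# Example with the values from Table 1:
#   db_cluster_name management 82b290a2
#
# Saving the whole CR next to your notes keeps every setting together:
#   kubectl get managementcluster management -o yaml > management_cr.yaml
```

With the Table 1 values, the helper prints management-82b290a2-postgres.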
  3. Make sure that you have a backup that can be used in case of a disaster event:
  4. Optional: Take a Virtual Machine (VM) snapshot of all your VMs; see Using VM snapshots for infrastructure backup and disaster recovery for details. This action requires a brief outage while all of the VMs in the subsystem cluster are shut down. Do not take snapshots of running VMs, because they might not restore successfully. VM snapshots can offer a faster recovery than redeploying OVAs and restoring from normal backups.
    Important: VM snapshots are not an alternative to the standard backups that are described in the previous step, and which must be taken in order to use the API Connect restore feature.

What to do next

You should now complete the preparation steps for the Developer Portal subsystem; see Preparing the Developer Portal subsystem for disaster recovery on VMware.