Restoring the management subsystem on VMware after a disaster (10.0.1.0-eus)

Restore the management subsystem after a disaster event.

Before you begin

In a disaster recovery scenario that requires installing a new Management subsystem, the following steps guide you through recovering the data from your previous Management subsystem onto the new one.

This task should be performed before the install of the new Management subsystem.

Note:

This procedure is designed to work with SFTP backup storage only. Any backups not in remote storage prior to disaster are presumed to have been lost and are non-recoverable in this procedure.

Important: Successful disaster recovery depends on recovery of both the Management subsystem and the Developer Portal subsystem. You must complete the preparation steps for both subsystems. If you have to perform a restore, you must complete the restoration of the Management subsystem first, and then immediately restore the Developer Portal. For this reason, the backups of the Management and Portal subsystems must be taken at the same time, so that the Portal sites are consistent with the Management database.

Procedure

  1. Locate the saved credentials for the Management subsystem:
    Warning:
    • If you do not have the credentials, Disaster Recovery is not possible.
    • During installation of API Connect, you were required to store a set of credentials for Disaster Recovery. These credentials are required in this restore procedure to ensure that users can log in to API Connect after the database is recovered. Note that these credentials cannot be reconstituted by a new Management subsystem.
    • The Kubernetes Secrets for the following credentials should have been exported and saved as YAML files as part of the backup procedure before the disaster event. See Preparing the management subsystem for disaster recovery on VMware (10.0.1.0-eus).

      The set of Management Client Application Credentials required are as follows:

      • atm-cred (Test & Monitor Credential)
      • ccli-cred (Consumer Toolkit Credential)
      • cli-cred (Toolkit Credential)
      • cui-cred (Consumer UI Credential)
      • dsgr-cred (Designer Credential)
      • juhu-cred (Juhu Credential)
      • ui-cred (UI Credential)
    • The Management Database Encryption Secret is also required. It should also have been exported and saved as a YAML file as part of the backup procedure before the disaster event. See Preparing the management subsystem for disaster recovery on VMware (10.0.1.0-eus).
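Before continuing, it can help to confirm that every saved Secret file is actually at hand. The following is a minimal sketch, not part of the product: the function name missing_secret_files is hypothetical, and it assumes that each Secret was saved as <name>.yaml in a single directory, with the encryption secret saved as encryption-bin-secret.yaml. Adjust the names to match how you saved your files.

```shell
# Hypothetical helper: print the name of every expected Secret file that is
# missing or empty in the given directory. No output means all are present.
# Assumption: files are named <credential>.yaml plus encryption-bin-secret.yaml.
missing_secret_files() {
  dir=$1
  for name in atm-cred ccli-cred cli-cred cui-cred dsgr-cred juhu-cred ui-cred \
              encryption-bin-secret; do
    # -s is true only when the file exists and is not empty
    if [ ! -s "$dir/$name.yaml" ]; then
      echo "$name.yaml"
    fi
  done
}
```

For example, running missing_secret_files ./dr-secrets (where ./dr-secrets is a placeholder directory) lists any file that still has to be located before Disaster Recovery can proceed.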
  2. Determine which backup to use when restoring.

    Ensure you have a backup available in remote storage to restore from. Restoring your Management subsystem's data requires a backup containing the data you wish to restore.

    Note:

    Avoid restoring to the initial system backup. During installation, the Management database takes an initial system backup before certain database schema jobs are complete. Restoring to this backup results in an unstable system.

    For more information on remote SFTP backups, see Using an sftp server for backup files on VMware.

    List the available backup files. For example:

    $ /sftp/backup.sh -L -H $HOST -u $USERNAME -p $PASSWORD -d $DIRECTORY
    drwxr-xr-x    2 root     root           90 Aug 26 09:21 .
    drwxrwxrwt   10 root     root          198 Aug 26 09:45 ..
    -rw-r--r--    1 root     root     13092333 Aug 26 08:56 20200826-154646F.tgz
    -rw-r--r--    1 root     root     18703758 Aug 26 09:10 20200826-160010F.tgz
    -rw-r--r--    1 root     root     24318561 Aug 26 09:21 20200826-161301F.tgz

    Take note of the filename of the backup you wish to restore to. Each filename contains the date, time, and type of the backup stored. This will be used later in the procedure.
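As an illustration of how a backup filename breaks down, the sketch below splits a name such as 20200826-154646F.tgz into its parts. The function name parse_backup_name is hypothetical, and the layout (date, a dash, time, then a single trailing type letter) is inferred from the example listing above rather than from a published format.

```shell
# Hypothetical helper: split a backup file name into ID, date, time, and type.
# Assumption: names look like <date>-<time><type-letter>.tgz, as in the listing.
parse_backup_name() {
  id=${1%.tgz}                    # strip the extension   -> 20200826-154646F
  date_part=${id%%-*}             # text before the dash  -> 20200826
  rest=${id#*-}                   # text after the dash   -> 154646F
  time_part=${rest%?}             # drop the last char    -> 154646
  type_part=${rest#"$time_part"}  # the last char only    -> F
  echo "id=$id date=$date_part time=$time_part type=$type_part"
}
```

The id value is the part without the .tgz extension. It matches the form of the ID column shown for the ManagementBackup later in this procedure.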

  3. Determine the old Management subsystem database name

    Take note of the following values from the old Management subsystem CR. They are needed to recover the data from your old Management subsystem.

    Table 1. Management database values
    Value      Example
    name       management
    siteName   82b290a2

    The Management subsystem name and the siteName property together form the name of the database cluster (for example, management-82b290a2-postgres). This name is needed to synchronize the backups from your old Management subsystem to your new one.

    For siteName, see Preparing the management subsystem for disaster recovery on VMware (10.0.1.0-eus).
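The way the two values combine can be sketched as follows. The function name db_cluster_name is hypothetical, and the <name>-<siteName>-postgres pattern is taken from the example in this step.

```shell
# Hypothetical helper: build the database cluster name from the subsystem
# name and siteName, following the pattern <name>-<siteName>-postgres.
db_cluster_name() {
  printf '%s-%s-postgres\n' "$1" "$2"
}
```

With the example values, db_cluster_name management 82b290a2 prints management-82b290a2-postgres, which is the value to expect in the CLUSTER column of the backup listing later in the procedure.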

  4. From your new project directory, install the Management subsystem. See Deploying the Management subsystem.
    Warning:
    • If the disaster event affects only the Management appliances, it is safe to reuse the existing project to install the Management subsystem. If all appliances are affected and the project itself is lost, a new project is required to install the new subsystems.
    • Ensure that the Management subsystem name remains the same in the new installation as the original installation.
    • Ensure that the hostnames of the endpoints remain the same in the Management subsystem used for installation now as they were for the original installation. The hostnames for the endpoints cannot be changed.
  5. Confirm you are able to run kubectl commands against your Kubernetes cluster.
    1. Log on to a Management appliance node:
      
      ssh -i ~/.ssh/id_rsa -o CheckHostIP=no apicadm@example.hostname.com
      sudo -i
      
    2. Run kubectl:
      $ kubectl get nodes
      NAME           STATUS   ROLES    AGE     VERSION
      apimdevr0050   Ready    master   6h17m   v1.16.8
      apimdevr0068   Ready    master   6h13m   v1.16.8
      apimdevr0089   Ready    master   6h10m   v1.16.8
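If you want to check the node status in a script rather than by eye, a small filter over the command output is one option. This is a sketch only: the function name not_ready_nodes is hypothetical, and it assumes the default column layout shown above, with STATUS in the second column.

```shell
# Hypothetical helper: read `kubectl get nodes` output on stdin and print the
# name of every node whose STATUS column is not exactly "Ready".
# Note: a status such as "Ready,SchedulingDisabled" is also reported.
not_ready_nodes() {
  # NR > 1 skips the header row; $1 is NAME and $2 is STATUS
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}
```

Run it as kubectl get nodes | not_ready_nodes. Empty output means that every listed node reports Ready.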
  6. Complete the following steps before you restore onto the new Management subsystem:
    1. Use SSH to log on to your Management appliance and remove the Management subsystem:
      root@apimdev0076:/# kubectl get managementcluster
      NAME             READY   STATUS    VERSION    RECONCILED VERSION   AGE
      management    16/16   Running   10.0.1.1   10.0.1.1-706         5h28m
      
      root@apimdev0076:/# kubectl delete managementcluster management
      managementcluster.management.apiconnect.ibm.com "management" deleted
      
    2. Apply the saved YAML file that contains the Management Database Encryption Secret into the cluster. For example:
      kubectl create -f encryption-bin-secret.yaml

      where encryption-bin-secret.yaml is the local YAML file containing the backed-up encryption secret.

      This command re-creates the original Management Database Encryption Secret on the cluster. The Secret keeps its original name.

    3. Apply into the cluster each of the saved YAML files that contain the Management Client Application Credential Secrets:
      kubectl create -f <secret_name>.yaml

      where <secret_name>.yaml is the local YAML file containing one of the backed-up Credential Secrets.

      Repeat this command for each of the backed-up Credential Secrets. These commands re-create the original Management Client Application Credential Secrets on the cluster. Each Secret keeps its original name.
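Repeating the command for all seven credentials can be scripted. The sketch below is one possible approach under stated assumptions: the function name create_saved_secrets and the KUBECTL variable are hypothetical, and the files are assumed to be saved as <credential>.yaml in one directory.

```shell
# Command used to talk to the cluster. Set KUBECTL=echo for a dry run that
# only prints the arguments instead of creating anything.
KUBECTL=${KUBECTL:-kubectl}

# Hypothetical helper: run `kubectl create -f` for each saved Credential
# Secret file in the given directory, stopping at the first failure.
# Assumption: files are named <credential>.yaml.
create_saved_secrets() {
  dir=$1
  for name in atm-cred ccli-cred cli-cred cui-cred dsgr-cred juhu-cred ui-cred; do
    $KUBECTL create -f "$dir/$name.yaml" || return 1
  done
}
```

A dry run with KUBECTL=echo shows the seven create commands without touching the cluster, which is a cheap way to check the file paths before running the loop for real.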

    4. In your project directory, create the management-extra-values.yaml file for the Management subsystem.
      • Add the encryptionSecret subsection to the spec. Its secretName is the name of the Secret that you re-created on the cluster in Step 6.b, which contains the original Management Database Encryption Secret.
      • Add the siteName property to the spec. In this example, 82b290a2 is the original siteName that was noted after the installation of the original Management subsystem.
      • Add the customApplicationCredentials subsection to the spec. For each named credential, the secretName is the name of the corresponding Secret that you re-created in Step 6.c.
      spec:
        customApplicationCredentials:
        - name: atm-cred
          secretName: management-atm-cred
        - name: ccli-cred
          secretName: management-ccli-cred
        - name: cli-cred
          secretName: management-cli-cred
        - name: cui-cred
          secretName: management-cui-cred
        - name: dsgr-cred
          secretName: management-dsgr-cred
        - name: juhu-cred
          secretName: management-juhu-cred
        - name: ui-cred
          secretName: management-ui-cred
        encryptionSecret:
          secretName: management-enc-key
        siteName: 82b290a2
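To avoid typing the file by hand, the same content can be generated from the three values that vary. This is a sketch under assumptions: the function name write_extra_values is hypothetical, and it assumes that each re-created Credential Secret is named <subsystem>-<credential>, as in the example above. Check the actual Secret names on your cluster before using the result.

```shell
# Hypothetical helper: print the extra-values YAML for the given subsystem
# name, siteName, and encryption Secret name.
# Assumption: Credential Secrets are named <subsystem>-<credential>.
write_extra_values() {
  subsystem=$1
  site_name=$2
  enc_secret=$3
  echo "spec:"
  echo "  customApplicationCredentials:"
  for cred in atm-cred ccli-cred cli-cred cui-cred dsgr-cred juhu-cred ui-cred; do
    echo "  - name: $cred"
    echo "    secretName: $subsystem-$cred"
  done
  echo "  encryptionSecret:"
  echo "    secretName: $enc_secret"
  echo "  siteName: $site_name"
}
```

For the example values, write_extra_values management 82b290a2 management-enc-key > management-extra-values.yaml produces a file equivalent to the one shown in this step.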
    5. Set the management-extra-values file in your Management subsystem:
      $ apicup subsys set management extra-values-file management-extra-values.yaml
  7. Install the Management subsystem using apicup with the flag --skip-health-check:
    $ apicup subsys install management --skip-health-check
  8. Confirm that the backup file you identified in step 2 is present on the SFTP server.
  9. Download the backup onto the Management subsystem. See Downloading backup files from an sftp server.
  10. Confirm that there is a ManagementBackup with CR type record for your downloaded backup:
    $ kubectl get ManagementBackup
    NAME                STATUS   ID                 CLUSTER                        SUBSYSTEM   TYPE   CR TYPE   AGE
    mgmt-backup-8hqqg   Ready    20200826-154646F   management-82b290a2-postgres   management  full   record    40s
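The backup name needed for the restore can be picked out of this listing with a small filter. This is a sketch, not a product command: the function name backup_name_for_id is hypothetical, and it relies on the column order shown above, where the ID is the third field and the CR TYPE value is the seventh field of each data row.

```shell
# Hypothetical helper: read the ManagementBackup listing on stdin and print
# the NAME of each row whose ID matches $1 and whose CR TYPE is "record".
# Assumption: default column order as in the example output.
backup_name_for_id() {
  # The header "CR TYPE" is two words, but data rows have a single value
  # there, so in data rows: $1=NAME, $3=ID, $7=CR TYPE.
  awk -v id="$1" 'NR > 1 && $3 == id && $7 == "record" { print $1 }'
}
```

Run it as kubectl get ManagementBackup | backup_name_for_id 20200826-154646F, replacing the ID with the one you noted in step 2.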
  11. Perform a Management Restore using the name of the backup that has the ID you want to restore to.

    When the Management Restore has completed and the database is running again, the data from the old Management subsystem is restored onto the new Management subsystem.

Results

The Management subsystem will be restored to its state prior to the disaster event.

What to do next

You should now complete the recovery steps for the Developer Portal subsystem on VMware. See Recovering the Developer Portal subsystem on VMware.