IBM Cloud Pak foundational services backup and restore for coexistence scenario

A coexistence scenario involves multiple instances of foundational services on one cluster with at least one instance on version 3.23.x or 3.19.9 or later, and at least one instance on version 4.0 or later. You can schedule backup and restore of foundational services by using the Red Hat OpenShift API for Data Protection (OADP) operator. Make sure that you use the stable-1.3 channel of the OADP operator.

Prerequisites

Note: If the cluster that is being backed up or restored to uses the s390x architecture, run any velero CLI commands on an alternate cluster that does not use s390x and that has oc access to the original cluster (usually by using oc login). The Velero CLI does not yet support s390x.

Backing up foundational services

Complete the following steps to back up the installed foundational services.

Create the backup resources

You need the following resources for completing the backup procedures.

  1. Log in to your OpenShift cluster command-line interface (CLI) by using the oc login command.

  2. Create a namespace for Velero objects. The following example creates the velero namespace. For more information about Velero, see Velero documentation.

     oc new-project velero
    
  3. Install the Red Hat OADP operator in the velero namespace. For more information, see About installing OADP.

  4. Create a secret named cloud-credentials with the access key id and secret access key credentials.

    1. Open any editor and create a file named credentials-velero.

       vi credentials-velero
      
    2. Insert the following content in the file:

       [default]
       aws_access_key_id=<access_key_id>
       aws_secret_access_key=<secret_access_key>
      
    3. Create the secret.

      oc create secret generic cloud-credentials -n velero --from-file cloud=credentials-velero
      
  5. From your OpenShift cluster console OperatorHub page, install the OADP operator from the stable-1.3 channel, which provides the Velero 1.9 API. The API is needed for foundational services backup and restore. For more information, see OpenShift Container Platform documentation.

  6. Create a DataProtectionApplication object.

    Note: The provider is aws even if you are not using AWS Object Storage.

apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <resource_name>
  namespace: velero
  annotations:
    argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true
    argocd.argoproj.io/sync-wave: '20'
spec:
  backupLocations:
    - velero:
        config:
          profile: default
          region: <bucket_region>
          s3ForcePathStyle: 'true'
          s3Url: <s3_URL>
        credential:
          key: cloud
          name: cloud-credentials
        default: true
        objectStorage:
          bucket: <bucket_name>
          prefix: <root_directory_name>
        provider: aws
  configuration:
    restic:
      enable: true
    velero:
      defaultPlugins:
        - openshift
        - aws
      podConfig:
        resourceAllocations:
          limits:
            cpu: '1'
            memory: 1Gi
          requests:
            cpu: 500m
            memory: 512Mi
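
Steps 4 and 6 can also be scripted. The following is a minimal sketch, not part of the product: the helper names write_velero_credentials and bsl_phases are hypothetical, and the sketch assumes that you are logged in with oc and that OADP is installed in the velero namespace. The first helper writes the credentials-velero file without an editor. The second lists the BackupStorageLocation objects that the DataProtectionApplication creates, together with the phase that each one reports.

```shell
# Sketch only: these helper names are hypothetical, not part of OADP or Velero.

# Write the credentials-velero file from step 4 without opening an editor.
# Arguments: access key id, secret access key, optional output path.
write_velero_credentials() {
  key_id="$1"
  secret_key="$2"
  out="${3:-credentials-velero}"
  # The subshell keeps the restrictive umask from leaking into the caller.
  (
    umask 077
    printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
      "$key_id" "$secret_key" > "$out"
  )
}

# Print "<name>=<phase>" for every BackupStorageLocation in a namespace.
# Assumes oc is logged in and the DataProtectionApplication was applied.
bsl_phases() {
  ns="${1:-velero}"
  oc get backupstoragelocations.velero.io -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"="}{.status.phase}{"\n"}{end}'
}
```

For example, run write_velero_credentials "<access_key_id>" "<secret_access_key>" before you create the cloud-credentials secret, and run bsl_phases velero after you apply the DataProtectionApplication. A storage location that Velero can reach reports the Available phase.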

Add labels to resources

You can add labels to resources automatically by running the script or by manually adding labels. Complete one of the following procedures.

Labelling the resources automatically by running the script

  1. Run the following commands to download the env.properties file and the label-common-service.sh script, and save them in the same folder.

     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/backup/common-service/label-common-service.sh
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/backup/common-service/env.properties
    
  2. Open the env.properties file, edit the required variables and save the changes.

    Note: The OPERATOR_NS="" variable must be properly set for the script to work. Other variables have default values. You can change these values to fit your environment.

     vi env.properties
    

    The env.properties file contains the following variables:

     # Change the following values to fit your environment.
     OPERATOR_NS="" # Set the parameter to the namespace where the foundational services operator is installed.
    
     # Pass the namespace where foundational services are installed.
     # Leave the value of this parameter empty if the services are installed in the same namespace as the foundational services operator.
     SERVICES_NS="" 
    
     # Pass the control namespace if it is needed to be backed up.
     CONTROL_NS=""
    
     # Change to the namespace where cert-manager, License Service and License Service Reporter are installed if they are installed in custom namespaces.
     CERT_MANAGER_NAMESPACE="ibm-cert-manager" 
     LICENSING_NAMESPACE="ibm-licensing"
     LSR_NAMESPACE="ibm-lsr"
    
     # Change to 1 to enable the private catalog if required.
     ENABLE_PRIVATE_CATALOG=0
    
     # Add additional CatalogSources without the ".spec.publisher: IBM" parameter. Separate the CatalogSources with a comma.
     # For example: "my-catalog,my-catalog2,my-catalog3"
     ADDITIONAL_SOURCES=""
    
  3. Use the following command to run the label-common-service.sh script.

     ./label-common-service.sh
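
Because the script depends on OPERATOR_NS, a quick check before you run it can catch an unedited file. The following is a minimal sketch under stated assumptions: operator_ns_is_set is a hypothetical helper name, and the sketch assumes that env.properties keeps the VARIABLE="value" layout that is shown earlier in this section.

```shell
# Sketch only: operator_ns_is_set is a hypothetical helper name.
# Succeeds when the last OPERATOR_NS assignment in the file is non-empty.
operator_ns_is_set() {
  props="${1:-env.properties}"
  # Capture the text between the optional quotation marks after OPERATOR_NS=.
  value=$(sed -n 's/^[[:space:]]*OPERATOR_NS="\{0,1\}\([^"#[:space:]]*\)"\{0,1\}.*$/\1/p' "$props" | tail -n 1)
  [ -n "$value" ]
}
```

For example, operator_ns_is_set env.properties && ./label-common-service.sh runs the labelling script only when the variable has a value. If the downloaded script is not executable, run chmod +x label-common-service.sh first.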
    

Manually adding labels to resources

Before you begin, set the namespace where you installed foundational services as the default namespace.

oc project <namespace-where-foundational services-are-installed>

You must label the currently installed resources to identify them during restoration.

Back up MongoDB (CS v3.19.x to v4.5.x)

Set up the MongoDB backup deployment. The deployment triggers and holds the MongoDB database backup. During restore, the deployment also triggers the restore of the MongoDB database.

Note: Repeat this step for each namespace where foundational services is installed.

  1. Get the mongodb-backup-pvc.yaml and mongodb-backup-deployment.yaml files. Place them in the same directory.

     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/mongodb-backup-pvc.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/mongodb-backup-deployment.yaml

  2. Update the backup files.

    In the mongodb-backup-pvc.yaml file, replace the following parameters:

    • Replace <mongo namespace> with the namespace where the MongoDB deployment that foundational services uses is running.
    • Replace <storage class> with either the storage class that the MongoDB deployment uses, or with any storage class that has the Retain reclaim policy.

    In the mongodb-backup-deployment.yaml file, replace both instances of <mongo namespace> with the namespace where the MongoDB deployment that foundational services uses is running.

  3. Add the PVC to the cluster.

     oc apply -f mongodb-backup-pvc.yaml
    
  4. Add the deployment to the cluster.

     oc apply -f mongodb-backup-deployment.yaml
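
The placeholder replacement in the update step can be done without an editor. The following is a minimal sketch, where fill_placeholder is a hypothetical helper name and the values are illustrative. It assumes that the value contains no "|" or "&" characters, which holds for namespace and storage class names.

```shell
# Sketch only: fill_placeholder is a hypothetical helper name.
# Replace every "<placeholder>" token in a downloaded manifest with a value.
# Usage: fill_placeholder <file> <placeholder text> <value>
fill_placeholder() {
  file="$1"
  placeholder="$2"
  value="$3"
  tmp="$file.tmp"
  # Write to a temporary file first so that a failed sed leaves the original intact.
  sed "s|<$placeholder>|$value|g" "$file" > "$tmp" && mv "$tmp" "$file"
}
```

For example, fill_placeholder mongodb-backup-pvc.yaml "mongo namespace" "<your namespace>" followed by fill_placeholder mongodb-backup-pvc.yaml "storage class" "<your storage class>" prepares the PVC file. The same helper works for the other placeholders in the backup files for common-service-db and Zen.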
    

Back up common-service-db (CS v4.6 and newer)

  1. Get the common-service-db backup resources.

     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-backup-deployment.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-backup-pvc.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-br-script-cm.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-role.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-rolebinding.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-sa.yaml
    
  2. Update the backup files.

    • Replace <cs-db namespace> with the namespace where common-service-db instance is running.
    • Replace the <storage class> with the storage class that the current IM deployment uses.
  3. Add the PVC to the cluster.

     oc apply -f cs-db-backup-pvc.yaml
    
  4. Add the cs-db-br-script-cm.yaml to the correct namespace.

     oc apply -f cs-db-br-script-cm.yaml
    
  5. Give the common-service-db backup the necessary permissions.

     oc apply -f cs-db-sa.yaml
    
     oc apply -f cs-db-role.yaml
    
     oc apply -f cs-db-rolebinding.yaml
    
  6. Add the deployment to the cluster.

     oc apply -f cs-db-backup-deployment.yaml
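
Steps 3 to 6 apply six files in a fixed order. The following is a minimal sketch that wraps that order in one function and stops at the first failure. The name apply_cs_db_backup is hypothetical, and the sketch assumes that the placeholders in the files are already replaced and that the files are in the current directory.

```shell
# Sketch only: apply_cs_db_backup is a hypothetical helper name.
# Apply the common-service-db backup resources in the order of steps 3 to 6.
apply_cs_db_backup() {
  for f in cs-db-backup-pvc.yaml cs-db-br-script-cm.yaml cs-db-sa.yaml \
           cs-db-role.yaml cs-db-rolebinding.yaml cs-db-backup-deployment.yaml; do
    # Stop at the first failure so that the deployment is never applied
    # without its PVC, script ConfigMap, and permissions.
    oc apply -f "$f" || return 1
  done
}
```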
    

Back up Zen v4 (CS v3.23.x or 3.19.x)

Note: The following instructions must be used only for Zen instances that use foundational services version v3.23.x or 3.19.x.

Back up Zen v4 Data (CS 3.23.x/3.19.x)

This procedure applies to Zen deployments that use Zen v4 or earlier. If you use Zen v5 or later, see Back up Zen v5 (CS v4.x). The Zen version can differ between zenservice instances if multiple instances are present on the same cluster.

  1. Get the necessary files for zen backup:

     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen-backup-pvc.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen-backup-deployment.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-br-scripts.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-sa.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-rolebinding.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-role.yaml
    
  2. Update the backup files.

    In the zen-backup-pvc.yaml file, replace the following parameters:

    • Replace <zenservice namespace> with the namespace where the zenservice instance is running.
    • Replace <storage class> with either the storage class that the common service db or MongoDB deployment uses, or with any storage class that has the Retain reclaim policy.

      In the zen-backup-deployment.yaml file, replace all instances of <zenservice namespace> with the namespace where the zenservice instance is deployed. Two of them are parameters to the Velero backup and restore commands.

      By default, the backup and restore commands (the .spec.template.metadata.annotations.pre.hook.backup.velero.io/command and .spec.template.metadata.annotations.post.hook.restore.velero.io/command annotations) pass <zenservice namespace> as the first parameter to the scripts that they call. Edit the first parameter of both commands to match the namespace where the zenservice is deployed.

      By default, the restore command (.spec.template.metadata.annotations.post.hook.restore.velero.io/command) runs against a zenservice named <zenservice name>. Update the second parameter to match the name of the zenservice in the target namespace.

      In zen4-br-scripts.yaml and zen4-sa.yaml, replace <zenservice namespace> with the namespace where each instance of zenservice is deployed.

  3. Add the PVC to the cluster.

     oc apply -f zen-backup-pvc.yaml
    
  4. Add zen4-br-scripts.yaml to the correct namespace.

     oc apply -f zen4-br-scripts.yaml
    
  5. Give the necessary permissions for the Zen 4 backup.

    • To back up each namespace with a zenservice, create a service account. Replace <zenservice namespace> with the namespace where you deployed the zenservice.

        oc apply -f zen4-sa.yaml
      
    • Apply the Role for the zen backup for each zenservice namespace. Replace <zenservice namespace> with the namespace where you deployed the zenservice.

       oc apply -f zen4-role.yaml
      
    • Create the RoleBinding to connect the ServiceAccount to the Role.

      1. Edit the zen4-rolebinding.yaml file to add the ServiceAccount created earlier and replace <zenservice namespace> with the namespace where you deployed the zenservice.

         vi zen4-rolebinding.yaml

      2. Apply the zen4-rolebinding.yaml file.

         oc apply -f zen4-rolebinding.yaml
      
  6. Add the deployment to the cluster.

     oc apply -f zen-backup-deployment.yaml
    
  7. Repeat the previous steps for each namespace with a zenservice instance.

Back up Zen v5 (CS v4.x)

Note: The following instructions must only be used for Zen instances that use foundational services version v4.x.

  1. Locate zenservice instances.

     oc get zenservice -A
    
  2. Label each zenservice.

     oc label zenservice <zenservice name> foundationservices.cloudpak.ibm.com=zen --overwrite=true -n <namespace>
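
When a cluster has several zenservice instances, the two steps can be combined in a loop. The following is a minimal sketch, where label_all_zenservices is a hypothetical helper name. It assumes that you are logged in with oc and that you want the same label on every zenservice that the first command reports.

```shell
# Sketch only: label_all_zenservices is a hypothetical helper name.
# Labels every zenservice that `oc get zenservice -A` reports.
label_all_zenservices() {
  oc get zenservice -A --no-headers \
    -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name |
  while read -r ns name; do
    # Skip blank lines, for example when no zenservice exists.
    [ -n "$name" ] || continue
    # </dev/null keeps oc from consuming the list that the loop is reading.
    oc label zenservice "$name" foundationservices.cloudpak.ibm.com=zen \
      --overwrite=true -n "$ns" </dev/null
  done
}
```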
    

Back up Zen MetastoreDB v5

Note: Repeat this step for each namespace where a zenservice instance is installed.

  1. Get the Zen 5 backup resource.

     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-backup-deployment.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-backup-pvc.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-br-scripts-cm.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-sa.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-role.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-rolebinding.yaml
    
  2. Update the backup files.

    In the zen5-backup-pvc.yaml file, replace the following parameters:

    • Replace <zenservice namespace> with the namespace where the zenservice instance is running.
    • Replace <storage class> with either the storage class that the common service db or MongoDB deployment uses, or with any storage class that has the Retain reclaim policy.

      In the zen5-backup-deployment.yaml file, replace all four instances of <zenservice namespace> with the namespace where the zenservice instance is running. Two of them are parameters to the Velero backup and restore commands.

      By default, the backup and restore commands (the .spec.template.metadata.annotations.pre.hook.backup.velero.io/command and .spec.template.metadata.annotations.post.hook.restore.velero.io/command annotations) pass <zenservice namespace> as the first parameter to the scripts that they call. Edit the first parameter of both commands to match the namespace where the deployment is created.

      By default, the restore command (.spec.template.metadata.annotations.post.hook.restore.velero.io/command) runs against a zenservice named <zenservice name>. Update the second parameter to match the name of the zenservice in the target namespace.

      In the zen5-br-scripts-cm.yaml and zen5-sa.yaml files, replace <zenservice namespace> with the namespace of each zenservice instance.

  3. Add the PVC to the cluster.

     oc apply -f zen5-backup-pvc.yaml
    
  4. Add the zen5-br-scripts-cm.yaml to the correct namespace.

     oc apply -f zen5-br-scripts-cm.yaml
    
  5. Give the Zen 5 backup the necessary permissions.

    • For each namespace with a zenservice to back up, create a service account. Replace the <zenservice namespace> value before applying.

        oc apply -f zen5-sa.yaml
      
    • Once per zenservice namespace, apply the Role for the zen backup. Replace the <zenservice namespace> value before applying.

        oc apply -f zen5-role.yaml
      
    • Create the RoleBinding to connect the ServiceAccount to the Role.

      1. Edit the zen5-rolebinding.yaml file to add the ServiceAccount created earlier and replace the <zenservice namespace> value.

         vi zen5-rolebinding.yaml
        
      2. Apply the zen5-rolebinding.yaml file.

         oc apply -f zen5-rolebinding.yaml
        
  6. Add the deployment to the cluster.

     oc apply -f zen5-backup-deployment.yaml
    

Create a backup resource

Create a backup resource for the velero namespace.

  1. Get the schedule-common-services.yaml file.

     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/schedule-common-services.yaml
    
  2. Update the schedule-common-services.yaml file based on your backup requirements. For more information, see Velero Schedule API Type. By default, the backup runs once a day and is deleted 48 hours later.

    The following configurations in the schedule-common-services.yaml file are important:

    • schedule:, which is a CRON expression. CRON uses the server time, which is usually the Coordinated Universal Time unless configured to be something else.
    • ttl, which is the time to live for the backup.
    • storageLocation, which is the same storage location that you used when you set up OADP. The command oc get backupstoragelocations.velero.io -n <velero namespace> can be used to get the name.
    • velero, which is the namespace where you installed OADP.
  3. Create the resource.

     oc apply -f schedule-common-services.yaml
    
  4. Verify whether the backup schedule was created.

     velero schedule get
    

    After the first scheduled time passes, you can verify whether the backup ran. Look for a schedule name and timestamp.

     velero backup get
    
  5. Verify whether the backup was successful and check the details to see if all resources are saved.

     velero backup describe <__BACKUP_NAME__> --details
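
Velero stores each backup as a Backup custom resource in the namespace where OADP is installed, so the same check is possible with oc alone, for example on a cluster where the velero CLI is not available. The following is a minimal sketch, where latest_completed_backup is a hypothetical helper name and velero is assumed to be the OADP namespace. It prints the name of the newest backup whose phase is Completed, which is the name that the restore files need.

```shell
# Sketch only: latest_completed_backup is a hypothetical helper name.
# Prints the name of the newest Backup whose phase is Completed, if any.
latest_completed_backup() {
  ns="${1:-velero}"
  # Sort oldest to newest so that the last Completed row is the newest one.
  oc get backups.velero.io -n "$ns" --no-headers \
    --sort-by=.metadata.creationTimestamp \
    -o custom-columns=NAME:.metadata.name,PHASE:.status.phase |
  awk '$2 == "Completed" { name = $1 } END { if (name != "") print name }'
}
```

The function prints nothing when no backup has completed yet, so an empty result means that the schedule has not produced a usable backup.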
    

Restoring foundational services

Complete the following steps to restore foundational services.

Before you restore foundational services, set up Velero on the new cluster. Follow the instructions in the Create the backup resources section.

For troubleshooting issues that may arise during restore, see IBM Cloud Pak foundational services Installation Troubleshooting.

Download the necessary files for restoring different resources:

wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-namespace.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-entitlementkey.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-pull-secret.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-catalog.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-operatorgroup.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-configmap.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-crd.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-commonservice.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-subscriptions.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-licensing.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-cert-manager.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-operands.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-mongo-data.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-cs-db.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-zen5-data.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-zen-data.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-singleton-subscriptions.yaml
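
Each restore step that follows repeats the same actions: substitute __BACKUP_NAME__ in a file, apply the file, and wait until the restore shows Completed. The following is a minimal sketch of the first and last of those actions. The helper names set_backup_name and wait_for_restore and the POLL_SECONDS variable are hypothetical, and velero is assumed to be the OADP namespace. The restore name to wait for is the metadata.name value in the restore file that you applied.

```shell
# Sketch only: helper names and POLL_SECONDS are hypothetical.

# Replace the __BACKUP_NAME__ token in a restore file with a backup name.
set_backup_name() {
  file="$1"
  backup="$2"
  tmp="$file.tmp"
  sed "s/__BACKUP_NAME__/$backup/g" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Poll a Restore resource until it completes, fails, or the tries run out.
# Usage: wait_for_restore <restore name> [namespace] [tries]
wait_for_restore() {
  name="$1"
  ns="${2:-velero}"
  tries="${3:-60}"
  while [ "$tries" -gt 0 ]; do
    phase=$(oc get restores.velero.io "$name" -n "$ns" \
      -o jsonpath='{.status.phase}' 2>/dev/null || true)
    case "$phase" in
      Completed) return 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "restore $name: $phase" >&2
        return 1 ;;
    esac
    tries=$((tries - 1))
    sleep "${POLL_SECONDS:-10}"
  done
  echo "restore $name: timed out" >&2
  return 1
}
```

For example, set_backup_name restore-namespace.yaml "<backup name>" followed by oc apply -f restore-namespace.yaml and wait_for_restore "<restore name>" covers one restore step. The verification command of each step still needs to be run separately.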
  1. Restore the foundational services namespaces by using the restore-namespace.yaml file.

    1. Get the name of the Velero backup that you plan to use for restoring.

       velero backup get
      

      Replace __BACKUP_NAME__ in the following commands with the Velero backup name.

      Verify whether the backup was successful and check the details to see if all resources are saved.

       velero backup describe <__BACKUP_NAME__> --details
      
    2. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

       vi restore-namespace.yaml
      
    3. Restore the namespaces.

       oc apply -f restore-namespace.yaml
      

      You can check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the namespaces are restored. Your namespaces must be listed in the command output.

       oc get namespace
      

      Proceed with the next step after the namespaces are restored.

    5. Change the default project to the restored common service namespace.

       oc project <namespace-where-foundational services-are-installed>
      
  2. Restore the entitlement key.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

      vi restore-entitlementkey.yaml
      
    2. Restore the entitlement key.

      oc apply -f restore-entitlementkey.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the entitlement key is restored.

       oc get secret
      
  3. Restore the pull secret.

    1. Save the current pull secret.

       oc get secret pull-secret -n openshift-config -o yaml > original-pull-secret.yaml
      
    2. Delete the current pull secret from the openshift-config namespace.

      oc delete secret pull-secret -n openshift-config
      
    3. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

      vi restore-pull-secret.yaml
      
    4. Restore the pull secret.

       oc apply -f restore-pull-secret.yaml
      
    5. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    6. Verify whether the pull secret is restored.

       oc get secret -n openshift-config | grep pull
      
  4. Restore the catalog.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

      vi restore-catalog.yaml
      
    2. Restore the catalog.

      oc apply -f restore-catalog.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the catalog source is restored.

       oc get catalogsource -n openshift-marketplace | grep ibm
      
    5. Verify whether the ibm-operator-catalog pod is running.

       oc get pod -n openshift-marketplace -w
      

      Note: If using IBM Cert Manager, IBM Licensing, or cloud-native-postgresql-catalog catalog source, verify that their pods are Running as well.

      If the pods are running, proceed with the next step.

  5. Restore the operator groups.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

      vi restore-operatorgroup.yaml
      
    2. Restore the operator groups.

      oc apply -f restore-operatorgroup.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the operator groups are restored.

       oc get operatorgroup -A
      
  6. Restore the common-service-maps configmap.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

      vi restore-configmap.yaml
      
    2. Restore the configmap.

      oc apply -f restore-configmap.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the configmap is restored.

       oc get configmap common-service-maps -n kube-public
      
  7. Restore the commonservices.operator.ibm.com customresourcedefinition (CRD).

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

      vi restore-crd.yaml
      
    2. Restore the CRD.

      oc apply -f restore-crd.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the crd is restored.

       oc get customresourcedefinition | grep commonservices.operator.ibm.com
      
  8. Restore the CommonService CRs.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

       vi restore-commonservice.yaml
      
    2. Restore the CRs.

       oc apply -f restore-commonservice.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the CommonService CRs are restored.

       oc get commonservice -A
      

      If a CommonService CR is not restored, delete the restore resource and apply it again:

    5. Delete the resource.

       oc delete -f restore-commonservice.yaml
      
    6. Restore the CR.

       oc apply -f restore-commonservice.yaml
      

      Wait for 30 seconds and check again for the CommonService resource.

  9. Restore the singleton subscriptions.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

       vi restore-singleton-subscriptions.yaml
      
    2. Restore the subscriptions.

       oc apply -f restore-singleton-subscriptions.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Watch the namespaces where Cert Manager and License Service are deployed for the Cert Manager and Licensing operators to be running. By default Cert Manager and License Service are deployed in ibm-cert-manager and ibm-licensing namespaces.

       oc get pod -n <cs namespace> -w
      
  10. Restore cert manager resources.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

       vi restore-cert-manager.yaml
      
    2. Restore the cert manager resource.

       oc apply -f restore-cert-manager.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the certificates are restored.

       oc get certificates
      
  11. Restore the subscriptions.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

      vi restore-subscriptions.yaml
      
    2. Restore the subscriptions.

      oc apply -f restore-subscriptions.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Watch the foundational services namespace for the operand-deployment-lifecycle-manager to be running:

       oc get pod -n <cs namespace> -w
      

    See the following notes:

    • If not using IBM Cert Manager, IBM Common Service Operator deployment fails unless a third-party Cert Manager is installed on the cluster beforehand.
    • If using SOD, the ibm-common-service-operator is likely not to become ready after the subscriptions are restored, and so it does not deploy ODLM. This is expected and resolves after you run the next step.

    Troubleshooting: In case of issues with generating new installation plans for updates or new installations, see OLM is unable to generate new install plans.

  12. Run setup_tenant.sh to set up the cluster topology.

    1. Get the setup_tenant.sh and utils.sh scripts by running the following commands:

       wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/cp3pt0-deployment/setup_tenant.sh
       mkdir -p common
       wget -P common https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/cp3pt0-deployment/common/utils.sh
      

      Note: Run this script once for each instance of foundational services. Instances must not share namespaces (that is, no namespace should be used in two different executions).

    2. Run the following commands to make the scripts executable:

       chmod +x setup_tenant.sh
       chmod +x common/utils.sh
      
    3. Gather the values that the script requires. Operator and services namespaces:

       oc get commonservice common-service -o yaml
      

      Locate values .spec.operatorNamespace and .spec.servicesNamespace

      Note: These values will always match unless using SOD.

      Size:

       oc get commonservice common-service -o yaml
      

      Locate value .spec.size

      Tethered namespaces:

       oc get cm common-service-maps -o yaml -n kube-public
      

      Make note of the namespaces under requested-from-namespace.

    4. Run the script.

      Note: If the services and operator namespaces are the same, you must still specify both parameters when you run setup_tenant.sh; use the same namespace for each. Use the optional -s parameter if you use a catalog source other than opencloud-operators, and the optional -n parameter if the catalog source is in a different namespace. If everything is deployed to the same namespace (CS operators, CS operands, and Cloud Pak workload), you do not need to run the setup_tenant.sh script and can move on to the next step.

       ./setup_tenant.sh --operator-namespace <operator namespace> --services-namespace <services namespace> --tethered-namespaces <comma delimited (no spaces) list of Cloud Pak workload namespaces that use this foundational services instance> --license-accept -c v<foundational services version number in use i.e. 4.0, 4.1, 4.2, etc> -p <.spec.size value from `CommonService` cr> -i <install mode, either Manual or Automatic>
      
    5. Wait for the script to complete successfully. For more information, see Installing foundational services by using a script.
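
The values gathered in the preceding sub-steps can also be read directly with jsonpath queries instead of scanning the YAML output by eye. The following is a minimal sketch, not part of the IBM scripts: get_cs_field and print_tenant_values are hypothetical helper names, and the sketch assumes that the CommonService CR named common-service lives in the namespace that you pass as the first argument.

```shell
# Sketch only: helper names are hypothetical and not part of setup_tenant.sh.

# Print one field from the spec of the common-service CommonService CR.
get_cs_field() {
  local namespace="$1" field="$2"
  oc get commonservice common-service -n "$namespace" \
    -o jsonpath="{.spec.${field}}"
}

# Print the three values that setup_tenant.sh needs, one per line.
print_tenant_values() {
  local namespace="$1"
  echo "operator-namespace: $(get_cs_field "$namespace" operatorNamespace)"
  echo "services-namespace: $(get_cs_field "$namespace" servicesNamespace)"
  echo "size: $(get_cs_field "$namespace" size)"
}
```

For example, print_tenant_values <operator namespace> prints the operator namespace, the services namespace, and the size. The tethered namespaces still need to be collected from the common-service-maps configmap as described.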

  13. Restore Licensing service configmap.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

       vi restore-licensing.yaml
      
    2. Restore the configmap.

       oc apply -f restore-licensing.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the configmap is restored.

       oc get configmap | grep licensing
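
The __BACKUP_NAME__ substitution that this step and the later restore steps perform in vi can also be done non-interactively. The following is a minimal sketch under stated assumptions: set_backup_name is a hypothetical helper, and it assumes that the backup name contains no slash or other character that is special to sed, which holds for valid Kubernetes resource names.

```shell
# Sketch only: set_backup_name is a hypothetical helper.
# Replaces every __BACKUP_NAME__ placeholder in a restore YAML file.
set_backup_name() {
  local file="$1" backup_name="$2" tmp
  tmp="$(mktemp)" || return 1
  # Write to a temporary file first so the original is changed only on success.
  if sed "s/__BACKUP_NAME__/${backup_name}/g" "$file" > "$tmp"; then
    cat "$tmp" > "$file"
    rm -f "$tmp"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

For example, set_backup_name restore-licensing.yaml <backup name> updates the file in place, after which it can be applied with oc apply as shown in the step.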
      
  14. If you use IBM License Service Reporter, see Backing up the License Service Reporter instance.

  15. Restore the OperandRequests and OperandConfigs.

    1. If you are restoring an OperandConfig, delete the existing one first.

       oc delete operandconfig common-service -n <namespace where operandconfig resides>
      
    2. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

       vi restore-operands.yaml
      
    3. Restore the operands.

       oc apply -f restore-operands.yaml
      
    4. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    5. Verify whether the operands are restored.

       oc get operandrequest
      
       oc get operandconfig
      
    6. Verify whether operand requests are reconciled.

      Note: ODLM might need some time to reconcile the restored operand requests, but new operators and their operands should deploy shortly after the restore completes. Check the status fields of the operand requests and the ODLM logs for any issues.

    7. If you use a custom hostname, a custom TLS secret, or both, wait for the platform-identity pods to become ready:

      • Verify that the cs-onprem-tenant-config configmap is present:

          oc get cm -n <namespace where hostname is changed or custom TLS secret used> | grep cs-onprem-tenant-config
        
      • Wait for the platform-identity-management, platform-identity-provider, and platform-auth-service pods to become ready in the same namespace.

      • Make sure to update the custom hostname to reflect a change in cluster if necessary. For example, the structure of the route is <route name>.cluster1.com. If you are no longer on cluster1 but now on cluster2, the route needs to be updated from <route name>.cluster1.com to <route name>.cluster2.com.
      • If using a custom TLS secret, it is best to re-create this secret on the new cluster by using the same name. In this case, if the secret was carried over to the new cluster, it would need to be replaced.
      • Follow the instructions here https://www.ibm.com/docs/en/cloud-paks/foundational-services/4.3?topic=cc-updating-custom-hostname-tls-secret-by-using-configmap.
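
The hostname change described above (from <route name>.cluster1.com to <route name>.cluster2.com) amounts to swapping the cluster domain suffix. The following is a minimal sketch of that string rewrite only; retarget_route_host is a hypothetical helper, and it does not update any resource on the cluster.

```shell
# Sketch only: retarget_route_host is a hypothetical helper.
# Prints the host with old_domain replaced by new_domain when the host ends
# in ".old_domain"; otherwise prints the host unchanged.
retarget_route_host() {
  local host="$1" old_domain="$2" new_domain="$3"
  case "$host" in
    *".${old_domain}")
      printf '%s.%s\n' "${host%.${old_domain}}" "$new_domain"
      ;;
    *)
      printf '%s\n' "$host"
      ;;
  esac
}
```

For example, retarget_route_host myroute.cluster1.com cluster1.com cluster2.com prints myroute.cluster2.com. Apply the resulting value by following the linked instructions for updating the custom hostname.
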
  16. Restore MongoDB. (CS v3.19.x to 4.5.x)

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

       vi restore-mongo-data.yaml
      
    2. Restore the MongoDB data.

       oc apply -f restore-mongo-data.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify that the restore completed successfully.

      Check the logs for the velero restore to ensure that the restore went through. Search for the following log: "Failed: error connecting to db server: no reachable servers"

      If this message is present, follow these instructions:

      1. Get the mongo-restore.sh file.

         wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/mongoDB/mongo-restore.sh
        
      2. Make the restore script executable:

         chmod +x mongo-restore.sh
        
      3. Delete the existing mongodb-backup deployment in the failed restore namespace:

         oc delete deploy mongodb-backup -n <failed restore namespace>
        
      4. Run the script:

         ./mongo-restore.sh <cs namespace>
        

        Troubleshooting: If the mongodb-restore pod is stuck in ContainerCreating:

        • Delete the deployment mongodb-backup-deployment.
        • Make sure that the mongodb-backup pod is fully deleted (not Terminating).
        • Delete the mongodb-restore job and its pod (not Terminating).
        • Rerun the mongo-restore.sh script.

      Note: The secondary steps that are listed here are needed only if the restore logs indicate that the restore was not run. When you restore multiple namespaces, it is possible that some succeed and some fail. Specify the namespace for each one, whether it passed or failed; there is no harm in running these secondary steps after a successful restore. If the storage class used on the backup cluster does not match the storage class that is used on the target cluster, the restore fails. Adapting to different storage classes across clusters is a current limitation of Velero.
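
The log check in the verification sub-step can be scripted so that the secondary steps run only when they are needed. The following is a minimal sketch; mongo_restore_failed is a hypothetical helper that wraps velero restore logs and searches for the failure message quoted above.

```shell
# Sketch only: mongo_restore_failed is a hypothetical helper.
# Succeeds (exit status 0) when the restore logs contain the
# "no reachable servers" message, which means the data restore did not run.
mongo_restore_failed() {
  local restore_name="$1"
  velero restore logs "$restore_name" |
    grep -q "error connecting to db server: no reachable servers"
}
```

For example, run mongo_restore_failed <__RESTORE_NAME__> and continue with the mongo-restore.sh steps only if it succeeds.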

  17. Restore common-service-db (CS v4.6.x and newer)

    1. Get the restore object

      wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-cs-db.yaml
      
    2. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created previously.

      vi restore-cs-db.yaml
      
    3. Restore the cs-db data.

      oc apply -f restore-cs-db.yaml
      
    4. Check restore progress. Proceed with the next step after restore is complete.

      velero restore get
      
    5. Check the logs of the velero restore to verify that the data was restored.

      velero restore logs restore-cs-db-data
      

    Troubleshooting: If the logs or the data indicate that the restore was not successful, apply the following workaround:

    1. Get the restore job:

        wget https://raw.githubusercontent.com/qpdpQ/ibm-common-service-operator/IM-ZEN-workaround/velero/restore/common-service-db/cs-db-restore-job.yaml
      
    2. Replace <cs-db namespace> with the namespace where common-service-db instance is running.

    3. Delete the existing cs-db-backup deployment and cs-db-backup pod.

        oc delete deploy cs-db-backup -n <namespace>
      
    4. Run the restore job.

        oc apply -f cs-db-restore-job.yaml
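
The "check restore progress" sub-steps in this procedure can be automated by polling the phase of the Velero Restore resource. The following is a minimal sketch, not an official tool: wait_for_restore is a hypothetical helper, and the defaults (the velero namespace, 60 attempts, 10 seconds between attempts) are assumptions to adjust for your environment.

```shell
# Sketch only: wait_for_restore is a hypothetical helper.
# Polls the Restore resource until it reaches a terminal phase.
# Prints the final phase (or "timeout") and returns 0 only for Completed.
wait_for_restore() {
  local restore_name="$1" namespace="${2:-velero}"
  local attempts="${3:-60}" delay="${4:-10}"
  local phase i=0
  while [ "$i" -lt "$attempts" ]; do
    phase="$(oc get restores.velero.io "$restore_name" -n "$namespace" \
      -o jsonpath='{.status.phase}')"
    case "$phase" in
      Completed)
        echo "$phase"
        return 0
        ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "$phase"
        return 1
        ;;
    esac
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timeout"
  return 1
}
```

For example, wait_for_restore <__RESTORE_NAME__> blocks until the restore completes or fails. A Completed phase still does not prove that the data was restored, so the log check remains necessary.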
      
  18. Restore Platform UI (Zen) resources.

    Note: If Zen is installed in a namespace other than the namespace where foundational services are installed, then create that namespace first (if not already restored).

    oc new-project <namespace-where-Zen-is-installed>
    
    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

       vi restore-zen.yaml
      
    2. Restore the zenservice instances.

       oc apply -f restore-zen.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Wait for the zenservice instances to become ready. When the Progress field is 100%, the instance is ready. The following command continuously outputs the progress percentage of all the zenservices on the cluster.

       oc get zenservice -A -w -o yaml | grep Progress:
      

    Note: If the restored zenservice contains fields to configure zenCustomRoute, do the following:

    • Verify the secret used (if the field exists) is present in the zenservice namespace in the target cluster.
    • Update the value in the zenservice CR for the route. For example, the structure of the route is <route name>.cluster1.com. If you are no longer on cluster1 but now on cluster2, the route needs to be updated from <route name>.cluster1.com to <route name>.cluster2.com.
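
The readiness check for zenservice instances in this step can be turned into a single yes-or-no test for use in scripts. The following is a minimal sketch; all_zenservices_ready is a hypothetical helper, and it assumes that each zenservice reports its progress on a line of the form Progress: 100%, as the grep in the step implies.

```shell
# Sketch only: all_zenservices_ready is a hypothetical helper.
# Succeeds only when at least one Progress line exists and every
# Progress line reports 100%.
all_zenservices_ready() {
  local lines
  # grep fails when no Progress line is found, for example when no
  # zenservice exists yet.
  lines="$(oc get zenservice -A -o yaml | grep 'Progress:')" || return 1
  # Any Progress line without 100% means an instance is still reconciling.
  ! printf '%s\n' "$lines" | grep -qv '100%'
}
```

For example, until all_zenservices_ready; do sleep 30; done waits for every zenservice on the cluster.
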
  19. Restore Zen data for Zen v4 (CS 3.23.x/3.19.x)

    This step applies to Zen deployments with Zen v4 or earlier. If you use Zen v5 or later, proceed with the next step. The Zen version might differ between zenservice instances if multiple instances are present on the same cluster.

    1. Replace __BACKUP_NAME__ with the name of the backup resource that you created in the previous step.

       vi restore-zen-data.yaml
      
    2. Restore the Zen data.

       oc apply -f restore-zen-data.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Check the logs of the velero restore to verify that the data was restored.

       velero restore logs restore-zen-data
      
      • Search for zen4-br.sh to find the relevant logs. If zen4-br.sh is not available in the logs, the restore process is not completed. If the logs or the data indicate that the restore was not successful, complete the following steps:

        1. Get the Zen restore job resource.

           wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/zen/zen-restore-job.yaml
          
        2. Delete the existing zen-backup deployment.

           oc delete deploy zen-backup -n <namespace>
          
        3. Wait until the zen-backup pods are fully deleted (fully gone, not Terminating).

        4. Give the Zen backup necessary permissions if the necessary ServiceAccount, Role, and RoleBinding are not available for the Zen backup.

          • Check if permissions exist:

              oc get sa -n <zenservice namespace> | grep zen4
              oc get role -n <zenservice namespace> | grep zen4
              oc get rolebinding -n <zenservice namespace> | grep zen4
            
          • Get the zen4-sa.yaml, zen4-role.yaml, and zen4-rolebinding.yaml files.

              wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-sa.yaml
              wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-role.yaml
              wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-rolebinding.yaml
            
          • For each namespace with a zenservice to restore, edit the service account file zen4-sa.yaml to deploy in the corresponding namespace.

              oc apply -f zen4-sa.yaml
            
          • Apply the zen4-role.yaml file for each zenservice namespace to create the Role for the zen backup. Replace <zenservice namespace> with the namespace where you deployed the zenservice.

              oc apply -f zen4-role.yaml
            
          • Create the RoleBinding to connect the ServiceAccounts to the Role.

            1. Edit the zen4-rolebinding.yaml file to add each ServiceAccount created earlier. Replace the <zenservice namespace> with the namespace where you deployed the zenservice.

               vi zen4-rolebinding.yaml
              
            2. Apply the zen4-rolebinding.yaml file

               oc apply -f zen4-rolebinding.yaml
              
        5. Edit the zen-restore-job.yaml file. Replace the <zenservice namespace> parameter for the underlying zen4-br.sh to reflect the proper zenservice namespace.

        6. Apply the zen-restore-job.yaml file

           oc apply -f zen-restore-job.yaml
          
        7. Wait for the job to complete and check the logs of the zen4-restore-job pod to verify that restore is completed.

        8. Repeat the procedure for each namespace with a zenservice instance installed.

    5. Wait for the zenservice instances to become ready. The instance is ready when the Progress field is 100%. The following command continuously outputs the progress percentage of all the zenservices on the cluster.

       oc get zenservice -A -w -o yaml | grep Progress:
      

    Troubleshooting:

    • Make sure that there is only one zen-backup or zen4-restore-job pod in a namespace at any given time as they compete for the same PVC.
    • If the zen4-restore-job pod is stuck in ContainerCreating:
      1. Delete the deployment zen-backup
      2. Make sure the zen-backup pod is fully deleted (not Terminating)
      3. Delete the zen4-restore-job job and its pod (not Terminating)
      4. Ensure that the zen4-br-configmap configmap, zen-backup-pvc pvc, zen4-backup-role role, zen4-backup-rolebinding rolebinding, and zen4-backup-sa service account are present in the namespace
      5. Reapply the zen4-restore-job yaml
    • If the configmap zen4-br-configmap is not present, you can download it with the following command:

        wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-br-scripts.yaml
      

      Make sure to edit the namespace field before applying with the following command:

        oc apply -f zen4-br-scripts.yaml
      
    • Velero restore is less predictable than backup when restoring databases. There is no harm in deleting a velero restore object (that is, restore-cs-db-data, restore-mongo-data, or restore-zen-data), deleting the accompanying deployment and pvc, waiting for these items to be fully deleted, and then re-creating the velero restore object to try again. If this still does not work, the secondary instructions that use the cs-db-restore-job.yaml file or the mongo script, and the zen4 restore job, can be used on an individual namespace basis. There is no harm in running the restore in a namespace that has already been restored.
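
The presence check for the supporting Zen v4 resources in the troubleshooting list can be scripted. The following is a minimal sketch; missing_zen4_resources is a hypothetical helper, and the resource names are taken from the list above.

```shell
# Sketch only: missing_zen4_resources is a hypothetical helper.
# Prints kind/name for each supporting resource that is absent from the
# namespace. Prints nothing when all of them are present.
missing_zen4_resources() {
  local namespace="$1" kind name
  while read -r kind name; do
    # Keep oc away from the here-document that feeds the loop.
    oc get "$kind" "$name" -n "$namespace" >/dev/null 2>&1 </dev/null ||
      echo "$kind/$name"
  done <<'EOF'
configmap zen4-br-configmap
pvc zen-backup-pvc
role zen4-backup-role
rolebinding zen4-backup-rolebinding
serviceaccount zen4-backup-sa
EOF
}
```

For example, missing_zen4_resources <zenservice namespace> lists what must be re-created before the zen4-restore-job is reapplied.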

  20. Restore Zen data v5.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

       vi restore-zen5-data.yaml
      
    2. Restore the Zen data.

       oc apply -f restore-zen5-data.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Check logs of the velero restore to verify that the data was restored

       velero restore logs restore-zen5-data
      
      • Search for restore_zen5 to find relevant logs. If it is not present, the restore did not run. If the logs or the data indicate that the restore was not successful, the following steps can be taken as a workaround:

        1. Get the Zen 5 restore job resource.

           wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/zen/zen5-restore-job.yaml
          
        2. Delete the existing zen5-backup deployment

           oc delete deploy zen5-backup -n <namespace>
          
        3. Wait for the zen5-backup pods to fully delete (fully gone, not Terminating)

        4. Give the zen5 backup necessary permissions if the necessary ServiceAccount, Role, and RoleBinding are not already present.

          • Check if permissions exist:

              oc get sa -n <zenservice namespace> | grep zen5
              oc get role -n <zenservice namespace> | grep zen5
              oc get rolebinding -n <zenservice namespace> | grep zen5
            
          • Get the zen5-sa.yaml, zen5-role.yaml, and zen5-rolebinding.yaml files.

              wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-sa.yaml
              wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-role.yaml
              wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-rolebinding.yaml
            
          • For each namespace with a zenservice to restore, edit the service account file zen5-sa.yaml to deploy in the corresponding namespace.

              oc apply -f zen5-sa.yaml
            
          • Once per zenservice namespace, apply the zen5-role.yaml file to create the Role for the zen backup. Replace the <zenservice namespace> value before applying.

              oc apply -f zen5-role.yaml
            
          • Create the RoleBinding to connect the ServiceAccounts to the Role.

            1. Edit the zen5-rolebinding.yaml file to add each ServiceAccount created earlier. Replace the <zenservice namespace> value before applying.

               vi zen5-rolebinding.yaml
              
            2. Apply the zen5-rolebinding.yaml file

               oc apply -f zen5-rolebinding.yaml
              
        5. Edit the zen5-restore-job.yaml file. The default namespace is set to zen. The parameters for the underlying restore_zen5.sh are defaulted to the zen namespace and test-zen zenservice name. Update both of these parameters to reflect the proper namespace and zenservice respectively.

        6. Apply the zen5-restore-job.yaml file

           oc apply -f zen5-restore-job.yaml
          
        7. Wait for the job to complete, then check the logs of the zen5-restore-job pod to verify restore completed.

        8. Repeat as needed for each namespace with a zenservice instance installed.

    5. Wait for the zenservice instances to become ready. When the Progress field is 100%, the instance is ready. The following command continuously outputs the progress percentage of all the zenservices on the cluster.

       oc get zenservice -A -w -o yaml | grep Progress:
      

    Troubleshooting:

    • Make sure that there is only one zen5-backup or one zen5-restore-job pod in a namespace at any given time as they compete for the same PVC.
    • If the zen5-restore-job pod is stuck in ContainerCreating:
      1. Delete the deployment zen5-backup
      2. Make sure the zen5-backup pod is fully deleted (not Terminating)
      3. Delete the zen5-restore-job job and its pod (not Terminating)
      4. Ensure that the zen5-br-configmap configmap, zen5-backup-pvc pvc, zen5-backup-role role, zen5-backup-rolebinding rolebinding, and zen5-backup-sa service account are present in the namespace
      5. Reapply the zen5-restore-job yaml
    • If the configmap zen5-br-configmap is not present, you can download it with the following command:

        wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-br-scripts-cm.yaml
      

      Make sure to edit the namespace field before applying with the following command:

        oc apply -f zen5-br-scripts-cm.yaml
      
    • Velero restore is less predictable than backup when restoring databases. There is no harm in deleting a velero restore object (that is, restore-cs-db-data, restore-mongo-data, or restore-zen5-data), deleting the accompanying deployment and pvc, waiting for these items to be fully deleted, and then re-creating the velero restore object to try again. If this still does not work, the secondary instructions that use the cs-db-restore-job.yaml file or the mongo script, and the zen5 restore job, can be used on an individual namespace basis. There is no harm in running the restore in a namespace that has already been restored.

  21. If you use a custom route for the restored zenservice and you are restoring to a new cluster, update the value of zenCustomRoute in the zenservice CR to reflect the new hostname and re-trigger the iam-config job. Run the following commands:

    oc -n <zenservice namespace> patch zenservice <zenservice name> --type='merge' -p '{"spec":{"zenCustomRoute":{"route_host":"<updated route>"}}}'
    oc -n <zenservice namespace> patch zenservice <zenservice name> --type='merge' -p '{"spec":{"reconcile":true}}'
    oc get job -n <zenservice namespace> iam-config-job -o json | jq 'del(.spec.selector)' | jq 'del(.spec.template.metadata.labels)' | oc replace --force -f -
    

All restoration tasks are completed.

Verify whether foundational services are properly restored.

For more information about backing up and restoring Identity Management (IM) components, see Identity management backup and restore.

For migrating existing OIDC and SAML configurations, see Migrating identity management.

General Troubleshooting:

If a restore process is stuck in the New phase when you view it with velero restore get, restart the velero pod in the namespace where OADP is installed. After the velero pod restarts, the status of the restore process should change to InProgress.
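
To see which restores are affected before you restart the velero pod, you can list the restores that are still in the New phase. The following is a minimal sketch; restores_in_new_phase is a hypothetical helper, and the default namespace velero is an assumption based on the namespace used earlier in this procedure.

```shell
# Sketch only: restores_in_new_phase is a hypothetical helper.
# Prints the name of every Velero Restore whose phase is New.
restores_in_new_phase() {
  local namespace="${1:-velero}"
  oc get restores.velero.io -n "$namespace" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' |
    awk -F '\t' '$2 == "New" { print $1 }'
}
```

If the function prints any names, restart the velero pod as described above and run it again to confirm that the list is empty.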