IBM Cloud Pak foundational services backup and restore for multiple instances of foundational services

You can schedule backup and restore of foundational services by using the Red Hat OpenShift API for Data Protection (OADP) operator. Make sure that you use the stable-1.3 channel of the OADP operator.

Prerequisites

Note: If the cluster that you back up or restore to uses the s390x architecture, run all velero CLI commands from a different cluster that does not use s390x and that has oc access to the original cluster (usually by using oc login). The Velero CLI does not yet support s390x.

Backing up foundational services

Complete the following steps to back up the installed foundational services.

Create the backup resources

You need the following resources for completing the backup procedures.

  1. Log in to your OpenShift cluster command-line interface (CLI) by using the oc login command.

  2. Create a namespace for Velero objects. The following example creates the velero namespace. For more information about Velero, see the Velero documentation.

     oc new-project velero
    
  3. Install the Red Hat OADP operator in the velero namespace. For more information, see About installing OADP.

  4. Create a secret named cloud-credentials with the access key id and secret access key credentials.

    1. Open any editor and place the following credentials in a file named credentials-velero.

       vi credentials-velero
      
    2. Insert the following content in the file:

       [default]
       aws_access_key_id=<access_key_id>
       aws_secret_access_key=<secret_access_key>
      
    3. Create the secret.

      oc create secret generic cloud-credentials -n velero --from-file cloud=credentials-velero
      
  5. From your OpenShift cluster console OperatorHub page, install the OADP operator from the stable-1.3 channel, which provides the Velero 1.12 API. The API is needed for foundational services backup and restore. For more information, see the OpenShift Container Platform documentation.

  6. Create a DataProtectionApplication object.

    Note: The provider is aws even if you are not using AWS Object Storage.

     apiVersion: oadp.openshift.io/v1alpha1
     kind: DataProtectionApplication
     metadata:
       name: <resource_name>
       namespace: velero
       annotations:
         argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true
         argocd.argoproj.io/sync-wave: '20'
     spec:
       backupLocations:
         - velero:
             config:
               profile: default
               region: <bucket_region>
               s3ForcePathStyle: 'true'
               s3Url: <s3_URL>
             credential:
               key: cloud
               name: cloud-credentials
             default: true
             objectStorage:
               bucket: <bucket_name>
               prefix: <root_directory_name>
             provider: aws
       configuration:
         restic:
           enable: true
         velero:
           defaultPlugins:
             - openshift
             - aws
           podConfig:
             resourceAllocations:
               limits:
                 cpu: '1'
                 memory: 1Gi
               requests:
                 cpu: 500m
                 memory: 512Mi
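
Before you schedule backups, it can help to confirm that the backup storage location created by the DataProtectionApplication object is usable. The following is a minimal sketch, not part of the official procedure. It assumes that each BackupStorageLocation resource reports its state in .status.phase and that Available is the healthy value. The function names bsl_phases and all_available are hypothetical.

```shell
#!/bin/sh
# bsl_phases [NAMESPACE]
# Prints the phase of every BackupStorageLocation, one per line.
# Requires oc access to the cluster; the namespace defaults to velero.
bsl_phases() {
  oc get backupstoragelocations.velero.io -n "${1:-velero}" \
    -o jsonpath='{range .items[*]}{.status.phase}{"\n"}{end}'
}

# all_available
# Reads one phase per line on stdin. Succeeds only if at least one
# phase was read and every phase is "Available".
all_available() {
  count=0
  while IFS= read -r phase || [ -n "$phase" ]; do
    [ -z "$phase" ] && continue
    count=$((count + 1))
    [ "$phase" = "Available" ] || return 1
  done
  [ "$count" -gt 0 ]
}
```

For example, bsl_phases velero | all_available && echo "storage location is ready" prints the message only when every location is available.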
    

Add labels to resources

You can add labels to resources automatically by running the script or by manually adding labels. Complete one of the following procedures.

Labelling the resources automatically by running the script

  1. Run the following commands to download the env.properties file and the label-common-service.sh script, and save them in the same folder.

     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/backup/common-service/label-common-service.sh
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/backup/common-service/env.properties
    
  2. Open the env.properties file, edit the required variables and save the changes.

    Note: The OPERATOR_NS="" variable must be properly set for the script to work. Other variables have default values. You can change these values to fit your environment.

     vi env.properties
    

    The env.properties file contains the following variables:

     # Change the following values to fit your environment.
     OPERATOR_NS="" # Set the parameter to the namespace where the foundational services operator is installed.
    
     # Pass the namespace where foundational services are installed.
     # Leave the value of this parameter empty if the services are installed in the same namespace as the foundational services operator.
     SERVICES_NS="" 
    
     # Pass the control namespace if it needs to be backed up.
     CONTROL_NS=""
    
     # Change to the namespace where cert-manager, License Service and License Service Reporter are installed if they are installed in custom namespaces.
     CERT_MANAGER_NAMESPACE="ibm-cert-manager" 
     LICENSING_NAMESPACE="ibm-licensing"
     LSR_NAMESPACE="ibm-lsr"
    
     # Change to 1 to enable the private catalog if required.
     ENABLE_PRIVATE_CATALOG=0
    
     # Add additional CatalogSources without the ".spec.publisher: IBM" parameter. Separate the CatalogSources with a comma.
     # For example: "my-catalog,my-catalog2,my-catalog3"
     ADDITIONAL_SOURCES=""
    
  3. Use the following command to run the label-common-service.sh script.

     ./label-common-service.sh
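
Because the script depends on OPERATOR_NS being set, a quick check before you run it can catch an empty value. The following is a minimal sketch under the assumption that env.properties contains only shell-style variable assignments, as in the listing above. The function name operator_ns_is_set is hypothetical.

```shell
#!/bin/sh
# operator_ns_is_set FILE
# Succeeds only if FILE assigns a non-empty value to OPERATOR_NS.
# The file is sourced in a subshell so that its variables do not leak
# into the calling shell.
operator_ns_is_set() {
  (
    # The "." builtin searches PATH for names without a slash,
    # so make a bare file name explicitly relative.
    case "$1" in
      */*) f=$1 ;;
      *)   f=./$1 ;;
    esac
    [ -r "$f" ] || exit 1
    OPERATOR_NS=""
    . "$f" || exit 1
    [ -n "$OPERATOR_NS" ]
  )
}
```

For example, operator_ns_is_set env.properties && ./label-common-service.sh runs the labelling script only when the variable has a value.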
    

Manually adding labels to resources

Before you begin, set the namespace where you installed foundational services as the default namespace.

oc project <namespace-where-foundational services-are-installed>

You must label the resources that are currently installed to identify them during restoration.

Back up common-service-db

Note: This step applies to foundational services version 4.6 and newer.

  1. Get the common-service-db backup resources.

     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-backup-deployment.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-backup-pvc.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-br-script-cm.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-role.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-rolebinding.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-sa.yaml
    
  2. Update the backup files.

    • Replace <cs-db namespace> with the namespace where common-service-db instance is running.
    • Replace the <storage class> with the storage class that the current IM deployment uses.
  3. Add the PVC to the cluster.

     oc apply -f cs-db-backup-pvc.yaml
    
  4. Add the cs-db-br-script-cm.yaml file to the correct namespace.

     oc apply -f cs-db-br-script-cm.yaml
    
  5. Give the common-service-db backup the necessary permissions.

     oc apply -f cs-db-sa.yaml
    
     oc apply -f cs-db-role.yaml
    
     oc apply -f cs-db-rolebinding.yaml
    
  6. Add the deployment to the cluster.

     oc apply -f cs-db-backup-deployment.yaml
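
The placeholder replacement in step 2 can be done by hand in an editor or scripted. The following is a minimal sketch that rewrites the literal placeholders <cs-db namespace> and <storage class> in place. The function name fill_cs_db_placeholders is hypothetical, and the sketch assumes that neither value contains the characters |, &, or \ (valid Kubernetes names do not).

```shell
#!/bin/sh
# fill_cs_db_placeholders NAMESPACE STORAGE_CLASS FILE...
# Replaces the literal placeholders "<cs-db namespace>" and
# "<storage class>" in each file, rewriting the file in place.
fill_cs_db_placeholders() {
  ns=$1
  sc=$2
  shift 2
  for f in "$@"; do
    # Write to a temporary file first so that a failed sed run
    # does not truncate the original.
    sed -e "s|<cs-db namespace>|$ns|g" \
        -e "s|<storage class>|$sc|g" "$f" > "$f.tmp" &&
      mv "$f.tmp" "$f" || return 1
  done
}
```

For example, fill_cs_db_placeholders my-cs-namespace my-storage-class cs-db-*.yaml updates all downloaded files at once; both argument values here are illustrative.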
    

Back up Zen

  1. Locate zenservice instances.

     oc get zenservice -A
    
  2. Label each zenservice.

     oc label zenservice <zenservice name> foundationservices.cloudpak.ibm.com=zen --overwrite=true -n <namespace>
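
When a cluster has several zenservice instances, the two steps above can be combined. The following is a minimal sketch that generates one oc label command per instance so that the commands can be reviewed before they run. The function names list_zenservices and zen_label_commands are hypothetical; the label and flags are the ones shown in step 2.

```shell
#!/bin/sh
# list_zenservices
# Prints "<namespace> <name>" for every zenservice on the cluster.
# Requires oc access to the cluster.
list_zenservices() {
  oc get zenservice -A \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{"\n"}{end}'
}

# zen_label_commands
# Reads "<namespace> <name>" pairs on stdin, one per line, and prints
# the matching oc label command for each pair. Incomplete lines are skipped.
zen_label_commands() {
  while read -r ns name || [ -n "$ns" ]; do
    [ -n "$ns" ] && [ -n "$name" ] || continue
    printf 'oc label zenservice %s foundationservices.cloudpak.ibm.com=zen --overwrite=true -n %s\n' \
      "$name" "$ns"
  done
}
```

Running list_zenservices | zen_label_commands prints the commands; appending | sh runs them.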
    

Back up Zen MetastoreDB

Note: Repeat this step for each namespace where a zenservice instance is installed.

  1. Get the Zen 5 backup resource.

     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-backup-deployment.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-backup-pvc.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-br-scripts-cm.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-sa.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-role.yaml
     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-rolebinding.yaml
    
  2. Update the backup files.

    In the zen5-backup-pvc.yaml file, replace the following parameters:

    • Replace <zenservice namespace> with the namespace where the zenservice instance is running.
    • Replace <storage class> with either the storage class that the common-service-db deployment uses, or any storage class that has the Retain reclaim policy.

    In the zen5-backup-deployment.yaml file, replace all instances of <zenservice namespace> with the namespace where the zenservice instance is running. There are four instances; two of them are parameters for the Velero backup and restore commands.

    By default, the backup and restore commands (the .spec.template.metadata.annotations.pre.hook.backup.velero.io/command and .spec.template.metadata.annotations.post.hook.restore.velero.io/command annotations) pass <zenservice namespace> as the first parameter to the scripts that they call. Edit the first parameter of both commands to match the namespace where the deployment is created.

    By default, the restore command (the .spec.template.metadata.annotations.post.hook.restore.velero.io/command annotation) runs against a zenservice named <zenservice name>. Update the second parameter to match the name of the zenservice in the target namespace.

    In the zen5-br-scripts-cm.yaml and zen5-sa.yaml files, replace the <zenservice namespace> value with the namespace of each zenservice instance that you back up.

  3. Add the PVC to the cluster.

     oc apply -f zen5-backup-pvc.yaml
    
  4. Add the zen5-br-scripts-cm.yaml file to the correct namespace.

     oc apply -f zen5-br-scripts-cm.yaml
    
  5. Give the Zen 5 backup the necessary permissions.

    • For each namespace with a zenservice to back up, create a service account. Replace the <zenservice namespace> value before applying.

        oc apply -f zen5-sa.yaml
      
    • Once per zenservice namespace, apply the Role for the zen backup. Replace the <zenservice namespace> value before applying.

        oc apply -f zen5-role.yaml
      
    • Create the RoleBinding to connect the ServiceAccount to the Role.

      1. Edit the zen5-rolebinding.yaml file to add the ServiceAccount created earlier and replace the <zenservice namespace> value.

         vi zen5-rolebinding.yaml
        
      2. Apply the zen5-rolebinding.yaml file

         oc apply -f zen5-rolebinding.yaml
        
  6. Add the deployment to the cluster.

     oc apply -f zen5-backup-deployment.yaml
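
Because the Zen 5 files contain several placeholders, it is easy to apply a file that still has one unreplaced. The following is a minimal sketch of a check that can be run before the oc apply steps. It searches for the three literal placeholders named in this procedure; the function name no_placeholders_left is hypothetical.

```shell
#!/bin/sh
# no_placeholders_left FILE...
# Returns 0 if none of the files contains a known placeholder,
# 1 if at least one placeholder is still present (matching lines are
# printed with file name and line number), and 2 if grep fails.
no_placeholders_left() {
  # /dev/null is added so that grep always prints the file name,
  # even when only one file is given.
  grep -n -F \
    -e '<zenservice namespace>' \
    -e '<zenservice name>' \
    -e '<storage class>' \
    "$@" /dev/null
  case $? in
    0) return 1 ;;
    1) return 0 ;;
    *) return 2 ;;
  esac
}
```

For example, no_placeholders_left zen5-*.yaml lists every line that still needs to be edited.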
    

Create a backup resource

Create a backup resource for the velero namespace.

  1. Get the schedule-common-services.yaml file.

     wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/schedule-common-services.yaml
    
  2. Update the schedule-common-services.yaml file based on your backup requirements. For more information, see Velero Schedule API Type. By default, the backup runs once a day and is deleted 48 hours later.

    The following configurations in the schedule-common-services.yaml file are important:

    • schedule, which is a cron expression. Cron uses the server time, which is usually Coordinated Universal Time (UTC) unless the server is configured otherwise.
    • ttl, which is the time to live for the backup.
    • storageLocation, which is the same storage location that you used when you set up OADP. The command oc get backupstoragelocations.velero.io -n <velero namespace> can be used to get the name.
    • velero, which is the namespace where you installed OADP.
  3. Create the resource.

     oc apply -f schedule-common-services.yaml
    
  4. Verify whether the backup schedule was created.

     velero schedule get
    

    After the first scheduled time passes, you can verify whether the backup ran. Look for a schedule name and timestamp.

     velero backup get
    
  5. Verify whether the backup was successful and check the details to see if all resources are saved.

     velero backup describe <__BACKUP_NAME__> --details
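
As an alternative to reading the velero backup get output by eye, the phase of each Backup resource can be filtered directly. The following is a minimal sketch, assuming that Velero records the result of a backup in .status.phase and that Completed is the successful value. The function names list_backup_phases and incomplete_backups are hypothetical.

```shell
#!/bin/sh
# list_backup_phases [NAMESPACE]
# Prints "<backup name> <phase>" for every Backup resource.
# Requires oc access to the cluster; the namespace defaults to velero.
list_backup_phases() {
  oc get backups.velero.io -n "${1:-velero}" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.phase}{"\n"}{end}'
}

# incomplete_backups
# Reads "<backup name> <phase>" pairs on stdin and prints only the
# pairs whose phase is anything other than "Completed".
incomplete_backups() {
  awk 'NF >= 1 && $2 != "Completed" { print $1, $2 }'
}
```

Running list_backup_phases velero | incomplete_backups prints nothing when every backup completed. Any backup that is listed can then be inspected with velero backup describe.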
    

Restoring foundational services

Complete the following steps to restore foundational services.

Before you restore foundational services, set up Velero on the new cluster. Follow the instructions in the Create the backup resources section.

For troubleshooting issues that may arise during restore, see IBM Cloud Pak foundational services Installation Troubleshooting.

Download the necessary files for restoring different resources:

wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-namespace.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-entitlementkey.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-pull-secret.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-catalog.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-operatorgroup.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-configmap.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-crd.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-commonservice.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-subscriptions.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-licensing.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-cert-manager.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-operands.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-cs-db.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-zen5-data.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-singleton-subscriptions.yaml
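
Most restore steps in this procedure repeat one pattern: substitute __BACKUP_NAME__ in a restore file, apply the file, and wait until the restore shows Completed. The following is a minimal sketch of two helpers for that pattern, not a replacement for the individual steps. It assumes that the Restore resource reports its state in .status.phase and treats Failed, PartiallyFailed, and FailedValidation as final failure states. The function names set_backup_name and wait_for_restore are hypothetical.

```shell
#!/bin/sh
# set_backup_name BACKUP_NAME FILE...
# Replaces every literal __BACKUP_NAME__ in the given files, in place.
# Assumes that the backup name contains no "|", "&", or "\".
set_backup_name() {
  backup=$1
  shift
  for f in "$@"; do
    sed "s|__BACKUP_NAME__|$backup|g" "$f" > "$f.tmp" &&
      mv "$f.tmp" "$f" || return 1
  done
}

# wait_for_restore RESTORE_NAME [NAMESPACE]
# Polls the Restore resource every 10 seconds. Returns 0 when the phase
# is Completed and 1 when the phase is a known failure state or when oc
# fails. Any other phase is treated as still in progress.
wait_for_restore() {
  while :; do
    phase=$(oc get restores.velero.io "$1" -n "${2:-velero}" \
      -o jsonpath='{.status.phase}') || return 1
    case $phase in
      Completed) return 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "restore $1 ended in phase $phase" >&2
        return 1 ;;
      *) sleep 10 ;;
    esac
  done
}
```

For example, set_backup_name <__BACKUP_NAME__> restore-*.yaml updates all downloaded restore files in one pass, after which each step only needs oc apply followed by wait_for_restore <__RESTORE_NAME__>.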

  1. Restore the foundational services namespaces by using the restore-namespace.yaml file.

    1. Get the name of the Velero backup that you plan to use for restoring.

       velero backup get
      

      Replace __BACKUP_NAME__ in the following commands with the Velero backup name.

      Verify whether the backup was successful and check the details to see if all resources are saved.

       velero backup describe <__BACKUP_NAME__> --details
      
    2. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

       vi restore-namespace.yaml
      
    3. Restore the namespaces.

       oc apply -f restore-namespace.yaml
      

      You can check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the namespaces are restored. Your namespaces must be listed in the command output.

       oc get namespace
      

      Proceed with the next step after the namespaces are restored.

    5. Change the default project to the restored common service namespace.

       oc project <namespace-where-foundational services-are-installed>
      
  2. Restore the entitlement key.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

      vi restore-entitlementkey.yaml
      
    2. Restore the entitlement key.

      oc apply -f restore-entitlementkey.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the entitlement key is restored.

       oc get secret
      
  3. Restore the pull secret.

    1. Save the current pull secret.

       oc get secret pull-secret -n openshift-config -o yaml > original-pull-secret.yaml
      
    2. Delete the current pull secret from the openshift-config namespace.

      oc delete secret pull-secret -n openshift-config
      
    3. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

      vi restore-pull-secret.yaml
      
    4. Restore the pull secret.

       oc apply -f restore-pull-secret.yaml
      
    5. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    6. Verify whether the pull secret is restored.

       oc get secret -n openshift-config | grep pull
      
  4. Restore the catalog.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

      vi restore-catalog.yaml
      
    2. Restore the catalog.

      oc apply -f restore-catalog.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the catalog source is restored.

       oc get catalogsource -n openshift-marketplace | grep ibm
      
    5. Verify whether the ibm-operator-catalog pod is running.

       oc get pod -n openshift-marketplace -w
      

      Note: If using IBM Cert Manager, IBM Licensing, or cloud-native-postgresql-catalog catalog source, verify that their pods are Running as well.

      If the pods are running, proceed with the next step.

  5. Restore the operator groups.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

      vi restore-operatorgroup.yaml
      
    2. Restore the operator groups.

      oc apply -f restore-operatorgroup.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the operator groups are restored.

       oc get operatorgroup -A
      
  6. Restore the common-service-maps configmap.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

      vi restore-configmap.yaml
      
    2. Restore the configmap.

      oc apply -f restore-configmap.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the configmap is restored.

       oc get configmap common-service-maps -n kube-public
      
  7. Restore the commonservices.operator.ibm.com customresourcedefinition (CRD).

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

      vi restore-crd.yaml
      
    2. Restore the CRD.

      oc apply -f restore-crd.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the CRD is restored.

       oc get customresourcedefinition | grep commonservices.operator.ibm.com
      
  8. Restore the CommonService CRs.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

       vi restore-commonservice.yaml
      
    2. Restore the CRs.

       oc apply -f restore-commonservice.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the CommonService CRs are restored.

       oc get commonservice -A
      

      If a CommonService CR is not restored, delete the restore resource and apply it again:

      1. Delete the resource.

         oc delete -f restore-commonservice.yaml
        
      2. Restore the CR.

         oc apply -f restore-commonservice.yaml
        

        Wait for 30 seconds and check again for the CommonService resource.

  9. Restore the singleton subscriptions.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

       vi restore-singleton-subscriptions.yaml
      
    2. Restore the subscriptions.

       oc apply -f restore-singleton-subscriptions.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Watch the namespaces where Cert Manager and License Service are deployed for the Cert Manager and Licensing operators to be running. By default Cert Manager and License Service are deployed in ibm-cert-manager and ibm-licensing namespaces.

       oc get pod -n <cs namespace> -w
      
  10. Restore cert manager resources.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

       vi restore-cert-manager.yaml
      
    2. Restore the cert manager resource.

       oc apply -f restore-cert-manager.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the certificates are restored.

       oc get certificates
      
  11. Restore the subscriptions.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

      vi restore-subscriptions.yaml
      
    2. Restore the subscriptions.

      oc apply -f restore-subscriptions.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Watch the foundational services namespace for the operand-deployment-lifecycle-manager to be running:

       oc get pod -n <cs namespace> -w
      

    See the following notes:

    • If you do not use IBM Cert Manager, the IBM Common Service Operator deployment fails unless a third-party Cert Manager is already installed on the cluster.
    • If you use separation of duties (SOD), the ibm-common-service-operator is unlikely to become ready after you restore the subscriptions, and therefore does not deploy ODLM. This is expected and resolves after you complete the next step.

    Troubleshooting: In case of issues with generating new installation plans for updates or new installations, see OLM is unable to generate new install plans.

  12. Run setup_tenant.sh to set up cluster topology.

    1. Get the setup_tenant.sh and utils.sh scripts by running the following commands:

       wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/cp3pt0-deployment/setup_tenant.sh
       mkdir common && cd common
       wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/cp3pt0-deployment/common/utils.sh
       cd ..
      

      Note: This script needs to be run for each instance of foundational services. Each instance should have different namespace values from each other instance (that is, no namespace should be used in two different executions).

    2. Run the following commands to make the scripts executable:

       chmod +x setup_tenant.sh
       chmod +x common/utils.sh
      
    3. Gather the values that the script needs. For the operator and services namespaces:

       oc get commonservice common-service -o yaml
      

      Locate values .spec.operatorNamespace and .spec.servicesNamespace

      Note: These values will always match unless using SOD.

      Size:

       oc get commonservice common-service -o yaml
      

      Locate value .spec.size

      Tethered namespaces:

       oc get cm common-service-maps -o yaml -n kube-public
      

      Make note of the namespaces under requested-from-namespace. If this configmap does not exist, this value consists of whichever namespaces are going to use this common service instance.

    4. Run the script.

      Note: If services and operator namespace are the same, you must still specify both parameters when running setup_tenant.sh. In this case, use the same namespace for each parameter. Optional parameters -s and -n can be used if either using a different catalog source than opencloud-operators or if the catalog source is in a different namespace respectively. If everything is deployed to the same namespace (CS operators, CS operands, and Cloud Pak workload), you do not need to use the setup_tenant.sh script and can move on to the next step.

       ./setup_tenant.sh --operator-namespace <operator namespace> --services-namespace <services namespace> --tethered-namespaces <comma delimited (no spaces) list of Cloud Pak workload namespaces that use this foundational services instance> --license-accept -c v<foundational services version number in use i.e. 4.0, 4.1, 4.2, etc> -p <.spec.size value from `CommonService` cr> -i <install mode, either Manual or Automatic>
      
    5. Wait for the script to complete successfully. For more information, see Installing foundational services by using a script.

  13. Restore Licensing service configmap.

    1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

       vi restore-licensing.yaml
      
    2. Restore the configmap.

       oc apply -f restore-licensing.yaml
      
    3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    4. Verify whether the configmap is restored.

       oc get configmap | grep licensing
      
  14. If you use IBM License Service Reporter, see Backing up the License Service Reporter instance.

  15. Restore the OperandRequests and OperandConfigs.

    1. If you are restoring an OperandConfig, delete the existing one first.

       oc delete operandconfig common-service
      
    2. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

       vi restore-operands.yaml
      
    3. Restore the operands.

       oc apply -f restore-operands.yaml
      
    4. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

       velero restore get
      
       velero restore describe <__RESTORE_NAME__> --details
      
    5. Verify whether the operands are restored.

       oc get operandrequest
      
       oc get operandconfig
      
    6. Verify whether operand requests are reconciled.

      Note: ODLM needs some time to reconcile the restored operand requests, but new operators and their operands should deploy shortly after the restore completes. Check the status fields of the operand requests and the ODLM logs for any issues.

    7. If using a custom hostname, TLS secret, or both, wait for the platform-identity pods to come ready:

      • Verify that the cs-onprem-tenant-config configmap is present:

          oc get cm -n <namespace where hostname is changed or custom TLS secret used> | grep cs-onprem-tenant-config
        
      • Wait for the platform-identity-management, platform-identity-provider, and platform-auth-service pods to come ready in the same namespace.

      • Make sure to update the custom hostname to reflect a change in cluster if necessary. For example, the structure of the route is <route name>.cluster1.com. If you are no longer on cluster1 but now on cluster2, the route needs to be updated from <route name>.cluster1.com to <route name>.cluster2.com.
      • If using a custom TLS secret, it is best to re-create this secret on the new cluster by using the same name. In this case, if the secret was carried over to the new cluster, it would need to be replaced.
      • Follow the instructions here https://www.ibm.com/docs/en/cloud-paks/foundational-services/4.3?topic=cc-updating-custom-hostname-tls-secret-by-using-configmap.
  16. Restore common-service-db

    1. Get the restore object

      wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-cs-db.yaml
      
    2. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created previously.

      vi restore-cs-db.yaml
      
    3. Restore the cs-db data.

      oc apply -f restore-cs-db.yaml
      
    4. Check restore progress. Proceed with the next step after restore is complete.

      velero restore get
      
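    5. Check the logs of the Velero restore to verify that the data was restored. Replace <__RESTORE_NAME__> with the name of the common-service-db restore that velero restore get lists.

      velero restore logs <__RESTORE_NAME__>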
      

    Troubleshooting: If the logs or the data indicate that the restore was not successful, apply the following workaround:

    1. Get the restore job:

        wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/common-service-db/cs-db-restore-job.yaml
      
    2. Replace <cs-db namespace> with the namespace where common-service-db instance is running.

    3. Delete the existing cs-db-backup deployment and cs-db-backup pod.

        oc delete deploy cs-db-backup -n <namespace>
      
    4. Run the restore job.

        oc apply -f cs-db-restore-job.yaml
      
  17. Restore Zen and Zen data.

    • Restore zenservice instances

      1. Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

         vi restore-zen.yaml
        
      2. Restore the zenservice instances.

         oc apply -f restore-zen.yaml
        
      3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

         velero restore get
        
         velero restore describe <__RESTORE_NAME__> --details
        
      4. Wait for the zenservice instances to come ready. Once the Progress field is 100%, the instance is ready. The following command will continuously output the percentage of all the zenservices on the cluster.

         oc get zenservice -A -w -o yaml | grep Progress:
        

        Note: If the restored zenservice contains fields to configure zenCustomRoute, do the following:

        • Verify that the secret that is used (if the field exists) is present in the zenservice namespace in the target cluster.
        • Update the value in the zenservice CR for the route. For example, if the route is <route name>.cluster1.com and you restored to cluster2, change the route to <route name>.cluster2.com.
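The readiness check for the zenservice instances can be turned into a yes/no test for use in a script. The following sketch parses the same Progress: lines that the documented command prints, without the -w flag so that the command returns. The function name zen_all_ready is hypothetical, and the sketch assumes that each line ends with a value such as 100%.

```shell
# Read "Progress: <n>%" lines on stdin and succeed only if at least one
# line was seen and every one of them reports 100%.
zen_all_ready() {
  awk '
    /Progress:/ {
      seen++
      v = $NF
      gsub(/"/, "", v)          # tolerate a quoted value
      if (v != "100%") pending++
    }
    END { exit (seen > 0 && pending == 0) ? 0 : 1 }
  '
}

# Example usage (one-shot, no -w):
#   oc get zenservice -A -o yaml | grep Progress: | zen_all_ready && echo "all ready"
```

An empty input counts as not ready, so a cluster with no zenservice output does not pass the check by accident.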
    • Restore zen data.

      1. In restore-zen5-data.yaml, replace __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.

         vi restore-zen5-data.yaml
        
      2. Restore the Zen data.

         oc apply -f restore-zen5-data.yaml
        
      3. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

         velero restore get
        
         velero restore describe <__RESTORE_NAME__> --details
        
      4. Check the logs of the Velero restore to verify that the data was restored.

         velero restore logs restore-zen5-data

        • Search the logs for restore_zen5. If the string is not present, the restore did not run. If the logs or the data indicate that the restore was not successful, use the following workaround:

          1. Get the Zen 5 restore job resource.

             wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/zen/zen5-restore-job.yaml
            
          2. Delete the existing zen5-backup deployment.

             oc delete deploy zen5-backup -n <namespace>

          3. Wait until the zen5-backup pods are fully deleted (gone, not in the Terminating state).

          4. Give the zen5 backup necessary permissions if the necessary ServiceAccount, Role, and RoleBinding are not already present.

            • Check if permissions exist:

                oc get sa -n <zenservice namespace> | grep zen5
                oc get role -n <zenservice namespace> | grep zen5
                oc get rolebinding -n <zenservice namespace> | grep zen5

            • Get the zen5-sa.yaml, zen5-role.yaml, and zen5-rolebinding.yaml files.

                wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-sa.yaml
                wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-role.yaml
                wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-rolebinding.yaml
              
            • For each namespace that contains a zenservice to back up, edit the service account file zen5-sa.yaml so that the ServiceAccount is deployed in that namespace, and then apply the file.

                oc apply -f zen5-sa.yaml
              
            • Once per zenservice namespace, apply the zen5-role.yaml file to create the Role for the zen backup. Replace the <zenservice namespace> value before applying.

                oc apply -f zen5-role.yaml
              
            • Create the RoleBinding to connect the ServiceAccounts to the Role.

              1. Edit the zen5-rolebinding.yaml file to add each ServiceAccount created earlier. Replace the <zenservice namespace> value before applying.

                 vi zen5-rolebinding.yaml
                
              2. Apply the zen5-rolebinding.yaml file.

                 oc apply -f zen5-rolebinding.yaml
                
          5. Edit the zen5-restore-job.yaml file. The default namespace is set to zen. The parameters for the underlying restore_zen5.sh script default to the zen namespace and the test-zen zenservice name. Update both parameters to reflect the correct namespace and zenservice name.

          6. Apply the zen5-restore-job.yaml file.

             oc apply -f zen5-restore-job.yaml

          7. Wait for the job to complete, then check the logs of the zen5-restore-job pod to verify that the restore completed.

          8. Repeat as needed for each namespace with a zenservice instance installed.
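The wait in step 3 of the workaround (zen5-backup pods fully gone, not Terminating) can be scripted. The following sketch polls the pod list and matches pods by name prefix. Pods in the Terminating state are still listed by oc get pods, so they count as present. The function name wait_for_pods_gone, the 5-second poll interval, and the default timeout are assumptions.

```shell
# Wait until no pod whose name starts with <prefix> exists in <namespace>.
# Usage: wait_for_pods_gone <prefix> <namespace> [timeout-seconds]
wait_for_pods_gone() {
  local prefix ns timeout waited pods
  prefix="$1"
  ns="$2"
  timeout="${3:-300}"
  waited=0
  while true; do
    # Abort with code 2 if the pod list cannot be read at all.
    pods="$(oc get pods -n "$ns" -o name)" || return 2
    # Output lines look like "pod/<name>"; no match means the pods are gone.
    echo "$pods" | grep -q "^pod/${prefix}" || return 0
    if [ "$waited" -ge "$timeout" ]; then
      echo "pods matching ${prefix} are still present in ${ns}" >&2
      return 1
    fi
    sleep 5
    waited=$((waited + 5))
  done
}
```

For example, run wait_for_pods_gone zen5-backup <namespace> after deleting the zen5-backup deployment and before applying zen5-restore-job.yaml.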

      5. Wait for the zenservice instances to become ready. An instance is ready when its Progress field shows 100%. The following command continuously prints the progress percentage of every zenservice on the cluster.

         oc get zenservice -A -w -o yaml | grep Progress:
        

        See the following troubleshooting tips:

        • Make sure that only one zen5-backup pod or one zen5-restore-job pod exists in a namespace at any given time, because they compete for the same PVC.

        • If the zen5-restore-job pod is stuck in ContainerCreating:
          1. Delete the zen5-backup deployment.
          2. Make sure that the zen5-backup pod is fully deleted (not Terminating).
          3. Delete the zen5-restore-job job and make sure that its pod is fully deleted (not Terminating).
          4. Make sure that the zen5-br-configmap configmap, the zen5-backup-pvc PVC, the zen5-backup-role role, the zen5-backup-rolebinding rolebinding, and the zen5-backup-sa service account are present in the namespace.
          5. Reapply the zen5-restore-job.yaml file.

        • If the zen5-br-configmap configmap is not present, download it:

            wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-br-scripts-cm.yaml

          Edit the namespace field in the file, and then apply it:

            oc apply -f zen5-br-scripts-cm.yaml

        • Velero restore is less predictable than backup when restoring databases. It is safe to delete a Velero restore object (that is, restore-cs-db-data or restore-zen5-data), delete the accompanying deployment and PVC, wait until they are fully deleted, and then re-create the Velero restore object to try again. If the restore still fails, use the workaround instructions for cs-db-restore-job.yaml and zen5-restore-job.yaml on an individual namespace basis. It is also safe to run the restore in a namespace that was already restored.
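The retry sequence for the Zen data restore can be sketched as a single function. The resource names (restore-zen5-data, zen5-backup, zen5-backup-pvc) come from this procedure. The function name retry_zen5_restore, the default velero namespace, and the expectation that restore-zen5-data.yaml is in the current directory are assumptions. By default, oc delete returns only after the object is removed, so each step finishes before the next one starts.

```shell
# Delete the Velero restore object, the backup deployment, and its PVC,
# then re-create the restore object to try again.
# Usage: retry_zen5_restore <zenservice namespace> [velero namespace]
retry_zen5_restore() {
  local ns velero_ns
  ns="$1"
  velero_ns="${2:-velero}"
  # --ignore-not-found keeps the sequence going if an object is already gone.
  oc delete restore.velero.io restore-zen5-data -n "$velero_ns" --ignore-not-found || return 1
  oc delete deploy zen5-backup -n "$ns" --ignore-not-found || return 1
  oc delete pvc zen5-backup-pvc -n "$ns" --ignore-not-found || return 1
  # Re-create the restore object from the already edited file.
  oc apply -f restore-zen5-data.yaml
}
```

Before re-creating the restore, it is worth confirming that no zen5-backup pod remains in the Terminating state in the namespace.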

  18. If you use a custom route for the restored zenservice and you are restoring to a new cluster, update the value of zenCustomRoute in the zenservice CR to reflect the new hostname and re-trigger the iam-config job. Run the following commands:

    oc -n <zenservice namespace> patch zenservice <zenservice name> --type='merge' -p '{"spec":{"zenCustomRoute":{"route_host":"<updated route>"}}}'
    oc -n <zenservice namespace> patch zenservice <zenservice name> --type='merge' -p '{"spec":{"reconcile":true}}'
    oc get job -n <zenservice namespace> iam-config-job -o json | jq 'del(.spec.selector)' | jq 'del(.spec.template.metadata.labels)' | oc replace --force -f -
    

All restoration tasks are completed.

Verify whether foundational services are properly restored.

For backing up and restoring Identity Management (IM) components, see Identity management backup and restore.

For migrating existing OIDC and SAML configurations, see Migrating identity management.

General Troubleshooting:

If a restore remains in the New phase when you run velero restore get, restart the velero pod in the namespace where OADP is installed. After the velero pod restarts, the status of the restore should change to InProgress.
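To spot restores that have not left the New phase, the name and phase of every Restore object can be listed and filtered. The following sketch reads tab-separated name and phase pairs. The function name list_new_restores is hypothetical, the velero namespace in the usage comment is an assumption, and the sketch treats a restore with no phase set yet the same as New.

```shell
# Read "name<TAB>phase" lines on stdin and print the names of restores
# whose phase is New or not set yet.
list_new_restores() {
  awk -F '\t' '$1 != "" && ($2 == "New" || $2 == "") { print $1 }'
}

# Example usage:
#   oc get restore.velero.io -n velero \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' \
#     | list_new_restores
```

If a name keeps appearing in the output after a few minutes, that restore is a candidate for the velero pod restart described above.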