Creating an offline backup of a Cloud Pak for Data instance and restoring it on a different cluster

You can create an offline backup of an IBM Cloud Pak® for Data instance and restore it on a different cluster with the Cloud Pak for Data OADP backup and restore utility.
A Cloud Pak for Data instance consists of one or more Red Hat® OpenShift® Container Platform projects (Kubernetes namespaces). For example, your Cloud Pak for Data instance can be either:
  • A single project in which the control plane and services are installed
  • A central project in which the control plane is installed and one or more tethered projects

If your deployment consists of multiple instances of Cloud Pak for Data on the same cluster, you can back up and restore each instance separately. You can use this process to re-create the entire deployment on a new cluster, for example to recover from a disaster that destroys the source cluster.

Permissions you need for this task
You must log on as a user with cluster administrator rights.
Best practice: You can run the commands in this task exactly as written if you set up environment variables. For instructions, see Setting up installation environment variables.

Ensure that you source the environment variables before you run the commands in this task.
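
The following is a minimal sketch of a pre-flight check, not part of the product tooling. It assumes that the commands in this task rely on the OCP_URL, PROJECT_CPD_INSTANCE, PROJECT_CPFS_OPS, and PROJECT_CPD_OPS variables that appear later in this topic. The missing_vars function name is hypothetical.

```shell
# Print the name of every listed variable that is unset or empty.
# Returns non-zero if at least one variable is missing.
# The body runs in a subshell so the helper variables stay local.
missing_vars() (
  status=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
      status=1
    fi
  done
  exit "$status"
)

# Example (not run here):
# missing_vars OCP_URL PROJECT_CPD_INSTANCE PROJECT_CPFS_OPS PROJECT_CPD_OPS \
#   || echo "Source the environment variables before you continue." >&2
```

Running the check first avoids commands such as oc login or cpd-cli oadp backup create being invoked with an empty project name.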

To back up a Cloud Pak for Data instance and restore it on a different cluster, complete the following high-level steps:

  1. Back up the instance of Cloud Pak for Data on the source cluster.
  2. Back up the operators on the source cluster.
  3. If Cloud Pak for Data is integrated with the IAM Service, back up the IAM Service on the source cluster.
  4. Set up a new cluster for the restore.
  5. Restore the operators on the new cluster.
  6. If Cloud Pak for Data is integrated with the IAM Service, restore the IAM Service on the new cluster.
  7. Restore the instance of Cloud Pak for Data on the new cluster.

Backing up an instance of Cloud Pak for Data on the source cluster

To back up an instance of Cloud Pak for Data, do the following steps:

  1. Complete backup prerequisite tasks.
  2. Create the backup by running the following command.

    If your instance consists of a central project and one or more tethered projects, replace the ${PROJECT_CPD_INSTANCE} environment variable with a comma-separated list of projects. For example: --include-namespaces=cpd-instance,cpd-instance-tether1,cpd-instance-tether2.

    cpd-cli oadp backup create <instance_backup_name> \
    --include-namespaces=${PROJECT_CPD_INSTANCE} \
    --exclude-resources='Event,Event.events.k8s.io' \
    --default-volumes-to-restic \
    --cleanup-completed-resources \
    --log-level=debug \
    --verbose
  3. Optional: In case you later need to restore them, back up the operators.
    Tip: The steps to restore the operators in the same cluster are the same as restoring them on a different cluster.
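
As an illustration of the comma-separated project list described in step 2, the following sketch builds the --include-namespaces value from a central project and any tethered projects. The join_namespaces and backup_instance function names are hypothetical. The flags in the wrapper are copied from the command in step 2.

```shell
# Join project names with commas, the form used by --include-namespaces.
# The body runs in a subshell so the IFS change does not leak out.
join_namespaces() (
  IFS=,
  printf '%s\n' "$*"
)

# Hypothetical wrapper: the first argument is the backup name, the remaining
# arguments are the central project followed by any tethered projects.
backup_instance() {
  backup_name=$1
  shift
  cpd-cli oadp backup create "$backup_name" \
    --include-namespaces="$(join_namespaces "$@")" \
    --exclude-resources='Event,Event.events.k8s.io' \
    --default-volumes-to-restic \
    --cleanup-completed-resources \
    --log-level=debug \
    --verbose
}

# Example (not run here):
# backup_instance my-instance-backup cpd-instance cpd-instance-tether1
```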

Backing up the operators and operator projects on the source cluster

You must back up the operators that are installed on the source cluster. The location of the operators depends on the type of installation that you performed:
Express installation
In an express installation, the IBM Cloud Pak foundational services operators and the IBM Cloud Pak for Data platform operator and service operators are installed in the same project, typically ibm-common-services.
Specialized installation
In a specialized installation, the IBM Cloud Pak foundational services operators and the IBM Cloud Pak for Data platform operator and service operators are installed in different projects. For example:
  • The IBM Cloud Pak foundational services operators are typically installed in ibm-common-services.
  • The IBM Cloud Pak for Data platform operator and service operators are installed in a different project, such as cpd-operators.
Important: To back up the operators, you must have the jq JSON utility and the cpd-operators.sh script from the IBM® cpd-cli GitHub repository. For details, see Installing the Cloud Pak for Data OADP backup and restore utility.

To back up the operators on the source cluster:

  1. Log in to Red Hat OpenShift Container Platform as a user with sufficient permissions to complete the task:
    oc login ${OCP_URL}
  2. Copy the cpd-operators.sh script to the machine from which you are connecting to the cluster and make the script executable:
    chmod 700 cpd-operators.sh
  3. Run the cpd-operators.sh script to create a configMap called cpd-operators. The configMap captures the required Kubernetes objects that you will need when you restore the operators.

    The projects that you specify when you run the script depend on the type of installation that you performed:


    Express installation

    The following command assumes that the operators are installed in the ibm-common-services project:

    ./cpd-operators.sh backup --foundation-namespace ibm-common-services --operators-namespace ibm-common-services

    Confirm that the cpd-operators configMap was created:

    oc get configmap cpd-operators -n ibm-common-services

    Specialized installation

    The following command assumes that the IBM Cloud Pak foundational services operators are installed in the ibm-common-services project and the Cloud Pak for Data operators are installed in the cpd-operators project:

    ./cpd-operators.sh backup --foundation-namespace ibm-common-services --operators-namespace cpd-operators

    Confirm that the cpd-operators configMap was created:

    oc get configmap cpd-operators -n cpd-operators

  4. Back up appropriate operator projects. The projects that you must back up depend on the type of installation that you performed:
    Important: Do not change the list of resources that are specified in the --include-resources list.

    Express installation

    The following command assumes that the operators are installed in the ibm-common-services project:

    cpd-cli oadp backup create <operators_backup_name> \
    --include-namespaces ibm-common-services \
    --include-resources='namespaces,operatorgroups,configmaps,scheduling,crd' \
    --skip-hooks \
    --log-level=debug \
    --verbose

    Replace <operators_backup_name> with the name you want to use to identify the backup.


    Specialized installation

    The following command assumes that the IBM Cloud Pak foundational services operators are installed in the ibm-common-services project and the Cloud Pak for Data operators are installed in the cpd-operators project:

    cpd-cli oadp backup create <operators_backup_name> \
    --include-namespaces ibm-common-services,cpd-operators \
    --include-resources='namespaces,operatorgroups,configmaps,scheduling,crd' \
    --skip-hooks \
    --log-level=debug \
    --verbose

    Replace <operators_backup_name> with the name you want to use to identify the backup.
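
The only difference between the express and specialized variants of the backup command is the --include-namespaces value. The following sketch derives that value from the two project names. The operator_namespaces function name is hypothetical, and the project names in the examples are the typical ones named in this topic.

```shell
# Print the --include-namespaces value for the operators backup.
# $1 = project of the IBM Cloud Pak foundational services operators
# $2 = project of the Cloud Pak for Data operators
# If both are the same project (express installation), it is listed once.
operator_namespaces() {
  if [ "$1" = "$2" ]; then
    printf '%s\n' "$1"
  else
    printf '%s,%s\n' "$1" "$2"
  fi
}

# Example (not run here), specialized installation:
# cpd-cli oadp backup create my-operators-backup \
#   --include-namespaces "$(operator_namespaces ibm-common-services cpd-operators)" \
#   --include-resources='namespaces,operatorgroups,configmaps,scheduling,crd' \
#   --skip-hooks --log-level=debug --verbose
```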


Backing up the IAM Service on the source cluster

If the Cloud Pak for Data instance is integrated with the IAM Service, you must back up the IAM Service on the source cluster.

Before you begin
Before you back up the IAM Service, ensure that you completed the following task:
  • Backing up the operators and operator projects on the source cluster

To back up the IAM Service on the source cluster:

  1. Log in to Red Hat OpenShift Container Platform as a user with sufficient permissions to complete the task:
    oc login ${OCP_URL}
  2. Run the mongodb-backup job. You can either copy the mongo-backup.sh script from the IBM cpd-cli GitHub repository or you can run the script from quay.io.
    Choose one of the following methods:
    Run a local copy of the mongo-backup.sh script
    Important: To use the script, you must copy the following files from the IBM cpd-cli GitHub repository:
    • mongo-backup.sh
    • mongo-job-rbac.yaml
    • mongo-backup-job.yaml

    For details, see Downloading the Cloud Pak for Data backup and restore scripts.

    1. Make the mongo-backup.sh script executable:
      chmod 700 mongo-backup.sh
    2. If IBM Cloud Pak foundational services is not installed in the ibm-common-services project, replace the ibm-common-services project name in the mongo-job-rbac.yaml file.
      1. To put the file in edit mode, run:
        vi mongo-job-rbac.yaml
      2. Change the value of the namespace parameter.
      3. To save your changes, press Esc, type :wq, and press Enter.
      4. Run the following command to ensure that the mongo-backup.sh script uses the correct project:
        export CS_NAMESPACE=<foundational_services_project_name>
    3. Create the cs-br cluster role binding:
      oc apply -f mongo-job-rbac.yaml
    4. Run the mongo-backup.sh script:
      ./mongo-backup.sh
    Run the script from quay.io
    1. Create the cs-br cluster role binding:
      cat <<EOF |oc apply -f -
      kind: ClusterRoleBinding
      apiVersion: rbac.authorization.k8s.io/v1
      metadata:
        name: cs-br
      roleRef:
        kind: ClusterRole
        name: cluster-admin
        apiGroup: rbac.authorization.k8s.io
      subjects:
      - kind: ServiceAccount
        name: default
        namespace: ${PROJECT_CPFS_OPS}
      EOF
    2. Create the cs-backup-job job to run the mongo-backup.sh script from quay.io.
      cat <<EOF |oc apply -f -
      apiVersion: batch/v1
      kind: Job
      metadata:
        name: cs-backup-job
        namespace: ${PROJECT_CPFS_OPS}
      spec:
        template:
          spec:
            containers:
              - name: cs-mongo-br
                image: quay.io/opencloudio/cs-mongodb-br:v0.2
                command: ["./mongo-backup.sh"]
                env:
                  - name: CS_NAMESPACE
                    valueFrom:
                      fieldRef:
                        fieldPath: metadata.namespace
            restartPolicy: Never
      EOF
  3. Delete the mongodb-backup job.
    oc delete job mongodb-backup -n ${PROJECT_CPFS_OPS}
  4. Set the CPDBR_ENABLE_FEATURES environment variable:
    export CPDBR_ENABLE_FEATURES=experimental
  5. Back up the foundational services project, filtered by --pvc-data-include-labels.
    Important: Do not change the list of resources that are specified in the --include-resources list or the --pvc-data-include-labels list.
    cpd-cli oadp backup create <foundational_iam_backup_name> \
    --include-namespaces ${PROJECT_CPFS_OPS} \
    --include-resources='ns,deploy,po,pvc,pv' \
    --default-volumes-to-restic \
    --pvc-data-include-labels=app=icp-bedrock-backup \
    --skip-hooks \
    --snapshot-volumes=false \
    --log-level=debug \
    --verbose

    Replace <foundational_iam_backup_name> with the name you want to use to identify the backup.
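
The inline manifests in this task expand ${PROJECT_CPFS_OPS} inside a here-document, so an unset variable silently produces a blank namespace. The following sketch shows one way to guard against that. The render_cs_br function name is hypothetical. The manifest content is the cs-br cluster role binding from step 2.

```shell
# Print the cs-br ClusterRoleBinding for the given foundational services
# project. Refuses to render if the project name is empty.
render_cs_br() {
  if [ -z "${1:-}" ]; then
    echo "render_cs_br: project name is required" >&2
    return 1
  fi
  cat <<EOF
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: cs-br
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: default
  namespace: $1
EOF
}

# Example (not run here):
# render_cs_br "${PROJECT_CPFS_OPS}" | oc apply -f -
```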

Setting up the new cluster

After you create a new cluster, you must set up and configure the cluster so that you can restore the operators and the instance of Cloud Pak for Data on the cluster.

Important: To restore the operators and the instance of Cloud Pak for Data that you backed up when you completed the preceding tasks, the target cluster must have the same configuration as the source cluster.

Ensure that the following statements are true:

  1. The target cluster has the same storage classes as the source cluster. For details on setting up storage, see Setting up persistent storage.
  2. For environments that use a private container registry, such as air-gapped environments, the target cluster has the same image content source policy as the source cluster. For details on configuring the image content source policy, see Configuring an image content source policy.
  3. The target cluster can pull software images. For details, see Updating the global image pull secret.
  4. The OpenShift APIs for Data Protection (OADP) backup and restore utility is installed on the target cluster. For details, see Installing the Cloud Pak for Data OADP backup and restore utility.
  5. The deployment environment of the target cluster is the same as the source cluster:
    • The target cluster uses the same hardware architecture as the source cluster. For example, x86-64.
    • The target cluster is on the same OpenShift version as the source cluster.
    • The target cluster allows for the same node configuration as the source cluster. For example, if the source cluster uses a custom KubeletConfig, the target cluster must allow the same custom KubeletConfig. For more information about node settings, see Changing required node settings.

      Moving between IBM Cloud and non-IBM Cloud deployment environments is not supported.
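
One way to check the storage class requirement is to compare the storage class names from both clusters. The following sketch assumes that you saved the output of oc get storageclass -o name from each cluster to a file. The missing_on_target function name and the file names are hypothetical.

```shell
# Print every line of the source list that has no exact match in the
# target list. Each file holds one name per line.
# $1 = file with the source cluster names
# $2 = file with the target cluster names
missing_on_target() {
  # -F fixed strings, -x whole-line match, -v invert the match.
  # grep exits non-zero when nothing is printed, which is not an error here.
  grep -Fxv -f "$2" "$1" || true
}

# Example (not run here):
# oc get storageclass -o name > source-sc.txt   # while logged in to the source cluster
# oc get storageclass -o name > target-sc.txt   # while logged in to the target cluster
# missing_on_target source-sc.txt target-sc.txt
```

Empty output suggests that every storage class name on the source cluster also exists on the target cluster. The sketch compares names only, not provisioners or parameters.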

Restoring the operators and operator projects on the new cluster

After you configure the new cluster, you can restore the operators and operator projects on the new cluster.

Important: To restore the operators, you must have the jq JSON utility and the cpd-operators.sh script from the IBM cpd-cli GitHub repository. For details, see Installing the Cloud Pak for Data OADP backup and restore utility.
  1. Log in to Red Hat OpenShift Container Platform as a user with sufficient permissions to complete the task:
    oc login ${OCP_URL}
  2. To view a list of existing backups, run the following command:
    cpd-cli oadp backup ls
  3. Restore any CustomResourceDefinitions (CRDs):
    cpd-cli oadp restore create <operators_restore_name1> \
    --from-backup=<operators_backup_name> \
    --include-resources='crd' \
    --include-cluster-resources=true \
    --skip-hooks \
    --log-level=debug \
    --verbose
  4. Restore the required resources, such as the required projects and operator groups, on the target cluster:
    cpd-cli oadp restore create <operators_restore_name2> \
    --from-backup=<operators_backup_name> \
    --include-resources='namespaces,operatorgroups,scheduling,crd' \
    --include-cluster-resources=true \
    --skip-hooks \
    --log-level=debug \
    --verbose

    Replace <operators_backup_name> with the name that you specified when you created the backup.

  5. Restore the Kubernetes objects that were included in the cpd-operators configMap:
    cpd-cli oadp restore create <operators_restore_name3> \
    --from-backup=<operators_backup_name> \
    --include-resources='configmaps' \
    --selector 'app=cpd-operators-backup' \
    --skip-hooks \
    --log-level=debug \
    --verbose
  6. Confirm that the Cloud Pak for Data operator project contains the required OperatorGroup and configMap resources:
    Express installations
    1. Confirm that the project contains the operatorgroup operator group.

      The following command assumes that the Cloud Pak for Data operators are installed in the ibm-common-services project:

      oc get operatorgroups -n ibm-common-services
    2. Confirm that the project contains the cpd-operators configMap.

      The following command assumes that the Cloud Pak for Data operators are installed in the ibm-common-services project:

      oc get configmaps cpd-operators -n ibm-common-services

    Specialized installations
    1. Confirm that the project contains the operatorgroup operator group.

      The following command assumes that the Cloud Pak for Data operators are installed in the cpd-operators project:

      oc get operatorgroups -n cpd-operators
    2. Confirm that the project contains the cpd-operators configMap.

      The following command assumes that the Cloud Pak for Data operators are installed in the cpd-operators project:

      oc get configmaps cpd-operators -n cpd-operators

  7. Copy the cpd-operators.sh script to the machine from which you are connecting to the target cluster and make the script executable:
    chmod 700 cpd-operators.sh
  8. Run the cpd-operators.sh script to restore the operators on the target cluster.

    The projects that you specify when you run the script depend on the type of installation that you performed on the source cluster:


    Express installations

    The following command assumes that all of the operators were installed in the ibm-common-services project on the source cluster:

    ./cpd-operators.sh restore --foundation-namespace ibm-common-services --operators-namespace ibm-common-services

    Specialized installations

    The following command assumes that the IBM Cloud Pak foundational services were installed in the ibm-common-services project and the Cloud Pak for Data operators were installed in the cpd-operators project on the source cluster:

    ./cpd-operators.sh restore --foundation-namespace ibm-common-services --operators-namespace cpd-operators
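
Before you run the restore commands, it can help to confirm that the operators backup is visible from the target cluster. The following sketch filters the output of cpd-cli oadp backup ls, which is used in step 2. It assumes that the listing prints each backup name as a whitespace-delimited word, which is not verified here. The backup_listed function name is hypothetical.

```shell
# Succeed if the given backup name appears as a whole whitespace-delimited
# word in the listing that is read from standard input.
backup_listed() {
  awk -v name="$1" '
    { for (i = 1; i <= NF; i++) if ($i == name) found = 1 }
    END { exit !found }
  '
}

# Example (not run here):
# cpd-cli oadp backup ls | backup_listed my-operators-backup \
#   || echo "Backup not found on this cluster." >&2
```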

Restoring the IAM Service on the new cluster

If the Cloud Pak for Data instance is integrated with the IAM Service, restore the IAM Service on the new cluster.

Before you begin
Before you can restore the IAM Service on the new cluster, ensure that you completed:
  1. Setting up the new cluster
  2. Restoring the operators and operator projects on the new cluster
Note: When the foundational services and Cloud Pak for Data operators are co-located in the same project, the cpd-cli oadp restore commands in this task target the same project as the commands in Restoring the operators and operator projects on the new cluster, but with different --include-resources values and selectors.
  1. Log in to Red Hat OpenShift Container Platform as a user with sufficient permissions to complete the task:
    oc login ${OCP_URL}
  2. Restore the persistent volume claims (PVCs) and persistent volumes (PVs) that are associated with the IAM Service.
    cpd-cli oadp restore create <foundational_iam_restore_name1> \
    --from-backup=<foundational_iam_backup_name> \
    --include-resources='pvc,pv' \
    --skip-hooks \
    --log-level=debug \
    --verbose
  3. Restore the data for the IAM Service.
    cpd-cli oadp restore create <foundational_iam_restore_name2> \
    --from-backup=<foundational_iam_backup_name> \
    --include-resources='pv,pvc,deploy,po' \
    --selector 'app=cpdbr-vol-mnt' \
    --skip-hooks \
    --log-level=debug \
    --verbose
  4. Run the mongodb-restore job. You can either copy the mongo-restore.sh script from the IBM cpd-cli GitHub repository or you can run the script from quay.io.
    Choose one of the following methods:
    Run a local copy of the mongo-restore.sh script
    Important: To use the script, you must copy the following files from the IBM cpd-cli GitHub repository:
    • mongo-restore.sh
    • mongo-job-rbac.yaml
    • mongo-restore-job.yaml

    For details, see Downloading the Cloud Pak for Data backup and restore scripts.

    1. Make the mongo-restore.sh script executable:
      chmod 700 mongo-restore.sh
    2. If IBM Cloud Pak foundational services is not installed in the ibm-common-services project, replace the ibm-common-services project name in the mongo-job-rbac.yaml file.
      1. To put the file in edit mode, run:
        vi mongo-job-rbac.yaml
      2. Change the value of the namespace parameter.
      3. To save your changes, press Esc, type :wq, and press Enter.
      4. Run the following command to ensure that the mongo-restore.sh script uses the correct project:
        export CS_NAMESPACE=<foundational_services_project_name>
    3. Create the cs-br cluster role binding:
      oc apply -f mongo-job-rbac.yaml
    4. Run the mongo-restore.sh script:
      ./mongo-restore.sh
    Run the script from quay.io
    1. Create the cs-br cluster role binding:
      cat <<EOF |oc apply -f -
      kind: ClusterRoleBinding
      apiVersion: rbac.authorization.k8s.io/v1
      metadata:
        name: cs-br
      roleRef:
        kind: ClusterRole
        name: cluster-admin
        apiGroup: rbac.authorization.k8s.io
      subjects:
      - kind: ServiceAccount
        name: default
        namespace: ${PROJECT_CPFS_OPS}
      EOF
    2. Create the cs-restore-job job to run the mongo-restore.sh script from quay.io.
      cat <<EOF |oc apply -f -
      apiVersion: batch/v1
      kind: Job
      metadata:
        name: cs-restore-job
        namespace: ${PROJECT_CPFS_OPS}
      spec:
        template:
          spec:
            containers:
              - name: cs-mongo-br
                image: quay.io/opencloudio/cs-mongodb-br:v0.2
                command: ["./mongo-restore.sh"]
                env:
                  - name: CS_NAMESPACE
                    valueFrom:
                      fieldRef:
                        fieldPath: metadata.namespace
            restartPolicy: Never
      EOF
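
If you create the cs-restore-job job from quay.io, the job runs asynchronously. The following sketch waits for it to finish by using oc wait. The wait_for_job function name, the OC override variable, and the default timeout of 600s are assumptions made for illustration.

```shell
# Command used to talk to the cluster. Set OC=echo to print the command
# instead of running it.
OC=${OC:-oc}

# Block until the named job reports the Complete condition or the timeout
# passes.
# $1 = job name, $2 = project, $3 = optional timeout (assumed default: 600s)
wait_for_job() {
  "$OC" wait --for=condition=complete "job/$1" -n "$2" --timeout="${3:-600s}"
}

# Example (not run here):
# wait_for_job cs-restore-job "${PROJECT_CPFS_OPS}"
```

A non-zero exit status means that the job did not reach the Complete condition within the timeout, in which case the job logs are the next place to look.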

Restoring the instance of Cloud Pak for Data on the new cluster

Before you begin
Before you can restore the instance of Cloud Pak for Data on the new cluster, ensure that you completed:
  1. Setting up the new cluster
  2. Restoring the operators and operator projects on the new cluster
  3. (If applicable) Restoring the IAM Service on the new cluster

To restore the instance of Cloud Pak for Data on the new cluster, do the following steps.

  1. Log in to Red Hat OpenShift Container Platform as a user with sufficient permissions to complete the task:
    oc login ${OCP_URL}
  2. Specialized installations only: Enable the IBM Cloud Pak for Data platform operator to watch the project where you are restoring the instance of Cloud Pak for Data.

    Update the cpd-operators NamespaceScope resource in the Cloud Pak for Data operator project to watch the project where you are restoring the instance.

    Tip: To retrieve the existing projects in the namespaceMembers list, run the following command:
    oc get namespacescope cpd-operators -n ${PROJECT_CPD_OPS} -o jsonpath='{.spec.namespaceMembers}'

    Edit the namespaceMembers list to add the project where you are restoring the instance. For example, if you are restoring the instance to the cpd-instance project, add that project to the list. If your instance consists of a central project and one or more tethered projects, you must add all of those projects to the list. For example, cpd-instance-tether1, cpd-instance-tether2.

    cat <<EOF |oc apply -f -
    apiVersion: operator.ibm.com/v1
    kind: NamespaceScope
    metadata:
      name: cpd-operators
      namespace: cpd-operators        # (Default) Replace with the Cloud Pak for Data platform operator project name 
    spec:
      csvInjector:                    # This setting is required for some services. Do not delete this line if you specified it when you created operator subscriptions. 
        enable: true                  # This setting is required for some services. Do not delete this line if you specified it when you created operator subscriptions. 
      namespaceMembers:
      - cpd-operators                 # (Default) Replace with the Cloud Pak for Data platform operator project name
      - cpd-instance                  # Replace with the project where you are restoring Cloud Pak for Data
    EOF
  3. Restore the Cloud Pak for Data instance.
    1. Restore the ZenService resource:
      cpd-cli oadp restore create <instance_backup_name>-zenservice-restore \
      --from-backup=<instance_backup_name> \
      --include-resources='namespaces,zenservices,secrets,certificates.cert-manager.io,certificates.certmanager.k8s.io,issuers.cert-manager.io,issuers.certmanager.k8s.io' \
      --skip-hooks \
      --log-level=debug \
      --verbose
    2. Restore all other backed up resources:
      cpd-cli oadp restore create <instance_backup_name>-restore \
      --from-backup=<instance_backup_name> \
      --exclude-resources='clients,ImageTag' \
      --include-cluster-resources=true \
      --log-level=debug \
      --verbose
      Note: If you are restoring Db2® or Watson™ Knowledge Catalog, add --scale-wait-timeout 15m to ensure that the restore command completes successfully.
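
As an alternative sketch for the namespaceMembers update in step 2, the complete member list can be built as a JSON array and applied with oc patch and a merge patch. This is an assumption about one workable approach, not the documented method. The members_json function name is hypothetical, and the sketch assumes that project names contain no characters that need JSON escaping.

```shell
# Print the given project names as a JSON array of strings.
# The body runs in a subshell so the helper variables stay local.
members_json() (
  out=""
  for ns in "$@"; do
    # Append a comma only when the list already has an entry.
    out="${out:+$out,}\"$ns\""
  done
  printf '[%s]\n' "$out"
)

# Example (not run here). A merge patch replaces the whole list, so include
# the existing members as well as the projects that you are adding:
# oc patch namespacescope cpd-operators -n "${PROJECT_CPD_OPS}" --type merge \
#   -p "{\"spec\":{\"namespaceMembers\":$(members_json cpd-operators cpd-instance cpd-instance-tether1)}}"
```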