Restoring Cloud Pak for Data volume backups from a persistent volume claim or object store

If you created volume backups of your IBM Cloud Pak® for Data deployment in a separate PersistentVolumeClaim (PVC) or an S3 or S3-compatible object store with the Cloud Pak for Data volume backup and restore utility, use this information to restore the volumes.

Before you begin

Cloud Pak for Data provides the cpd-cli backup-restore command-line interface for backing up and restoring PVs. Before you run any cpd-cli backup-restore commands, ensure that:

About this task

During the restore process, write operations in application workloads are suspended (quiesced). The quiesce command calls hooks that Cloud Pak for Data services provide to suspend their own workloads. These hooks offer optimizations over simply scaling down all resources in the project (namespace). For example, services might be quiesced and unquiesced in a specific order, or services might be suspended without bringing down pods, which reduces the time it takes to stop applications and bring them back up.

You can restore volumes in two ways:
  1. Manually scale down resources, restore volumes, and then manually scale up resources.
  2. Automatically scale down resources, restore volumes, and automatically scale up resources with a single command.
Tip: It is a good idea to manually scale down application Kubernetes resources before you do a restore, so that you can find out whether any services fail to scale down or up correctly. You can then do the restore after you fix any problems that you find.
Best practice: You can run the commands in this task exactly as written if you set up environment variables. For instructions, see Setting up installation environment variables.

Ensure that you source the environment variables before you run the commands in this task.
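The following sketch shows one way to confirm that the environment variables are set before you run the commands in this task. The helper name require_vars is hypothetical and is not part of cpd-cli. Pass it the names of the variables that your commands use.

```shell
# Sketch: fail early if any variable that the commands in this task rely on
# is unset or empty. require_vars is a hypothetical helper, not part of cpd-cli.
require_vars() {
  missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Environment variable $name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: check the variables that appear in this task before continuing.
# require_vars PROJECT_CPD_INSTANCE NAMESPACE || exit 1
```

The function returns a nonzero status if at least one variable is missing, so it can guard a script with "|| exit 1".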

For more information about the Cloud Pak for Data volume backup and restore utility, including a list of commands that you can run, see the cpd-cli backup-restore reference documentation.

Procedure

  1. If you haven't already done so, initialize cpd-cli backup-restore.
    Tip: Initialization is normally done before you create a backup, so you might already have completed this step.
    Note: If your Docker image registry differs from the one shown in the following examples, change the appropriate options.

    The following command is an example that initializes cpd-cli backup-restore when you are using an S3 object store to store the backups, and the cluster has access to icr.io/cpopen/cpd.

    # Initialize cpdbr first with the PVC name and S3 storage. The bucket must already exist.
    # Example for cluster with access to ICR
    cpd-cli backup-restore init \
    --namespace $NAMESPACE \
    --pvc-name cpdbr-pvc \
    --image-prefix=icr.io/cpopen/cpd \
    --provider=s3 \
    --s3-endpoint="s3 endpoint" \
    --s3-bucket=cpdbr \
    --s3-prefix=$NAMESPACE/

    The following command is an example that initializes cpd-cli backup-restore when you are using an S3 object store to store the backups, in an environment that uses a private image registry, such as when your cluster is air-gapped.

    # Example for air-gapped environment
    cpd-cli backup-restore init \
    --namespace $NAMESPACE \
    --pvc-name cpdbr-pvc \
    --image-prefix=${PRIVATE_REGISTRY_LOCATION} \
    --provider=s3 \
    --s3-endpoint="s3 endpoint" \
    --s3-bucket=cpdbr \
    --s3-prefix=$NAMESPACE/
    

    The following command is an example that initializes cpd-cli backup-restore when you are using a separate PVC to store the backups, and the cluster has access to icr.io/cpopen/cpd.

    # Example for cluster with access to ICR
    cpd-cli backup-restore init \
    --namespace $NAMESPACE \
    --log-level=debug \
    --verbose \
    --pvc-name cpdbr-pvc \
    --image-prefix=icr.io/cpopen/cpd \
    --provider=local

    The following command is an example that initializes cpd-cli backup-restore when you are using a separate PVC to store the backups, in an environment that uses a private image registry, such as when your cluster is air-gapped.

    # Example for air-gapped environment
    cpd-cli backup-restore init \
    --namespace $NAMESPACE \
    --log-level=debug \
    --verbose \
    --pvc-name cpdbr-pvc \
    --image-prefix=${PRIVATE_REGISTRY_LOCATION} \
    --provider=local
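The ICR and air-gapped init examples above differ only in the value of --image-prefix. As a minimal sketch, the choice can be made from the environment. The helper name cpdbr_image_prefix is hypothetical, and the sketch assumes that PRIVATE_REGISTRY_LOCATION is set only when you use a private registry.

```shell
# Sketch: pick the --image-prefix value used in the init examples.
# cpdbr_image_prefix is a hypothetical helper name, not part of cpd-cli.
cpdbr_image_prefix() {
  # Use the private registry when PRIVATE_REGISTRY_LOCATION is set and
  # non-empty; otherwise fall back to the ICR location from the examples.
  echo "${PRIVATE_REGISTRY_LOCATION:-icr.io/cpopen/cpd}"
}

# Example (backups stored in a separate PVC):
# cpd-cli backup-restore init \
#   --namespace "$NAMESPACE" \
#   --pvc-name cpdbr-pvc \
#   --image-prefix="$(cpdbr_image_prefix)" \
#   --provider=local
```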
  2. To restore volumes by manually scaling down resources, restoring volumes, and manually scaling up resources, do the following steps.
    1. Manually scale down application Kubernetes resources:
      cpd-cli backup-restore quiesce -n ${PROJECT_CPD_INSTANCE}

      If you want to scale down all resources, include the --force option.

    2. Check for completed jobs and pods by running the volume restore command with the --dry-run option, specifying a restore name identifier.

      The --dry-run option reports jobs or pods that are still attached to the PVCs to be restored.

      Note: The restore name identifier must consist of lowercase alphanumeric characters or the hyphen (-), and must start and end with an alphanumeric character. The underscore character (_) is not supported.
      cpd-cli backup-restore volume-restore create <restore_name> --from-backup <backup_name> -n ${PROJECT_CPD_INSTANCE} --dry-run
    3. If the dry run reports completed or failed jobs, or pods, that reference PVCs, delete them.
      Tip: Consider saving the job or pod YAML before you manually delete the resources, or include the --cleanup-completed-resources option in the restore step.
    4. Run the restore command with the --skip-quiesce option:
      cpd-cli backup-restore volume-restore create <restore_name> --from-backup <backup_name> -n ${PROJECT_CPD_INSTANCE} --skip-quiesce=true
      Note: With certain storage providers, Kubernetes resources must be scaled down to unmount the PVCs before you create the restore. In such scenarios, the volume-restore create command with the --skip-quiesce option can fail if pods are running with mounted PVCs. If this problem occurs, use the quiesce command with the --force option to scale down the resources, and rerun the volume-restore create command with the --skip-quiesce option. You can then scale up the Kubernetes resources after the restore by using the unquiesce command.
    5. Manually scale up application Kubernetes resources:
      cpd-cli backup-restore unquiesce -n ${PROJECT_CPD_INSTANCE}
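The manual sequence in step 2 can be sketched as two shell functions, split around the review of the dry-run report, which remains a manual task. The names run, prepare_restore, finish_restore, and CPDBR_ECHO_ONLY are hypothetical and are introduced only for this illustration. The cpd-cli commands themselves are the ones shown in the substeps.

```shell
# Sketch of the manual restore sequence. All function and variable names
# here are hypothetical; only the cpd-cli commands come from this task.

# Run a command, or only print it when CPDBR_ECHO_ONLY=1 (useful for review).
run() {
  if [ "${CPDBR_ECHO_ONLY:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Substeps 1 and 2: quiesce, then report jobs or pods still attached to PVCs.
# Arguments: <restore_name> <backup_name> <project>
prepare_restore() {
  run cpd-cli backup-restore quiesce -n "$3" &&
  run cpd-cli backup-restore volume-restore create "$1" \
    --from-backup "$2" -n "$3" --dry-run
}

# Substeps 4 and 5: restore without quiescing again, then scale back up.
# Arguments: <restore_name> <backup_name> <project>
finish_restore() {
  run cpd-cli backup-restore volume-restore create "$1" \
    --from-backup "$2" -n "$3" --skip-quiesce=true &&
  run cpd-cli backup-restore unquiesce -n "$3"
}

# Example:
# prepare_restore my-restore my-backup "${PROJECT_CPD_INSTANCE}"
# (review the dry-run report and delete reported jobs or pods, substep 3)
# finish_restore my-restore my-backup "${PROJECT_CPD_INSTANCE}"
```

In this sketch, finish_restore runs unquiesce only if the restore command succeeds. If the restore fails, fix the problem and rerun the restore, or run the unquiesce command yourself.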
  3. To automatically scale down resources, restore volumes, and automatically scale up resources, do the following steps.
    1. Run the following volume restore command, specifying a restore name identifier.
      Note: The restore name identifier must consist of lowercase alphanumeric characters or the hyphen (-), and must start and end with an alphanumeric character. The underscore character (_) is not supported.
      cpd-cli backup-restore volume-restore create <restore_name> --from-backup <backup_name> -n ${PROJECT_CPD_INSTANCE}
    2. If the restore fails because there are completed or failed jobs, or pods, that reference PVCs, delete them, and rerun the restore command.
      Tip: Consider saving the job or pod YAML before you manually delete the resources, or include the --cleanup-completed-resources option in the restore command.
    3. If the restore does not automatically scale up resources because of a previous failure, manually scale up resources:
      cpd-cli backup-restore unquiesce -n ${PROJECT_CPD_INSTANCE}
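The naming rule for the restore name identifier can be checked locally before you run a restore command. The following sketch implements the rule as stated in the notes above. The function name is_valid_restore_name is hypothetical, and any additional restrictions that cpd-cli might apply, such as a maximum length, are not covered.

```shell
# Sketch: check a restore name against the rule stated in the notes:
# lowercase alphanumeric characters or hyphens only, starting and ending
# with an alphanumeric character. is_valid_restore_name is a hypothetical
# helper, not part of cpd-cli.
is_valid_restore_name() {
  case $1 in
    # Reject an empty name or any character outside a-z, 0-9, and hyphen.
    # The characters are listed explicitly so the check does not depend on locale.
    ''|*[!abcdefghijklmnopqrstuvwxyz0123456789-]*) return 1 ;;
    # Reject a leading or trailing hyphen.
    -*|*-) return 1 ;;
  esac
  return 0
}

# Example:
# is_valid_restore_name "my-restore-1" || echo "Invalid restore name" >&2
```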
  4. To check the status of a restore job, run the following command:
    cpd-cli backup-restore volume-restore status <restore_name> -n ${PROJECT_CPD_INSTANCE}
  5. To view a list of existing volume restores, run the following command:
    cpd-cli backup-restore volume-restore list -n ${PROJECT_CPD_INSTANCE}
  6. To get the logs of a volume restore, run the following command:
    cpd-cli backup-restore volume-restore logs <restore_name> -n ${PROJECT_CPD_INSTANCE}
  7. Optional: After the volume restore is complete, clean up cpd-cli backup-restore (delete the cpd-cli backup-restore deployment and other metadata) by running the following command:
    cpd-cli backup-restore reset -n ${PROJECT_CPD_INSTANCE} --force