Creating Cloud Pak for Data volume snapshots

Important: IBM Cloud Pak® for Data Version 4.8 will reach end of support (EOS) on 31 July, 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.

Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.8 reaches end of support. For more information, see Upgrading from IBM Cloud Pak for Data Version 4.8 to IBM Software Hub Version 5.1.

If you are using Portworx storage, you can back up all persistent volumes (PVs) in your IBM Cloud Pak for Data deployment by creating volume snapshots with the Cloud Pak for Data volume backup and restore utility.

Before you begin

To create snapshots, you must run the cpd-cli backup-restore command-line interface as a cluster administrator, or with a similar role that can create, read, write, and delete Stork CRDs and other Kubernetes resources, such as deployments, StatefulSets, cron jobs, jobs, replica sets, config maps, secrets, pods, namespaces, persistent volume claims (PVCs), and PVs.

To run snapshot-related commands, your cluster must also meet the following requirements:

  • The minimum version of Portworx that Cloud Pak for Data supports. For more information, see Storage considerations.

    To check the Portworx version, run the following commands:

    # Look up the first Portworx pod, then ask pxctl for its version.
    PX_POD=$(oc get pods -l name=portworx -n kube-system -o jsonpath='{.items[0].metadata.name}')
    oc exec -it "$PX_POD" -n kube-system -- /opt/pwx/bin/pxctl --version

  • Stork 2.3.3 or later.

    To check the Stork version, run the following commands:

    # Look up the first Stork pod, then ask storkctl for its version.
    STORK_POD=$(oc get pods -n kube-system -l name=stork -o jsonpath='{.items[0].metadata.name}')
    oc exec -it "$STORK_POD" -n kube-system -- /storkctl/linux/storkctl version

Note: You can create snapshots only of the Cloud Pak for Data instance project (namespace), such as zen. You cannot create snapshots of other Cloud Pak for Data projects (for example, the project where the operators for the Cloud Pak for Data instance are installed).
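
The Stork requirement above (2.3.3 or later) can be checked with a small comparison helper. The following is a minimal sketch, not part of the product: the function name version_ge is hypothetical, it assumes that sort -V (GNU coreutils version sort) is available, and it assumes that you copy the plain numeric version from the storkctl output yourself, because the exact output format is not assumed here.

```shell
# Hypothetical helper: return 0 when version $1 is greater than or
# equal to version $2. Assumes sort -V (GNU coreutils) is available.
version_ge() {
  # After a version sort, the smaller version is on the first line.
  # $1 >= $2 exactly when that first line equals $2.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example usage (STORK_VERSION is a value that you read from the
# storkctl output; it is not set by any command in this topic):
#   version_ge "$STORK_VERSION" 2.3.3 || echo "Stork is older than 2.3.3" >&2
```

The same helper could be used for the Portworx version, with the minimum taken from Storage considerations.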

About this task

The cpd-cli backup-restore command-line interface creates a snapshot of the Portworx PVCs in your system at a particular moment in time. The interface backs up and restores volume data in the same project and installation, and assumes that Kubernetes objects are still in place.

Important: Backing up persistent volumes alone is not sufficient for disaster recovery purposes because Kubernetes objects like secrets are needed along with volume data to restore applications in a project.

Best practice: You can run the commands in this task exactly as written if you set up environment variables. For instructions, see Setting up installation environment variables.

Ensure that you source the environment variables before you run the commands in this task.
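
Because every command in the procedure passes ${PROJECT_CPD_INST_OPERANDS} to the -n option, it can help to confirm that the variable is set before you start. The following is a minimal sketch; the function name check_cpd_env is hypothetical and is not part of cpd-cli.

```shell
# Hypothetical pre-flight check: fail early when the instance project
# variable used by the commands in this task is empty or unset.
check_cpd_env() {
  if [ -z "${PROJECT_CPD_INST_OPERANDS:-}" ]; then
    echo "PROJECT_CPD_INST_OPERANDS is not set; source your environment variables first." >&2
    return 1
  fi
}

# Example usage:
#   check_cpd_env && cpd-cli backup-restore snapshot list -n ${PROJECT_CPD_INST_OPERANDS}
```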

For more information about the Cloud Pak for Data volume backup and restore utility, including a list of commands that you can run, see the cpd-cli backup-restore reference documentation.

Procedure

  1. Create a local volume snapshot, specifying the snapshot name.
    Note: The snapshot name can contain only lowercase alphanumeric characters and hyphens (-), and must start and end with an alphanumeric character. Underscores (_) are not supported.
    cpd-cli backup-restore snapshot create <snapshot_name> -n ${PROJECT_CPD_INST_OPERANDS}
  2. To check the status of a snapshot, run the following command:
    cpd-cli backup-restore snapshot status <snapshot_name> -n ${PROJECT_CPD_INST_OPERANDS}
  3. To view a list of existing snapshots, run the following command:
    cpd-cli backup-restore snapshot list -n ${PROJECT_CPD_INST_OPERANDS}
  4. If you stopped all Data Refinery runtimes and jobs before you created the backup, restart the service by running the following command.

    The value of <number_of_replicas> depends on the scaleConfig setting that was used when Data Refinery was installed (1 for small, 3 for medium, and 4 for large).

    oc scale --replicas=<number_of_replicas> deploy wdp-shaper wdp-dataprep
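
Two rules in the procedure above lend themselves to small shell helpers: the snapshot naming rule in step 1 and the replica counts in step 4. The following is a minimal sketch under stated assumptions. The function names is_valid_snapshot_name and replicas_for_scale are hypothetical, the pattern is an interpretation of the naming note rather than the check that cpd-cli itself performs, and the snapshot name cpd-snap-01 is an illustrative value.

```shell
# Hypothetical helper: return 0 when the name uses only lowercase
# alphanumeric characters and hyphens, and starts and ends with an
# alphanumeric character. LC_ALL=C keeps [a-z] from matching
# uppercase letters in some locales.
is_valid_snapshot_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]([a-z0-9-]*[a-z0-9])?$'
}

# Hypothetical helper: map a scaleConfig size to the Data Refinery
# replica count listed in step 4 (small=1, medium=3, large=4).
replicas_for_scale() {
  case "$1" in
    small)  echo 1 ;;
    medium) echo 3 ;;
    large)  echo 4 ;;
    *)      echo "unknown scaleConfig: $1" >&2; return 1 ;;
  esac
}

# Example usage:
#   is_valid_snapshot_name cpd-snap-01 && \
#     cpd-cli backup-restore snapshot create cpd-snap-01 -n ${PROJECT_CPD_INST_OPERANDS}
#   oc scale --replicas="$(replicas_for_scale medium)" deploy wdp-shaper wdp-dataprep
```

Validating the name locally only avoids an obviously rejected request; the utility remains the authority on which names it accepts.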