Backing up and restoring Cloud Pak for Data

IBM® Cloud Pak for Data supports online and offline backup and restore.

Online backups
During an online backup, normal runtime operations in the Cloud Pak for Data cluster continue while the backup is taken. Container Storage Interface (CSI) volume snapshots of Kubernetes metadata and volume data are taken with minimal disruption.
Offline backups
During an offline backup, Cloud Pak for Data services are quiesced to bring them to a consistent state. At the beginning of the backup process, each service stops using its data volumes until the backup is completed. External operations for services that participate in the backup are interrupted for the entire duration of the quiesce, backup, and unquiesce steps.

You can create offline backups in the following ways:

  • Create CSI volume snapshots of Kubernetes metadata and volume data.
  • Create Restic backups on an S3-compatible object store of Kubernetes metadata and volume data.
  • If you are using Portworx storage, create snapshots of volume data.
  • Create backups of volume data on a separate Persistent Volume Claim (PVC) or S3-compatible object store.

Choosing which backup method to use

To help you decide which backup method to choose, consider the following factors.

Supported Cloud Pak for Data releases
  • Online backups: Cloud Pak for Data 4.5.x
  • Offline backups, CSI volume snapshots and Restic backups: Cloud Pak for Data 4.0.2 and later
  • Offline backups, volume snapshots and volume backups: Cloud Pak for Data 3.5.x and later

Storage types
  • Online backups: Red Hat® OpenShift® Data Foundation, IBM Spectrum® Scale
  • Offline backups: Red Hat OpenShift Data Foundation, IBM Spectrum Scale, NFS, Portworx

Productivity
  • Online backups: You can take frequent backups without sacrificing productivity.
  • Offline backups: The cluster is put in quiesce mode, which effectively means shutting down the cluster and disrupting business.

Disaster recovery
  • Online backups: You can restore online backups to a different cluster as part of a disaster recovery plan when you are running IBM Cloud Pak for Data 4.5.3 or later.
  • Offline backups: You can restore offline backups to a different cluster as part of a disaster recovery plan.
Recommendation: Create online backups if your storage type supports online backups and the services that are installed also support them.
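
The recommendation can be expressed as a small decision helper. The following Python sketch is illustrative only: the function name recommend_backup_mode and the services_support_online flag are hypothetical, the storage lists are as read from the comparison above, and release checks (for example, Cloud Pak for Data 4.5.x for online backups) and per-service offline support are left out for brevity.

```python
# Storage types per backup mode, as read from the comparison above.
ONLINE_STORAGE = {"Red Hat OpenShift Data Foundation", "IBM Spectrum Scale"}
OFFLINE_STORAGE = ONLINE_STORAGE | {"NFS", "Portworx"}


def recommend_backup_mode(storage_type: str, services_support_online: bool) -> str:
    """Return "online", "offline", or "unsupported" (hypothetical helper).

    services_support_online is an assumption supplied by the caller: True only
    if every installed service supports online backups.
    """
    # Prefer online backups when both the storage and the services allow it.
    if storage_type in ONLINE_STORAGE and services_support_online:
        return "online"
    # Otherwise fall back to offline backups if the storage type allows them.
    if storage_type in OFFLINE_STORAGE:
        return "offline"
    return "unsupported"
```

For example, a cluster on IBM Spectrum Scale whose installed services all support online backups yields "online", while a cluster on Portworx yields "offline" regardless of service support.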

Supported backup and restore scenarios

The following backup and restore scenarios are supported:

  • Online backup and restore of a Cloud Pak for Data instance project (namespace) on the same cluster, by using CSI snapshots
  • Online backup and restore of a Cloud Pak for Data deployment (the IBM Cloud Pak® foundational services and IBM Cloud Pak for Data platform operator project, and the Cloud Pak for Data instance project) to a different cluster, by using CSI snapshots
  • Offline backup and restore of a Cloud Pak for Data instance project on the same cluster, by using CSI snapshots or Restic backups
  • Offline backup and restore of a Cloud Pak for Data deployment (the IBM Cloud Pak foundational services and IBM Cloud Pak for Data platform operator project, and the Cloud Pak for Data instance project) to a different cluster, by using Restic backups
  • Offline backup and restore of a Cloud Pak for Data instance project's volume data to the same instance on the same cluster, by using volume snapshots or volume backups
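
The scenarios listed above can also be captured as a lookup table, which may be convenient when validating a backup plan in a script. This is a minimal sketch: the Scenario type, the SUPPORTED table, and the methods_for function are hypothetical names, and the labels are shorthand for the wording in the list, not identifiers from any Cloud Pak for Data tool.

```python
from typing import List, NamedTuple


class Scenario(NamedTuple):
    mode: str    # "online" or "offline"
    scope: str   # "instance project", "deployment", or "volume data"
    target: str  # "same cluster" or "different cluster"


# One entry per supported scenario in the list above.
SUPPORTED = {
    Scenario("online", "instance project", "same cluster"): ["CSI snapshots"],
    Scenario("online", "deployment", "different cluster"): ["CSI snapshots"],
    Scenario("offline", "instance project", "same cluster"): ["CSI snapshots", "Restic backups"],
    Scenario("offline", "deployment", "different cluster"): ["Restic backups"],
    Scenario("offline", "volume data", "same cluster"): ["volume snapshots", "volume backups"],
}


def methods_for(mode: str, scope: str, target: str) -> List[str]:
    """Return the backup methods for a scenario, or an empty list if it is not listed."""
    return SUPPORTED.get(Scenario(mode, scope, target), [])
```

An empty result means only that the combination does not appear in the list above. Individual services can still impose their own restrictions.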

Some Cloud Pak for Data services do not support online backups, offline backups, or both. Many of these services have their own backup and restore process. However, after a restore operation, you might have to clean up, delete, and reinstall some of these services.

Migrating data between Cloud Pak for Data installations

In addition to backing up and restoring Cloud Pak for Data, you can export a service's data and metadata from one Cloud Pak for Data installation and import them into another Cloud Pak for Data installation. For more information, see Migrating data between Cloud Pak for Data installations.