Migrating data between Cloud Pak for Data installations

Important: IBM® Cloud Pak for Data Version 4.5 will reach end of support (EOS) on 31 July 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.

Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.5 reaches end of support. For more information, see Upgrading IBM Software Hub in the IBM Software Hub Version 5.1 documentation.

Use the product data export and import utility to export data, including metadata, from one IBM Cloud Pak for Data installation and import the data to another Cloud Pak for Data installation.

The cpd-cli export-import command line interface can export and import IBM Cloud Pak for Data control plane data (including user accounts, roles, and global connections).

Who needs to complete this task?

A cluster administrator must run the command that initializes the export-import utility.

An instance administrator can run all other cpd-cli export-import commands.

Before you run any cpd-cli export-import commands, ensure that your cpd-cli profile and installation environment variables are set up.

The following commands use an example profile that is named default and an example Cloud Pak for Data project that is defined by the ${PROJECT_CPD_INSTANCE} environment variable.
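For example, a minimal setup sketch might look like the following. The project name cpd-instance is a placeholder; substitute the actual project (namespace) where Cloud Pak for Data is installed:

```shell
# Placeholder value -- replace with the project (namespace) where
# Cloud Pak for Data is installed.
export PROJECT_CPD_INSTANCE=cpd-instance

# The PVC name that the init examples use is derived from the project name.
echo "${PROJECT_CPD_INSTANCE}-pvc"
```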

Initializing the export-import command

Note: The following examples use the recommended installation environment variables.

It is strongly recommended that you use a script to create environment variables with the correct values for your environment. For details, see Best practice: Setting up installation environment variables.

Red Hat® OpenShift® version 4.x:

cpd-cli export-import init \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m) \
--pvc-name=${PROJECT_CPD_INSTANCE}-pvc \
--profile=default \
--image-prefix=image-registry.openshift-image-registry.svc:5000/${PROJECT_CPD_INSTANCE}

Red Hat OpenShift version 3.11:

cpd-cli export-import init \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m) \
--pvc-name=${PROJECT_CPD_INSTANCE}-pvc \
--profile=default \
--image-prefix=docker-registry.default.svc:5000/${PROJECT_CPD_INSTANCE}

To list the registered auxiliary modules such as export and import, enter the following command:

cpd-cli export-import list aux-modules \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m)

Exporting data with the export-import command

Note: The following examples use the recommended installation environment variables.

It is strongly recommended that you use a script to create environment variables with the correct values for your environment. For details, see Best practice: Setting up installation environment variables.

To export data from Cloud Pak for Data to myexport1:

cpd-cli export-import export create myexport1 \
--namespace=${PROJECT_CPD_INSTANCE} \
--profile=default \
--arch=$(uname -m) 

To check whether the myexport1 job succeeded, failed, or is still in progress:

cpd-cli export-import export status myexport1 \
--namespace=${PROJECT_CPD_INSTANCE} \
--profile=default

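If you script the export, you can poll the status command until the job finishes. The sketch below stubs the status call with echo so that it is runnable as-is, because the exact status output can vary by version; in a real script, replace check_status with the cpd-cli export-import export status command shown above and match the status string that your version prints:

```shell
# Stub standing in for:
#   cpd-cli export-import export status myexport1 --namespace=${PROJECT_CPD_INSTANCE} --profile=default
# Replace the echo with the real command; "Complete" is an assumed status string.
check_status() {
  echo "Complete"
}

# Poll until the export job reports completion.
until check_status | grep -q "Complete"; do
  echo "export still running; waiting..."
  sleep 30
done
echo "export finished"
```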
To export data from Cloud Pak for Data to myexport2 by using a scheduled export job at minute 0 past every 12th hour:

cpd-cli export-import schedule-export create myexport2 \
--namespace=${PROJECT_CPD_INSTANCE} \
--schedule="0 */12 * * *" \
--arch=$(uname -m) \
--profile=default
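The --schedule value is a standard five-field cron expression (minute, hour, day of month, month, day of week). The sketch below splits the example expression into its fields to show how it is read:

```shell
# "0 */12 * * *" -> run at minute 0 of every 12th hour (00:00 and 12:00).
SCHEDULE="0 */12 * * *"

# Split into the five cron fields: minute, hour, day-of-month, month, day-of-week.
set -f                 # disable globbing so '*' stays literal
set -- $SCHEDULE
NFIELDS=$#
echo "minute=$1 hour=$2 day-of-month=$3 month=$4 day-of-week=$5"
```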

To check whether the scheduled myexport2 job succeeded, failed, or is still in progress:

cpd-cli export-import schedule-export status myexport2 \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m) \
--profile=default

To retrieve the logs for the myexport1 export:

cpd-cli export-import export logs myexport1 \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m) \
--profile=default

To download the exported data as a TAR archive to the current working directory:

cpd-cli export-import export download myexport1 \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m) \
--profile=default 
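The download produces a TAR archive, for example cpd-exports-myexport1-20200301101735-data.tar, in the current directory. Before you transfer it to another cluster, you can list its contents with tar. The sketch below builds a dummy archive so that it is runnable as-is; replace the filename with your real export archive:

```shell
# Build a dummy archive standing in for a real export archive.
mkdir -p demo-export && echo "sample" > demo-export/data.txt
tar -cf cpd-exports-demo-data.tar demo-export

# List the archive contents before transferring it to another cluster.
tar -tf cpd-exports-demo-data.tar
```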

To upload the exported archive to a different cluster before you start the import, the target cluster must have the same cpd-cli export-import environment set up. After the upload succeeds, you can import on the target cluster by using the same namespace:

cpd-cli export-import export upload \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m) \
--profile=default \
--file=cpd-exports-myexport1-20200301101735-data.tar

To pass override and custom values to a specific auxiliary module, use the --file flag. The top-level key in the file must be the auxiliary module name (cpdfwk.module):

cpd-cli export-import export create myexport1 \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m) \
--profile=default \
--file=overridevalues.yaml

Example content in the overridevalues.yaml file with the sample auxiliary module's specific key values:

sample-aux:
  pvc1: testpvc1
  pvc2: testpvc2
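If you script the export, you can generate the override file with a heredoc. The keys sample-aux, pvc1, and pvc2 are the sample module's keys from the example above:

```shell
# Write the override file that is passed with --file=overridevalues.yaml.
# The top-level key (sample-aux) must match the auxiliary module name.
cat > overridevalues.yaml <<'EOF'
sample-aux:
  pvc1: testpvc1
  pvc2: testpvc2
EOF

cat overridevalues.yaml
```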

Importing data with the export-import command

Note: The following examples use the recommended installation environment variables.

It is strongly recommended that you use a script to create environment variables with the correct values for your environment. For details, see Best practice: Setting up installation environment variables.

An export must complete successfully before you can run an import. Because only one import job is allowed at a time, you must delete the completed import job before you start a new one.

To import Cloud Pak for Data data from the myexport1 example:

cpd-cli export-import import create myimport1 \
--from-export=myexport1 \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m) \
--profile=default

To import Cloud Pak for Data data from the scheduled myexport2 example:

cpd-cli export-import import create myimport1 \
--from-schedule=myexport2 \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m) \
--profile=default

To check whether the myimport1 job succeeded, failed, or is still in progress:

cpd-cli export-import import status myimport1 \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m) \
--profile=default

Stopping export-import jobs

Note: The following examples use the recommended installation environment variables.

It is strongly recommended that you use a script to create environment variables with the correct values for your environment. For details, see Best practice: Setting up installation environment variables.

To delete the myexport1 job in the ${PROJECT_CPD_INSTANCE} namespace without purging the exported data that is stored in the volume:

cpd-cli export-import export delete myexport1 \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m) \
--profile=default

To delete the myexport1 job and purge the exported data that is stored in the volume in the ${PROJECT_CPD_INSTANCE} project:

cpd-cli export-import export delete myexport1 \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m) \
--profile=default \
--purge

To delete the scheduled myexport2 job and purge the exported data that is stored in the volume in the ${PROJECT_CPD_INSTANCE} project:

cpd-cli export-import schedule-export delete myexport2 \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m) \
--profile=default \
--purge

To delete the myimport1 job:

cpd-cli export-import import delete myimport1 \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m) \
--profile=default

To force the cleanup of any Kubernetes resources that were previously created by cpd-cli export-import and reinitialize with a different PVC:

cpd-cli export-import reset \
--namespace=${PROJECT_CPD_INSTANCE} \
--arch=$(uname -m) \
--profile=default \
--force

cpd-cli export-import init \
--namespace=${PROJECT_CPD_INSTANCE} \
--pvc-name=pvc2 \
--profile=default \
--arch=$(uname -m)