Migrating data between Cloud Pak for Data installations
Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.8 reaches end of support. For more information, see Upgrading from IBM Cloud Pak for Data Version 4.8 to IBM Software Hub Version 5.1.
Use the product data export and import utility to export data, including metadata, from one IBM Cloud Pak for Data installation and import the data to another Cloud Pak for Data installation.
The cpd-cli export-import command-line interface can export and import Cloud Pak for Data service-specific data (when the services support the command). For more information, see Services that support cpd-cli export-import.
Who needs to complete this task?
- Cluster administrator: A cluster administrator must run the command that initializes the export import utility.
- Instance administrator: An instance administrator can run all other cpd-cli export-import commands.
Before you begin
Complete the following tasks before you run the cpd-cli export-import commands:
- Set up a client workstation to install Cloud Pak for Data.
- Create a profile to use the management commands.
- Prepare to use the export and import utility.
Ensure that you source the environment variables before you run the commands in this task.
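As a quick sanity check before running any command, you can confirm that the variables you expect are actually set in the current shell. The following is a minimal sketch; the helper name require_vars is hypothetical and is not part of cpd-cli.

```shell
# Hypothetical helper (not part of cpd-cli): report any named
# environment variable that is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: check the project variable that every command in this task uses.
# require_vars PROJECT_CPD_INST_OPERANDS || echo "Source your environment variables first."
```

The function returns a nonzero exit code when at least one variable is missing, so it can gate a script that runs the later commands.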
Initializing the export import utility
You must initialize the export import utility before you run any cpd-cli export-import commands.
- Set the CPU_ARCH environment variable based on the hardware on your Red Hat OpenShift Container Platform cluster:
  - For x86-64 hardware, run:
    export CPU_ARCH=x86_64
  - For Power® hardware, run:
    export CPU_ARCH=ppc64le
- Set the CPD_PROFILE_NAME environment variable to the name of the profile that you created in Creating a profile to use the management commands.
  export CPD_PROFILE_NAME=<my-profile-name>
- Run the appropriate command for your environment:
  The cluster pulls images from a private container registry
  Restriction: This option is available only if an administrator completed Installing the Cloud Pak for Data command-line interface (cpd-cli).
  cpd-cli export-import init \
  --namespace=${PROJECT_CPD_INST_OPERANDS} \
  --arch=${CPU_ARCH} \
  --pvc-name=export-import-pvc \
  --profile=${CPD_PROFILE_NAME} \
  --image-prefix=${PRIVATE_REGISTRY_LOCATION}
  The cluster pulls images from the IBM Entitled Registry
  Restriction: This option is available only if the cluster can connect to the internet.
  cpd-cli export-import init \
  --namespace=${PROJECT_CPD_INST_OPERANDS} \
  --arch=${CPU_ARCH} \
  --pvc-name=export-import-pvc \
  --profile=${CPD_PROFILE_NAME} \
  --image-prefix=icr.io/cpopen/cpd
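If you are unsure which hardware the cluster runs on, the node architecture that Kubernetes reports can be translated into a CPU_ARCH value. The sketch below assumes that nodes report amd64 for x86-64 hardware and ppc64le for Power hardware; the helper name node_arch_to_cpu_arch is hypothetical, and only the two architectures named in this task are mapped.

```shell
# Hypothetical helper (not part of cpd-cli): translate a Kubernetes node
# architecture label into one of the CPU_ARCH values used in this task.
node_arch_to_cpu_arch() {
  case "$1" in
    amd64)   echo "x86_64" ;;
    ppc64le) echo "ppc64le" ;;
    *)
      # Any other value is outside the two cases that this task covers.
      echo "Architecture not covered by this task: $1" >&2
      return 1
      ;;
  esac
}

# Example (requires a logged-in oc session, so it is not run here):
# node_arch=$(oc get nodes -o jsonpath='{.items[0].status.nodeInfo.architecture}')
# export CPU_ARCH=$(node_arch_to_cpu_arch "$node_arch")
```

The example reads only the first node, which is sufficient when all nodes in the cluster share one architecture.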
List the available auxiliary modules
When you install a service that uses the cpd-cli export-import commands, the service installs a service-specific auxiliary module. For details, see Services that support cpd-cli export-import.
Run the following command to determine which auxiliary modules are installed:
cpd-cli export-import list aux-modules \
--namespace=${PROJECT_CPD_INST_OPERANDS} \
--profile=${CPD_PROFILE_NAME} \
--arch=${CPU_ARCH}
Exporting data
The following commands provide several examples of how you can export data from an instance of Cloud Pak for Data.
- Export data from Cloud Pak for Data to myexport1:
  cpd-cli export-import export create myexport1 \
  --namespace=${PROJECT_CPD_INST_OPERANDS} \
  --profile=${CPD_PROFILE_NAME} \
  --arch=${CPU_ARCH}
- Check whether the myexport1 job succeeded, failed, or is still in progress:
  cpd-cli export-import export status myexport1 \
  --namespace=${PROJECT_CPD_INST_OPERANDS} \
  --profile=${CPD_PROFILE_NAME} \
  --arch=${CPU_ARCH}
- Retrieve the logs for the myexport1 export:
  cpd-cli export-import export logs myexport1 \
  --namespace=${PROJECT_CPD_INST_OPERANDS} \
  --profile=${CPD_PROFILE_NAME} \
  --arch=${CPU_ARCH}
- Export data from Cloud Pak for Data to myexport2 by using a scheduled export job that runs at minute 0 past every 12th hour:
  cpd-cli export-import schedule-export create myexport2 \
  --namespace=${PROJECT_CPD_INST_OPERANDS} \
  --schedule="0 */12 * * *" \
  --profile=${CPD_PROFILE_NAME} \
  --arch=${CPU_ARCH}
- Check whether the scheduled myexport2 job succeeded, failed, or is still in progress:
  cpd-cli export-import schedule-export status myexport2 \
  --namespace=${PROJECT_CPD_INST_OPERANDS} \
  --profile=${CPD_PROFILE_NAME} \
  --arch=${CPU_ARCH}
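The schedule value in the scheduled export example reads like a standard five-field cron expression. Assuming that format, the following sketch splits the expression into its fields so that you can see which field controls what; the helper name describe_cron is hypothetical.

```shell
# Hypothetical helper: label the fields of a five-field cron expression.
describe_cron() {
  # Turn off globbing so that '*' fields are not expanded into file names.
  set -f
  set -- $1
  set +f
  if [ "$#" -ne 5 ]; then
    echo "Expected 5 fields, got $#" >&2
    return 1
  fi
  echo "minute=$1 hour=$2 day-of-month=$3 month=$4 day-of-week=$5"
}

describe_cron "0 */12 * * *"
# prints: minute=0 hour=*/12 day-of-month=* month=* day-of-week=*
```

In standard cron syntax, an hour field of */12 matches hours 0 and 12, so this schedule runs twice a day on the hour.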
Downloading exported data
The following command provides an example of how you can download the data that you exported so that you can migrate the data to another cluster. The exported data is saved to a compressed file.
- Download data from Cloud Pak for Data to a TAR file in the current working directory:
  cpd-cli export-import export download myexport1 \
  --namespace=${PROJECT_CPD_INST_OPERANDS} \
  --profile=${CPD_PROFILE_NAME} \
  --arch=${CPU_ARCH}
Uploading exported data
The following command provides an example of how you can upload the contents of the compressed export file to a different cluster (the target cluster).
The cpd-cli export-import utility must be installed and initialized on the target cluster before you upload the exported data.
- Upload data from a compressed TAR file:
  cpd-cli export-import export upload \
  --namespace=${PROJECT_CPD_INST_OPERANDS} \
  --profile=${CPD_PROFILE_NAME} \
  --arch=${CPU_ARCH} \
  --file=cpd-exports-myexport1-20200301101735-data.tar
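When several downloads of the same export exist, it helps to pick the newest archive before uploading. The sketch below assumes that downloaded files follow the pattern cpd-exports-<name>-<timestamp>-data.tar, which is inferred only from the example file name above; the helper name latest_export_tar is hypothetical.

```shell
# Hypothetical helper: print the newest downloaded archive for an export name.
# Assumes archives are named cpd-exports-<name>-<timestamp>-data.tar.
latest_export_tar() {
  export_name=$1
  dir=${2:-.}
  newest=""
  for f in "$dir"/cpd-exports-"$export_name"-*-data.tar; do
    # Skip the unexpanded pattern when nothing matches.
    [ -e "$f" ] || continue
    # Glob results are sorted, so the last match carries the latest timestamp.
    newest=$f
  done
  [ -n "$newest" ] || return 1
  echo "$newest"
}

# Example: list the archive contents to confirm that the file is readable.
# tar -tf "$(latest_export_tar myexport1)"
```

The resulting path can then be passed to the --file option of the upload command.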
Importing data
The export must be completed successfully before you can run an import. Because only one import job is allowed at a time, you must always delete the completed import job to start a new one.
To import Cloud Pak for Data data from the myexport1 example:
cpd-cli export-import import create myimport1 \
--from-export=myexport1 \
--namespace=${PROJECT_CPD_INST_OPERANDS} \
--profile=${CPD_PROFILE_NAME} \
--arch=${CPU_ARCH}
To import Cloud Pak for Data data from the scheduled myexport2 example:
cpd-cli export-import import create myimport1 \
--from-schedule=myexport2 \
--namespace=${PROJECT_CPD_INST_OPERANDS} \
--profile=${CPD_PROFILE_NAME} \
--arch=${CPU_ARCH}
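Because only one import job is allowed at a time, starting a second import means deleting the completed one first. The following sketch chains the import delete and import create commands from this topic; the helper name replace_import and the CPD_CLI variable are hypothetical conveniences, not part of cpd-cli.

```shell
# Command to invoke; overridable so that the sequence can be dry-run with echo.
CPD_CLI=${CPD_CLI:-cpd-cli}

# Hypothetical helper: delete a completed import job, then create a new
# import job with the same name from the named export.
replace_import() {
  import_name=$1
  export_name=$2
  # Stop if the delete fails, for example when no previous job exists.
  "$CPD_CLI" export-import import delete "$import_name" \
    --namespace="${PROJECT_CPD_INST_OPERANDS}" \
    --profile="${CPD_PROFILE_NAME}" \
    --arch="${CPU_ARCH}" || return 1
  "$CPD_CLI" export-import import create "$import_name" \
    --from-export="$export_name" \
    --namespace="${PROJECT_CPD_INST_OPERANDS}" \
    --profile="${CPD_PROFILE_NAME}" \
    --arch="${CPU_ARCH}"
}

# Example: replace_import myimport1 myexport1
```

Setting CPD_CLI=echo prints the two commands instead of running them, which is a simple way to review the sequence first.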
To check whether the myimport1 job succeeded, failed, or is still in progress:
cpd-cli export-import import status myimport1 \
--namespace=${PROJECT_CPD_INST_OPERANDS} \
--profile=${CPD_PROFILE_NAME} \
--arch=${CPU_ARCH}
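Instead of rerunning a status command by hand, you can poll it in a loop. This topic does not show the format of the status output, so the sketch below takes the success and failure patterns as arguments; those patterns, the helper name wait_for_status, and the return codes are all assumptions to adapt to what your status output actually prints.

```shell
# Hypothetical helper: rerun a status command until its output matches a pattern.
# Usage: wait_for_status SUCCESS_REGEX FAILURE_REGEX INTERVAL_SECONDS MAX_ATTEMPTS COMMAND [ARGS...]
# Returns 0 on success, 1 on failure, 2 if neither pattern matched in time.
wait_for_status() {
  success_pattern=$1
  failure_pattern=$2
  interval=$3
  max_attempts=$4
  shift 4
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Capture the status output; a nonzero exit code is tolerated and retried.
    output=$("$@" 2>&1) || true
    if printf '%s\n' "$output" | grep -Eq "$failure_pattern"; then
      printf '%s\n' "$output"
      return 1
    fi
    if printf '%s\n' "$output" | grep -Eq "$success_pattern"; then
      printf '%s\n' "$output"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  return 2
}

# Example (the two patterns are placeholders that you must supply):
# wait_for_status '<success-text>' '<failure-text>' 30 40 \
#   cpd-cli export-import import status myimport1 \
#   --namespace=${PROJECT_CPD_INST_OPERANDS} \
#   --profile=${CPD_PROFILE_NAME} \
#   --arch=${CPU_ARCH}
```

The failure pattern is checked first, so choose patterns that cannot both match the same output, for example when a summary prints a label for every state.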
Stopping export-import jobs
To delete the myexport1 job in the ${PROJECT_CPD_INST_OPERANDS} namespace without purging the exported data stored in the volume:
cpd-cli export-import export delete myexport1 \
--namespace=${PROJECT_CPD_INST_OPERANDS} \
--profile=${CPD_PROFILE_NAME} \
--arch=${CPU_ARCH}
To delete the myexport1 job and purge the exported data that is stored in the volume in the ${PROJECT_CPD_INST_OPERANDS} project:
cpd-cli export-import export delete myexport1 \
--namespace=${PROJECT_CPD_INST_OPERANDS} \
--profile=${CPD_PROFILE_NAME} \
--arch=${CPU_ARCH} \
--purge
To delete the scheduled myexport2 job and purge the exported data that is stored in the volume in the ${PROJECT_CPD_INST_OPERANDS} project:
cpd-cli export-import schedule-export delete myexport2 \
--namespace=${PROJECT_CPD_INST_OPERANDS} \
--profile=${CPD_PROFILE_NAME} \
--arch=${CPU_ARCH} \
--purge
To delete the myimport1 job:
cpd-cli export-import import delete myimport1 \
--namespace=${PROJECT_CPD_INST_OPERANDS} \
--profile=${CPD_PROFILE_NAME} \
--arch=${CPU_ARCH}
To force cleanup of any previous Kubernetes resources that were created by cpd-cli export-import, and then initialize the utility again with a different PVC:
cpd-cli export-import reset \
--namespace=${PROJECT_CPD_INST_OPERANDS} \
--profile=${CPD_PROFILE_NAME} \
--arch=${CPU_ARCH} \
--force
cpd-cli export-import init \
--image-prefix=${PRIVATE_REGISTRY_LOCATION}/${PROJECT_CPD_INST_OPERANDS} \
--namespace=${PROJECT_CPD_INST_OPERANDS} \
--pvc-name=pvc2 \
--profile=${CPD_PROFILE_NAME} \
--arch=${CPU_ARCH}