Importing and exporting Watson Machine Learning Accelerator data
Import and export lets you transfer the data associated with an IBM Watson® Machine Learning Accelerator service deployment from a source cluster to a target cluster. You can use the procedure, for example, to move data from a development cluster to a production cluster, or when an in-place upgrade is not possible or not desired. The following data is transferred:
- Training data
- Resource plan data (including custom resource plan data)
- Job history and history logs
- Metadata in the etcd datastore
- Watson Machine Learning Accelerator MongoDB data (not including MongoDB password credentials)
Supported scenarios for import and export:
- The import and export utility is only supported on clusters using the same hardware.
- The import and export utility is supported on clusters that have the same version of Watson Machine Learning Accelerator installed or if you are migrating Watson Machine Learning Accelerator from Cloud Pak for Data 4.5.2 or later to Cloud Pak for Data 4.7.
- The import and export utility is only supported on the same Watson Machine Learning Accelerator namespace.
- The import and export utility is supported on clusters that used the same installation method to install Watson Machine Learning Accelerator or if you are migrating Watson Machine Learning Accelerator from Cloud Pak for Data 4.5.2 or later to Cloud Pak for Data 4.7.
Limitations for import:
- Data can only be imported to a freshly installed Watson Machine Learning Accelerator cluster.
Limitations for export:
- Data in mydatafs is not exported. Make sure to upload your user data again after importing.
- You must be an administrator of the Red Hat® OpenShift® project for your source and target clusters to import and export data for your Watson Machine Learning Accelerator service.
- Ensure the OpenShift client is available and has access to the cluster.
- Ensure that the helm v3 utility is installed on the client machine.
Before you begin
This topic assumes that the following prerequisites are met:
Prerequisites
- Ensure that you have access to the cluster and that you have a profile to use the cpd-cli management commands. See: Creating a profile to use the cpd-cli management commands
- Ensure that a shared volume PVC named wmla-exim-pvc is available in the Watson Machine Learning Accelerator namespace to store the exported data. Make sure that it is ReadWriteMany and has enough disk space. See: Preparing to install the Cloud Pak for Data export and import utility
- Ensure that the cpdtool image is installed and is accessible in the image registry from OpenShift. See: Installing the image for export and import utility
- Ensure that the export-import settings for the Watson Machine Learning Accelerator namespace are installed. See: Initializing the export-import command
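The client-side prerequisites above can be checked before you start. The following is a minimal sketch, not part of the product tooling: the function names missing_tools and pvc_is_rwx are hypothetical helpers, and the commented usage assumes that oc, helm, and cpd-cli are the executable names on your client machine.

```shell
# Print each tool that is not found on PATH, one per line.
# Returns non-zero if at least one tool is missing.
missing_tools() {
  local status=0
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Succeed only if the given access-mode string contains ReadWriteMany.
pvc_is_rwx() {
  case "$1" in
    *ReadWriteMany*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage (commented out so that the block has no side effects):
# missing_tools oc helm cpd-cli || echo "Install the tools listed above first."
# modes=$(oc get pvc wmla-exim-pvc -n "${PROJECT_CPD_INSTANCE}" -o jsonpath='{.spec.accessModes}')
# pvc_is_rwx "$modes" || echo "wmla-exim-pvc is not ReadWriteMany"
```

The check covers only tool availability and the PVC access mode. It does not verify disk space, the cpdtool image, or the export-import settings.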
Import and export topics
Downloading and installing the Watson Machine Learning Accelerator auxiliary component
Download and install the Watson Machine Learning Accelerator auxiliary component. Ensure that you meet the Prerequisites.
Download auxiliary component package
Download the Watson Machine Learning Accelerator auxiliary component from a public GitHub repository.
- Obtain wml-accelerator-aux-4.3.0.tgz from a public GitHub repository using the following curl command:
curl -sL https://github.com/IBM/wmla-assets/releases/download/v4.7.x/wml-accelerator-aux-4.3.0-x86_64.tgz -o wml-accelerator-aux-4.3.0.tgz
Note: To download earlier versions of the auxiliary component, see available versions on GitHub.
- To check the version details of the downloaded file, run:
helm show chart wml-accelerator-aux-4.3.0.tgz
The downloaded file has the following details:
apiVersion: v1
appVersion: 4.3.0
description: A Helm chart for WML-accelerator Auxillary
name: wml-accelerator-aux
version: 4.3.0
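If you script the download, the version check can be automated by reading a single field from the chart metadata. This is a sketch under the assumption that the output of helm show chart is a flat list of "key: value" lines, as in the details shown above. The function name chart_field is hypothetical.

```shell
# Print the value of a top-level key (for example "version") from
# `helm show chart` output supplied on standard input.
chart_field() {
  awk -v key="$1" -F': *' '$1 == key { print $2; exit }'
}

# Example usage (commented out; requires helm and the downloaded file):
# version=$(helm show chart wml-accelerator-aux-4.3.0.tgz | chart_field version)
# [ "$version" = "4.3.0" ] || echo "Unexpected chart version: $version"
```

The exact key match keeps appVersion and version distinct, so asking for version does not return the appVersion line.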
Install auxiliary component package
- Install the Helm chart using a Helm v3 client:
- If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
helm install wmla-aux ./wml-accelerator-aux-4.3.0.tgz -n ${PROJECT_CPD_INSTANCE}
- To verify that the Helm chart is installed, run:
- If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
cpd-cli export-import list aux-modules --namespace ${PROJECT_CPD_INSTANCE} --profile profile-name
If successfully installed, wmla-aux is listed in the output:
ID        NAME                 COMPONENT        KIND  VERSION  ARCHITECTURE  NAMESPACE     VENDOR  SI
wmla-aux  wml-accelerator-aux  wml_accelerator  exim  4.3.0    x86_64        cpd-instance  ibm     N
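The listing check can also be scripted. The sketch below assumes that the aux-modules listing is whitespace-separated with the module ID in the first column, as in the sample output above. The function name aux_module_listed is hypothetical.

```shell
# Succeed if a module with the given ID appears in the first column of
# the aux-modules listing supplied on standard input.
aux_module_listed() {
  awk -v id="$1" '$1 == id { found = 1 } END { exit !found }'
}

# Example usage (commented out; requires cpd-cli and cluster access):
# cpd-cli export-import list aux-modules --namespace "${PROJECT_CPD_INSTANCE}" \
#   --profile profile-name | aux_module_listed wmla-aux \
#   || echo "wmla-aux is not installed"
```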
Exporting your data from a source cluster
Before exporting data
- Data in mydatafs is not exported. Make sure to upload your user data again after importing.
To upload your user data again, see: Import data into Cloud Pak for Data to be used by Watson Machine Learning Accelerator
- Stop all Watson Machine Learning Accelerator workloads.
- Shut down the notebook server for all users.
- Stop all deployed inference models.
Exporting data
To export Watson Machine Learning Accelerator data and download exported data to your current working directory, complete the steps as follows:
- To export data from the Watson Machine Learning Accelerator instance in the Cloud Pak for Data namespace, run:
- If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
cpd-cli export-import export create myexport1 --component wml_accelerator --namespace ${PROJECT_CPD_INSTANCE} --profile=profile-name --arch $(uname -m) --log-level=debug --verbose
- To download the exported data as a tar file to the current working directory, run:
- If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
cpd-cli export-import export download myexport1 -n ${PROJECT_CPD_INSTANCE} --profile=profile-name
- Verify the status of an export operation:
- If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
cpd-cli export-import export status myexport1 -n ${PROJECT_CPD_INSTANCE} --profile=profile-name
Name: myexport1
Job Name: cpd-ex-myexport1
Active: 0
Succeeded: 1
Failed: 0
Start Time: Thu, 30 Jun 2022 22:56:19 -0700
Completed At: Thu, 30 Jun 2022 22:59:21 -0700
Duration: 3m2s
- List all past export operations:
- If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
cpd-cli export-import export list -n ${PROJECT_CPD_INSTANCE} --profile=profile-name
NAME       KIND     LAST MODIFIED         LAST EXPORT
bx1        exports  2022-07-01T05:56:27Z  20220701055627
myexport1  exports  2022-06-30T04:51:48Z  20220630045148
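When the export runs from a script, the status output can be evaluated automatically before you move on to the import. The sketch below assumes that the status output contains "Succeeded:" and "Failed:" lines in the form shown in the export status sample above. The function name export_succeeded is hypothetical.

```shell
# Succeed if the export status text on standard input reports at least
# one succeeded job and no failed jobs.
export_succeeded() {
  awk '
    { sub(/^[ \t]+/, "") }            # tolerate indented output
    /^Succeeded:/ { ok = $2 + 0 }
    /^Failed:/    { bad = $2 + 0 }
    END { exit !(ok >= 1 && bad == 0) }
  '
}

# Example usage (commented out; requires cpd-cli and cluster access):
# cpd-cli export-import export status myexport1 -n "${PROJECT_CPD_INSTANCE}" \
#   --profile=profile-name | export_succeeded \
#   || echo "Export myexport1 has not completed successfully"
```

An export that is still running (Active: 1, Succeeded: 0) is treated as not yet successful, so the check is safe to repeat until it passes.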
Importing your data to a target cluster
Before importing data
- Data can only be imported to a freshly installed Watson Machine Learning Accelerator cluster.
Importing data
- Upload the exported data:
- If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
cpd-cli export-import export upload -f cpd-exports-myexport1-data.tar -n ${PROJECT_CPD_INSTANCE} --profile=import-profile
- Check the export list:
cpd-cli export-import export list --profile=import-profile
- Import the data. For example, to create an import named myimport1 from the export myexport1, run:
- If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
cpd-cli export-import import create myimport1 --from-export myexport1 -n ${PROJECT_CPD_INSTANCE} --profile=import-profile
- Verify the import status. For example, for the job myimport1, run:
- If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
cpd-cli export-import import status myimport1 --profile profile-name -n ${PROJECT_CPD_INSTANCE}
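An import can take several minutes, so a script might poll the status instead of checking once. The following is a generic retry sketch. The function name wait_until is hypothetical, and the commented usage assumes that the import status output contains a "Succeeded:" line like the export status output does, which this topic does not confirm.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
wait_until() {
  local attempts=$1
  local delay=$2
  local i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Sleep only if another attempt follows.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example usage (commented out; requires cpd-cli and cluster access).
# The grep pattern is an assumption about the import status format.
# import_done() {
#   cpd-cli export-import import status myimport1 --profile profile-name \
#     -n "${PROJECT_CPD_INSTANCE}" | grep -q 'Succeeded:[[:space:]]*1'
# }
# wait_until 30 20 import_done || echo "Import did not finish in time"
```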
Verify imported data
- Check that all pods are running.
- Open the Watson Machine Learning Accelerator console and list history jobs.
- Verify that your deployed models are running. If needed, restart them and check the status again to ensure they are running. See: View deployed models in Watson Machine Learning Accelerator
- Submit a test job.
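The first verification step, checking that all pods are running, can be sketched as a filter over the output of oc get pods. The function name unhealthy_pods is hypothetical. Treating Completed pods as healthy is an assumption made here so that finished job pods are not reported, and the column layout (NAME, READY, STATUS, ...) is the default oc get pods format.

```shell
# Read `oc get pods --no-headers` output on standard input and print
# the name of every pod whose STATUS is neither Running nor Completed.
unhealthy_pods() {
  awk 'NF >= 3 && $3 != "Running" && $3 != "Completed" { print $1 }'
}

# Example usage (commented out; requires oc and cluster access):
# oc get pods -n "${PROJECT_CPD_INSTANCE}" --no-headers | unhealthy_pods
```

Empty output means that no pod was flagged. The remaining checks (job history, deployed models, and a test job) still need to be done in the Watson Machine Learning Accelerator console.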