Importing and exporting Watson Machine Learning Accelerator data
Import and export lets you transfer all data associated with an IBM Watson® Machine Learning Accelerator service deployment from a source cluster to a target cluster. You can use this procedure, for example, to move data from a development cluster to a production cluster, or when in-place upgrades are not possible or desired. The following data is transferred:
- Training jobs (including batch jobs and HPO jobs)
- Resource plan data (including custom resource plan data)
- Job history and history logs
- HPO data
- EDI models and runtime
Supported scenarios for using import and export:
- The import and export utility is only supported when the source and target clusters use the same hardware architecture.
- The export utility is supported if you are migrating Watson Machine Learning Accelerator from Cloud Pak for Data 4.7 or later.
- The import and export utility is supported whether Watson Machine Learning Accelerator is installed in the same namespace as Cloud Pak for Data or in a different namespace.
Limitations for import:
- Data can only be imported to a freshly installed Watson Machine Learning Accelerator cluster.
- Users are not imported. After importing, make sure to create new users with the same roles as the exported cluster to allow users to retrieve their existing data.
- After migration, you must resubmit any previously pending jobs. Pending jobs are deleted after migration.
Limitations for export:
- Data in `mydatafs` is not exported. Make sure to upload your user data again after importing.
- You must be an administrator of the Red Hat® OpenShift® project for your source and target clusters to import and export data for your Watson Machine Learning Accelerator service.
- Ensure the OpenShift client is available and has access to the cluster.
- Ensure that the helm v3 utility is installed on the client machine.
Before you begin
This topic assumes that the following prerequisites are met:
Prerequisites
- Ensure that you have access to the cluster and that you have a profile to use the cpd-cli management commands. See: Creating a profile to use the cpd-cli management commands
- Ensure that a shared volume PVC named `wmla-exim-pvc` is available in the Watson Machine Learning Accelerator namespace to store the exported data. Make sure that it is ReadWriteMany and has enough disk space. See: Preparing to install the Cloud Pak for Data export and import utility
- Ensure that the cpdtool image is installed and that it is accessible in the image registry from OpenShift. See: Installing the image for export and import utility
- Ensure that the export-import settings for the Watson Machine Learning Accelerator namespace are installed. See: Initializing the export-import command
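The prerequisite checks above can be sketched as a small preflight script. This is a minimal sketch, not part of the product: the function names (`check_cmd`, `preflight`) are hypothetical, while `wmla-exim-pvc` and `${PROJECT_CPD_INSTANCE}` follow this document.

```shell
#!/bin/sh
# Hypothetical preflight check for the export/import prerequisites.

check_cmd() {
  # Return 0 if the given command is on PATH, 1 otherwise.
  command -v "$1" >/dev/null 2>&1
}

preflight() {
  ns="$1"
  # Confirm the required clients are available.
  for tool in oc helm cpd-cli; do
    if check_cmd "$tool"; then
      echo "ok: $tool found"
    else
      echo "missing: $tool"
    fi
  done
  # Confirm the shared PVC exists and is ReadWriteMany (needs cluster access).
  if check_cmd oc; then
    oc get pvc wmla-exim-pvc -n "$ns" \
      -o jsonpath='{.spec.accessModes}' 2>/dev/null | grep -q ReadWriteMany \
      && echo "ok: wmla-exim-pvc is ReadWriteMany" \
      || echo "warn: wmla-exim-pvc missing or not ReadWriteMany"
  fi
}
```

Run it against your instance namespace, for example `preflight "${PROJECT_CPD_INSTANCE}"`.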
Import and export topics
Downloading and installing the Watson Machine Learning Accelerator auxiliary component
Download and install the Watson Machine Learning Accelerator auxiliary component. Ensure that you meet the Prerequisites.
Download auxiliary component package
Download the Watson Machine Learning Accelerator auxiliary component from a public GitHub repository.
- Obtain the `wml-accelerator-aux-5.0.2.tgz` file from a public GitHub repository using the following curl command:

  ```
  curl -sL https://github.com/IBM/wmla-assets/releases/download/v5.0.2/wml-accelerator-aux-5.0.2-x86_64.tgz -o wml-accelerator-aux-5.0.2.tgz
  ```

  Note: To download current versions of the auxiliary component, see GitHub.
- Check the version details of the downloaded file, run:

  ```
  helm show chart wml-accelerator-aux-5.0.2.tgz
  ```

  The downloaded file has the following details:

  ```
  apiVersion: v1
  appVersion: 5.0.2
  description: A Helm chart for WML-accelerator Auxillary
  name: wml-accelerator-aux
  version: 5.0.2
  ```
Install auxiliary component package
- Install the helm chart using a helm v3 client, run:

  ```
  helm install wmla-aux ./wml-accelerator-aux-5.0.2.tgz -n ${PROJECT_CPD_INSTANCE}
  ```

- To verify that the helm chart is installed, run:

  ```
  cpd-cli export-import list aux-modules --namespace ${PROJECT_CPD_INSTANCE} --profile profile-name
  ```

  If successfully installed, wmla-aux is listed in the output:

  ```
  ID        NAME                 COMPONENT        KIND  VERSION  ARCHITECTURE  NAMESPACE     VENDOR  SI
  wmla-aux  wml-accelerator-aux  wml_accelerator  exim  5.0.2    x86_64        cpd-instance  ibm     N
  ```
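The verification step above can be scripted. This is a minimal sketch under the assumption that the module ID printed by the list command is exactly `wmla-aux`, as in the sample output; the helper name `aux_installed` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: confirm the wmla-aux module appears in the
# `cpd-cli export-import list aux-modules` output.

aux_installed() {
  # Read the list output on stdin; succeed if a line starts with wmla-aux.
  grep -q '^wmla-aux[[:space:]]'
}
```

Example usage: `cpd-cli export-import list aux-modules --namespace ${PROJECT_CPD_INSTANCE} --profile profile-name | aux_installed && echo "wmla-aux installed"`.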
Exporting your data from a source cluster
Before exporting data
- Data in `mydatafs` is not exported. Make sure to upload your user data again after importing. To upload your user data again, see: Importing data.
- Stop all Watson Machine Learning Accelerator workloads.
- Shut down the notebook servers for all users.
- Stop all deployed inference models.
Exporting data
To export Watson Machine Learning Accelerator data and download the exported data to your current working directory, complete the following steps:
- To export data from the Watson Machine Learning Accelerator instance in the Cloud Pak for Data namespace, run:

  ```
  cpd-cli export-import export create myexport1 --component wml_accelerator --namespace ${PROJECT_CPD_INSTANCE} --profile=profile-name --arch x86_64 --log-level=debug --verbose
  ```

- Download the exported data as a tar file to the current working directory, run:

  ```
  cpd-cli export-import export download myexport1 -n ${PROJECT_CPD_INSTANCE} --profile=profile-name
  ```

- Verify the status of an export operation:

  ```
  cpd-cli export-import export status myexport1 -n ${PROJECT_CPD_INSTANCE} --profile=profile-name
  ```

  Example output:

  ```
  Name:         myexport1
  Job Name:     cpd-ex-myexport1
  Active:       0
  Succeeded:    1
  Failed:       0
  Start Time:   Thu, 30 Jun 2022 22:56:19 -0700
  Completed At: Thu, 30 Jun 2022 22:59:21 -0700
  Duration:     3m2s
  ```

- List all past export operations:

  ```
  cpd-cli export-import export list -n ${PROJECT_CPD_INSTANCE} --profile=profile-name
  ```

  Example output:

  ```
  NAME       KIND     LAST MODIFIED         LAST EXPORT
  bx1        exports  2022-07-01T05:56:27Z  20220701055627
  myexport1  exports  2022-06-30T04:51:48Z  20220630045148
  ```
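If you want to block until an export finishes before downloading it, the status step above can be polled in a loop. This is a minimal sketch: it parses the `Succeeded:` and `Failed:` fields from the status output shown in this document, and the 30-second poll interval and helper names (`export_succeeded`, `wait_for_export`) are assumptions.

```shell
#!/bin/sh
# Hypothetical polling helpers around `cpd-cli export-import export status`.

export_succeeded() {
  # Read status output on stdin; succeed when Succeeded: 1 and Failed: 0.
  awk '/^Succeeded:/ {s=$2} /^Failed:/ {f=$2} END {exit !(s == 1 && f == 0)}'
}

wait_for_export() {
  name="$1"; ns="$2"; profile="$3"
  until cpd-cli export-import export status "$name" -n "$ns" --profile="$profile" | export_succeeded; do
    echo "waiting for export $name ..."
    sleep 30
  done
  echo "export $name completed"
}
```

Example usage: `wait_for_export myexport1 "${PROJECT_CPD_INSTANCE}" profile-name` before running the download command.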
Importing your data to a target cluster
Before importing data
- Data can only be imported to a freshly installed Watson Machine Learning Accelerator cluster.
- Users are not imported. After importing, make sure to create new users with the same roles as the exported cluster to allow users to retrieve their existing data.
- After migration, you must resubmit any previously pending jobs. Pending jobs are deleted after migration.
Importing data
- Upload exported data:

  ```
  cpd-cli export-import export upload -f cpd-exports-myexport1-data.tar -n ${PROJECT_CPD_INSTANCE} --profile=import-profile
  ```

- Check the export list:

  ```
  cpd-cli export-import export list --profile=import-profile
  ```

- Import data, for example, to import myimport1, run:

  ```
  cpd-cli export-import import create myimport1 --from-export myexport1 -n ${PROJECT_CPD_INSTANCE} --profile=import-profile
  ```

- Verify the data import status, for example, for job myimport1, run:

  ```
  cpd-cli export-import import status myimport1 --profile profile-name -n ${PROJECT_CPD_INSTANCE}
  ```
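The upload, create, and status steps above can be chained into one wrapper. This is a minimal sketch: it assumes the exported tar filename follows the `cpd-exports-<name>-data.tar` pattern shown in this document, and the function names (`export_name_from_tar`, `import_from_tar`) are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around the import steps.

export_name_from_tar() {
  # Derive the export name from the tar filename,
  # e.g. cpd-exports-myexport1-data.tar -> myexport1
  basename "$1" | sed -n 's/^cpd-exports-\(.*\)-data\.tar$/\1/p'
}

import_from_tar() {
  tarfile="$1"; ns="$2"; profile="$3"; import_name="$4"
  name=$(export_name_from_tar "$tarfile")
  cpd-cli export-import export upload -f "$tarfile" -n "$ns" --profile="$profile"
  cpd-cli export-import import create "$import_name" --from-export "$name" -n "$ns" --profile="$profile"
  cpd-cli export-import import status "$import_name" --profile "$profile" -n "$ns"
}
```

Example usage: `import_from_tar cpd-exports-myexport1-data.tar "${PROJECT_CPD_INSTANCE}" import-profile myimport1`.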
Verify imported data
- Check that all pods are running.
- Use the Watson Machine Learning Accelerator command tool dlicmd.py to list job history:

  ```
  dlicmd.py --app-history-get-all
  dlicmd.py --app-get-all
  ```

- Use dlicmd.py to view a list of resource plans:

  ```
  dlicmd.py --rp-get-all
  ```

- Use dlicmd.py to view a list of HPO jobs:

  ```
  dlicmd.py --hpo-get-all
  ```

- Use dlicmd.py to view a list of HPO algorithms:

  ```
  dlicmd.py --hpo-algorithm-get-all
  ```

- Verify that your deployed models are running. If needed, restart them and check the status again to ensure they are running. See: View deployed models in Watson Machine Learning Accelerator
- Submit a test job.
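The verification checklist above can be run as a single harness that reports each check and continues past failures. This is a minimal sketch: the `run_check` and `verify_imported_data` helper names are hypothetical, while the dlicmd.py flags are the ones listed in this document.

```shell
#!/bin/sh
# Hypothetical harness for the post-import verification checklist.

run_check() {
  # Run a verification command; report PASS/FAIL without aborting the script.
  desc="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $desc"
  else
    echo "FAIL: $desc"
  fi
}

verify_imported_data() {
  run_check "job history"    dlicmd.py --app-history-get-all
  run_check "jobs"           dlicmd.py --app-get-all
  run_check "resource plans" dlicmd.py --rp-get-all
  run_check "HPO jobs"       dlicmd.py --hpo-get-all
  run_check "HPO algorithms" dlicmd.py --hpo-algorithm-get-all
}
```

Any FAIL line points at data that did not survive the migration and needs a closer look.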