Importing and exporting Watson Machine Learning Accelerator data

Import and export lets you transfer all data associated with an IBM Watson® Machine Learning Accelerator service deployment from a source cluster to a target cluster. The procedure can be used, for example, to move data from a development cluster to a production cluster, or when an in-place upgrade is not possible or not desired.

The Watson Machine Learning Accelerator auxiliary component is used to export and import Watson Machine Learning Accelerator data from a source deployment to a target deployment. The auxiliary component works directly to transfer the following data:
  • Training jobs (including batch jobs and HPO jobs)
  • Resource plan data (including custom resource plan data)
  • Job history and history logs
  • HPO data
  • EDI models and runtime
Note: To use the metadata export and import utility, the Watson Machine Learning Accelerator auxiliary component must be installed. To install the Watson Machine Learning Accelerator auxiliary component, the Helm v3 client must be installed on the client machine.

Supported scenarios for using import and export:

The import and export utilities are supported on clusters that meet the following conditions:
  • The import and export utility is only supported on clusters using the same hardware.
  • The export utility is supported if you are migrating Watson Machine Learning Accelerator from Cloud Pak for Data 4.7 or later.
  • The import and export utility is supported whether Watson Machine Learning Accelerator is installed in the same namespace as Cloud Pak for Data or in a different namespace.

Limitations for import:

Before importing your data, consider the following:
  • Data can only be imported to a freshly installed Watson Machine Learning Accelerator cluster.
  • Users are not imported. After importing, make sure to create new users with the same roles as the exported cluster to allow users to retrieve their existing data.
  • After migration, you must resubmit any previously pending jobs. Pending jobs are deleted after migration.

Limitations for export:

Before exporting your data, consider the following:
  • Data in mydatafs is not exported. Make sure to upload your user data again after importing.

Permissions you need for these tasks:
  • You must be an administrator of the Red Hat® OpenShift® project for your source and target clusters to import and export data for your Watson Machine Learning Accelerator service.
  • Ensure the OpenShift client is available and has access to the cluster.
  • Ensure that the helm v3 utility is installed on the client machine.
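
The client-side requirements above can be checked before you start. The following is a minimal sketch, not part of the product: the function names helm_major_version and check_client_tools are hypothetical, and the sketch assumes that helm version --short prints a string such as v3.12.3+g3a31588 and that oc whoami succeeds only when the OpenShift client has an active login session. It does not verify that you are a project administrator.

```shell
# Print the major version from a Helm version string such as "v3.12.3+g3a31588".
helm_major_version() {
  hmv_rest="${1#*v}"                # drop everything up to and including the first "v"
  printf '%s\n' "${hmv_rest%%.*}"   # keep only the text before the first "."
}

# Return 0 only if oc and helm are on PATH, helm is v3, and oc has a login session.
check_client_tools() {
  for cct_tool in oc helm; do
    command -v "$cct_tool" >/dev/null 2>&1 || {
      echo "missing client tool: $cct_tool" >&2
      return 1
    }
  done
  if [ "$(helm_major_version "$(helm version --short)")" != "3" ]; then
    echo "helm v3 is required" >&2
    return 1
  fi
  oc whoami >/dev/null 2>&1 || {
    echo "oc is not logged in to a cluster" >&2
    return 1
  }
}
```

Run check_client_tools once on the client machine for the source cluster and once for the target cluster; a non-zero return code means that one of the requirements is not met.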

Before you begin

This topic assumes that the following prerequisites are met:

Prerequisites

Import and export topics

Downloading and installing the Watson Machine Learning Accelerator auxiliary component

Download and install the Watson Machine Learning Accelerator auxiliary component. Ensure that you meet the Prerequisites.

Download auxiliary component package

Download the Watson Machine Learning Accelerator auxiliary component from a public GitHub repository.

To download the Watson Machine Learning Accelerator auxiliary component:
  1. Obtain the wml-accelerator-aux-5.0.2.tgz from a public GitHub repository using the following curl command:
    curl -sL https://github.com/IBM/wmla-assets/releases/download/v5.0.2/wml-accelerator-aux-5.0.2-x86_64.tgz -o wml-accelerator-aux-5.0.2.tgz
    Note: To download current versions of the auxiliary component, see GitHub.
  2. To check the version details of the downloaded file, run:
    helm show chart wml-accelerator-aux-5.0.2.tgz
    The downloaded file has the following details:
    apiVersion: v1
    appVersion: 5.0.2
    description: A Helm chart for WML-accelerator Auxillary
    name: wml-accelerator-aux
    version: 5.0.2
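
If you script the download, the version check can be automated by reading a single field from the helm show chart output. This is a sketch under assumptions: chart_field is a hypothetical helper, and it only handles flat "key: value" lines like the ones shown above.

```shell
# Print the value of one top-level "key: value" field read from stdin.
# The match is exact, so "version" does not match "apiVersion" or "appVersion".
chart_field() {
  awk -v key="$1" -F': ' '$1 == key { print $2; exit }'
}

# Example use (requires the helm client and the downloaded package):
#   [ "$(helm show chart wml-accelerator-aux-5.0.2.tgz | chart_field version)" = "5.0.2" ] \
#     || echo "unexpected chart version" >&2
```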

Install auxiliary component package

Install the Watson Machine Learning Accelerator auxiliary component helm chart:
  1. To install the helm chart using a helm v3 client, run:
    helm install wmla-aux ./wml-accelerator-aux-5.0.2.tgz -n ${PROJECT_CPD_INSTANCE}
  2. To verify that the helm chart is installed, run:
    cpd-cli export-import list aux-modules --namespace ${PROJECT_CPD_INSTANCE} --profile profile-name
    If successfully installed, wmla-aux is listed in the output:
    ID      	NAME               	COMPONENT      	KIND	VERSION	ARCHITECTURE	NAMESPACE   	VENDOR	SI
    wmla-aux	wml-accelerator-aux	wml_accelerator	exim	5.0.2  	x86_64      	cpd-instance	ibm   	N
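
The verification step can also be scripted by looking for wmla-aux in the ID column of the listing. The helper name aux_module_listed is hypothetical, and the sketch assumes the whitespace-separated column layout shown above.

```shell
# Return 0 if the aux-modules listing read on stdin has a row whose
# first column (ID) equals $1; return 1 otherwise.
aux_module_listed() {
  awk -v id="$1" '$1 == id { found = 1 } END { exit !found }'
}

# Example use (requires cpd-cli and a configured profile):
#   cpd-cli export-import list aux-modules --namespace "${PROJECT_CPD_INSTANCE}" \
#     --profile profile-name | aux_module_listed wmla-aux \
#     || echo "wmla-aux is not installed" >&2
```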

Exporting your data from a source cluster

Before exporting data

Before exporting your data, consider the following:
  • Data in mydatafs is not exported. Make sure to upload your user data again after importing.

To upload your user data again, see: Importing data.

Before exporting data, make sure to do the following:
  1. Stop all Watson Machine Learning Accelerator workloads.
  2. Shut down the notebook server for all users.
  3. Stop any deployed inference models.

Exporting data

To export Watson Machine Learning Accelerator data and download the exported data to your current working directory, complete the following steps:

  1. To export data from the Watson Machine Learning Accelerator instance in the Cloud Pak for Data namespace, run:
    cpd-cli export-import export create myexport1 --component wml_accelerator --namespace ${PROJECT_CPD_INSTANCE} --profile=profile-name --arch x86_64 --log-level=debug --verbose
  2. To download the exported data as a tar file to the current working directory, run:
    cpd-cli export-import export download myexport1 -n ${PROJECT_CPD_INSTANCE} --profile=profile-name

You can also perform the following operations:
  • Verify the status of an export operation:
    cpd-cli export-import export status myexport1 -n ${PROJECT_CPD_INSTANCE} --profile=profile-name
    Example output:
    Name:      	myexport1
    Job Name:  	cpd-ex-myexport1
    Active:      	0
    Succeeded:   	1
    Failed:      	0
    Start Time:  	Thu, 30 Jun 2022 22:56:19 -0700
    Completed At:	Thu, 30 Jun 2022 22:59:21 -0700
    Duration:    	3m2s
  • List all past export operations:
    cpd-cli export-import export list -n ${PROJECT_CPD_INSTANCE} --profile=profile-name
    Example output:
    NAME     	KIND   	LAST MODIFIED       	LAST EXPORT
    bx1      	exports	2022-07-01T05:56:27Z	20220701055627
    myexport1	exports	2022-06-30T04:51:48Z	20220630045148
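
Because an export runs as a job, a script usually has to wait for it to finish before downloading the data. The following sketch polls the status command shown above. The function names status_field and wait_for_export are hypothetical, and the sketch assumes that the status output always contains the Succeeded and Failed lines in the format of the example output, including while the job is still active.

```shell
# Print the value of a single-word field ("Active", "Succeeded", "Failed")
# from export status output read on stdin.
status_field() {
  awk -v key="$1:" '$1 == key { print $2; exit }'
}

# Poll an export until it succeeds (return 0) or fails (return 1).
# $1 = export name, $2 = cpd-cli profile, $3 = seconds between polls (default 30).
# Output without a "Failed: 0" line is treated as a failure.
wait_for_export() {
  while :; do
    wfe_out="$(cpd-cli export-import export status "$1" \
      -n "${PROJECT_CPD_INSTANCE}" --profile="$2")" || return 1
    [ "$(printf '%s\n' "$wfe_out" | status_field Failed)" = "0" ] || return 1
    [ "$(printf '%s\n' "$wfe_out" | status_field Succeeded)" = "1" ] && return 0
    sleep "${3:-30}"
  done
}
```

For example, wait_for_export myexport1 profile-name can be placed between the export create and export download commands so that the download starts only after the job reports success.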

Importing your data to a target cluster

Before importing data

Before importing your data, consider the following:
  • Data can only be imported to a freshly installed Watson Machine Learning Accelerator cluster.
  • Users are not imported. After importing, make sure to create new users with the same roles as the exported cluster to allow users to retrieve their existing data.
  • After migration, you must resubmit any previously pending jobs. Pending jobs are deleted after migration.

Importing data

To upload exported data to a new cluster and import it:
  1. Upload exported data:
    cpd-cli export-import export upload -f cpd-exports-myexport1-data.tar -n ${PROJECT_CPD_INSTANCE} --profile=import-profile
  2. Check the export list:
    cpd-cli export-import export list --profile=import-profile
  3. Import the data. For example, to create an import named myimport1 from the export myexport1, run:
    cpd-cli export-import import create myimport1 --from-export myexport1 -n ${PROJECT_CPD_INSTANCE} --profile=import-profile
  4. Verify the status of the import. For example, for the import myimport1, run:
    cpd-cli export-import import status myimport1 -n ${PROJECT_CPD_INSTANCE} --profile=import-profile
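
Steps 1 to 3 can be chained so that the import is only created after the upload is confirmed in the export list. This is a sketch, not a supported tool. The function names are hypothetical, and the archive naming pattern cpd-exports-<name>-data.tar is an assumption inferred from the single file name used in step 1. The sketch also passes -n to the list command, as in the export section.

```shell
# Assumed naming pattern, inferred from one example:
# export "myexport1" is downloaded as "cpd-exports-myexport1-data.tar".
export_archive_name() {
  printf 'cpd-exports-%s-data.tar\n' "$1"
}

# Return 0 if the export list on stdin has a data row whose NAME column equals $1.
# The header row is skipped.
export_listed() {
  awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}

# Upload an export archive and create an import from it, stopping at the first failure.
# $1 = export name, $2 = import name, $3 = cpd-cli profile for the target cluster.
upload_and_import() {
  uai_tar="$(export_archive_name "$1")"
  [ -f "$uai_tar" ] || { echo "archive not found: $uai_tar" >&2; return 1; }
  cpd-cli export-import export upload -f "$uai_tar" \
    -n "${PROJECT_CPD_INSTANCE}" --profile="$3" || return 1
  cpd-cli export-import export list -n "${PROJECT_CPD_INSTANCE}" --profile="$3" \
    | export_listed "$1" || { echo "export $1 is not listed on the target" >&2; return 1; }
  cpd-cli export-import import create "$2" --from-export "$1" \
    -n "${PROJECT_CPD_INSTANCE}" --profile="$3"
}
```

For example, upload_and_import myexport1 myimport1 import-profile performs the same sequence as the numbered steps, after which the import status command can be used to follow the import job.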

Verify imported data

To verify that your data was imported, use any of the following methods:
  • Check that all pods are running.
  • Use the Watson Machine Learning Accelerator command tool dlicmd.py to list job history:
    dlicmd.py --app-history-get-all
    dlicmd.py --app-get-all
  • Use dlicmd.py to view a list of resource plans:
    dlicmd.py --rp-get-all
  • Use dlicmd.py to view a list of HPO jobs:
    dlicmd.py --hpo-get-all
  • Use dlicmd.py to view a list of HPO algorithms:
    dlicmd.py --hpo-algorithm-get-all
  • Verify that your deployed models are running. If needed, restart them and check the status again to ensure they are running. See: View deployed models in Watson Machine Learning Accelerator
  • Submit a test job.
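
The dlicmd.py checks above can be run in one pass. In this sketch, run_dlicmd_checks is a hypothetical helper that takes the command prefix as its arguments and appends each of the options listed above; any connection options that dlicmd.py needs in your environment are assumed to be part of that prefix. The helper only checks exit codes, so review the command output yourself to confirm that the expected jobs and resource plans are present.

```shell
# Run each dlicmd.py listing option with the given command prefix.
# Prints one "ok" or "FAIL" line per option; returns 1 if any command failed.
run_dlicmd_checks() {
  rdc_failed=0
  for rdc_flag in --app-history-get-all --app-get-all --rp-get-all \
                  --hpo-get-all --hpo-algorithm-get-all; do
    if "$@" "$rdc_flag" >/dev/null 2>&1; then
      echo "ok   $rdc_flag"
    else
      echo "FAIL $rdc_flag"
      rdc_failed=1
    fi
  done
  return "$rdc_failed"
}

# Example use (the prefix is whatever you normally type before the option):
#   run_dlicmd_checks dlicmd.py
```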