Importing and exporting Watson Machine Learning Accelerator data

Import and export let you transfer all data associated with an IBM Watson® Machine Learning Accelerator service deployment from a source cluster to a target cluster. Use this procedure, for example, to move data from a development cluster to a production cluster, or when an in-place upgrade is not possible or not desired.

The Watson Machine Learning Accelerator auxiliary component is used to export and import Watson Machine Learning Accelerator data from a source deployment to a target deployment. The auxiliary component works directly to transfer the following data:
  • Training data
  • Resource plan data (including custom resource plan data)
  • Job history and history logs
  • Metadata in the etcd datastore
  • Watson Machine Learning Accelerator MongoDB data (not including MongoDB password credentials)
Note: To use the metadata export and import utility, the Watson Machine Learning Accelerator auxiliary component must be installed. To install the Watson Machine Learning Accelerator auxiliary component, the helm v3 client must be installed on the client machine.

Supported scenarios for using import and export:

The import and export utilities are supported on clusters that meet the following conditions:
  • The import and export utility is only supported on clusters using the same hardware.
  • The import and export utility is supported on clusters that have the same version of Watson Machine Learning Accelerator installed or if you are migrating Watson Machine Learning Accelerator from Cloud Pak for Data 4.5.2 or later to Cloud Pak for Data 4.7.
  • The import and export utility is only supported on the same Watson Machine Learning Accelerator namespace.
  • The import and export utility is supported on clusters that used the same installation method to install Watson Machine Learning Accelerator or if you are migrating Watson Machine Learning Accelerator from Cloud Pak for Data 4.5.2 or later to Cloud Pak for Data 4.7.

Limitations for import:

Before importing your data, consider the following:
  • Data can only be imported to a freshly installed Watson Machine Learning Accelerator cluster.

Limitations for export:

Before exporting your data, consider the following:
  • Data in mydatafs is not exported. Make sure to upload your user data again after importing.

Permissions you need for these tasks:
  • You must be an administrator of the Red Hat® OpenShift® project for your source and target clusters to import and export data for your Watson Machine Learning Accelerator service.
  • Ensure the OpenShift client is available and has access to the cluster.
  • Ensure that the helm v3 utility is installed on the client machine.
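The client-side requirements above can be checked before you start. The following is a minimal sketch, not an official utility: the function names missing_tools and is_helm_v3 are hypothetical, and the script only reports what it finds.

```shell
#!/bin/sh
# Preflight sketch: report which required client tools are missing from PATH.
# missing_tools and is_helm_v3 are hypothetical helper names.

# Print each named tool that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Succeed only when the given version string starts with "v3".
is_helm_v3() {
  case "$1" in
    v3*) return 0 ;;
    *)   return 1 ;;
  esac
}

missing=$(missing_tools oc helm cpd-cli)
if [ -n "$missing" ]; then
  echo "Missing client tools:" $missing
else
  # "helm version --short" prints a string such as v3.x.y+g<commit>.
  is_helm_v3 "$(helm version --short)" || echo "helm v3 is required"
  # "oc whoami" fails when the client is not logged in to a cluster.
  oc whoami >/dev/null 2>&1 || echo "oc is not logged in to a cluster"
fi
```

The sketch does not verify that your user is a project administrator; confirm that separately on both the source and target clusters.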

Before you begin

This topic assumes that the following prerequisites are met:

Prerequisites

Import and export topics

Downloading and installing the Watson Machine Learning Accelerator auxiliary component

Download and install the Watson Machine Learning Accelerator auxiliary component. Ensure that you meet the Prerequisites.

Download auxiliary component package

Download the Watson Machine Learning Accelerator auxiliary component from a public GitHub repository.

To download the Watson Machine Learning Accelerator 4.3.0 auxiliary component:
  1. Obtain the wml-accelerator-aux-4.3.0.tgz from a public GitHub repository using the following curl command:
    curl -sL https://github.com/IBM/wmla-assets/releases/download/v4.7.x/wml-accelerator-aux-4.3.0-x86_64.tgz -o wml-accelerator-aux-4.3.0.tgz
    Note: To download earlier versions of the auxiliary component, see the available versions on GitHub.
  2. To check the version details of the downloaded file, run:
    helm show chart wml-accelerator-aux-4.3.0.tgz
    The downloaded file has the following details:
    apiVersion: v1
    appVersion: 4.3.0
    description: A Helm chart for WML-accelerator Auxillary
    name: wml-accelerator-aux
    version: 4.3.0
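If you script the download, the version check in step 2 can be automated. The sketch below assumes the chart metadata is printed as simple "key: value" lines, as in the output shown above; chart_field is a hypothetical helper, and EXPECTED_VERSION and CHART are illustrative variables.

```shell
#!/bin/sh
# Print the value of one top-level field from "helm show chart" output on stdin.
# chart_field is a hypothetical helper; it assumes flat "key: value" lines.
chart_field() {
  awk -v key="$1" -F': *' '$1 == key { print $2; exit }'
}

EXPECTED_VERSION="4.3.0"
CHART="wml-accelerator-aux-4.3.0.tgz"

# Run the comparison only when helm and the downloaded file are both present.
if command -v helm >/dev/null 2>&1 && [ -f "$CHART" ]; then
  actual=$(helm show chart "$CHART" | chart_field version)
  if [ "$actual" = "$EXPECTED_VERSION" ]; then
    echo "chart version OK: $actual"
  else
    echo "unexpected chart version: $actual"
  fi
fi
```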

Install auxiliary component package

Install the Watson Machine Learning Accelerator auxiliary component helm chart:
  1. To install the helm chart using a helm v3 client, run:
    • If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
      helm install wmla-aux ./wml-accelerator-aux-4.3.0.tgz -n ${PROJECT_CPD_INSTANCE}
  2. To verify that the helm chart is installed, run:
    • If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
      cpd-cli export-import list aux-modules --namespace ${PROJECT_CPD_INSTANCE} --profile profile-name
    If successfully installed, wmla-aux is listed in the output:
    ID      	NAME               	COMPONENT      	KIND	VERSION	ARCHITECTURE	NAMESPACE   	VENDOR	SI
    wmla-aux	wml-accelerator-aux	wml_accelerator	exim	4.3.0  	x86_64      	cpd-instance	ibm   	N
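To check the listing in a script rather than by eye, the table can be filtered by its ID column. This is a sketch under the assumption that the output keeps the column order shown above (VERSION as the fifth column); aux_module_version is a hypothetical helper name.

```shell
#!/bin/sh
# Read the aux-modules table on stdin and print the VERSION of the given module ID.
# Exits non-zero when the ID is not listed. Assumes VERSION is the fifth column.
aux_module_version() {
  awk -v id="$1" '$1 == id { print $5; found = 1 } END { exit !found }'
}

# Query the cluster only when cpd-cli is available on this machine.
if command -v cpd-cli >/dev/null 2>&1; then
  if version=$(cpd-cli export-import list aux-modules \
      --namespace "${PROJECT_CPD_INSTANCE}" --profile profile-name |
      aux_module_version wmla-aux); then
    echo "wmla-aux is installed, version $version"
  else
    echo "wmla-aux is not listed"
  fi
fi
```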

Exporting your data from a source cluster

Before exporting data

Before exporting your data, consider the following:
  • Data in mydatafs is not exported. Make sure to upload your user data again after importing.

To upload your user data again, see: Import data into Cloud Pak for Data to be used by Watson Machine Learning Accelerator

Before exporting data, make sure to do the following:
  1. Stop all Watson Machine Learning Accelerator workloads.
  2. Shut down the notebook servers for all users.
  3. Stop all deployed inference models.

Exporting data

To export Watson Machine Learning Accelerator data and download exported data to your current working directory, complete the steps as follows:

  1. To export data from the Watson Machine Learning Accelerator instance in the Cloud Pak for Data namespace, run:
    • If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
      cpd-cli export-import export create myexport1 --component wml_accelerator --namespace ${PROJECT_CPD_INSTANCE} --profile=profile-name --arch $(uname -m) --log-level=debug --verbose
  2. To download the exported data as a tar file to the current working directory, run:
    • If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
      cpd-cli export-import export download myexport1 -n ${PROJECT_CPD_INSTANCE} --profile=profile-name

You can also perform the following operations:
  • Verify the status of an export operation:
    • If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
      cpd-cli export-import export status myexport1 -n ${PROJECT_CPD_INSTANCE} --profile=profile-name
    Example output:
    Name:      	myexport1
    Job Name:  	cpd-ex-myexport1
    Active:      	0
    Succeeded:   	1
    Failed:      	0
    Start Time:  	Thu, 30 Jun 2022 22:56:19 -0700
    Completed At:	Thu, 30 Jun 2022 22:59:21 -0700
    Duration:    	3m2s
  • List all past export operations:
    • If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
      cpd-cli export-import export list -n ${PROJECT_CPD_INSTANCE} --profile=profile-name
    Example output:
    NAME     	KIND   	LAST MODIFIED       	LAST EXPORT
    bx1      	exports	2022-07-01T05:56:27Z	20220701055627
    myexport1	exports	2022-06-30T04:51:48Z	20220630045148
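Because an export runs as a job, it can be useful to wait for it to finish before downloading. The sketch below interprets the Active, Succeeded, and Failed counters from the status output shown above; treating a non-zero Succeeded count as completion is an assumption based on that example. The names export_state and wait_for_export, the 30-second interval, and the attempt limit are all hypothetical.

```shell
#!/bin/sh
# Read "export status" output on stdin and print one of:
# succeeded, failed, running, unknown.
export_state() {
  awk '
    $1 == "Active:"    { active = $2 }
    $1 == "Succeeded:" { ok = $2 }
    $1 == "Failed:"    { failed = $2 }
    END {
      if (failed > 0)      print "failed"
      else if (ok > 0)     print "succeeded"
      else if (active > 0) print "running"
      else                 print "unknown"
    }'
}

# Poll the status of the named export until it succeeds, fails,
# or the attempt limit (second argument, default 20) is reached.
wait_for_export() {
  name=$1
  attempts=${2:-20}
  while [ "$attempts" -gt 0 ]; do
    state=$(cpd-cli export-import export status "$name" \
      -n "${PROJECT_CPD_INSTANCE}" --profile=profile-name | export_state)
    case "$state" in
      succeeded) return 0 ;;
      failed)    return 1 ;;
    esac
    attempts=$((attempts - 1))
    sleep 30
  done
  return 1
}
```

For example, `wait_for_export myexport1` could be run between the export create and export download steps, so that the download starts only after the job reports success.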

Importing your data to a target cluster

Before importing data

Before importing your data, consider the following:
  • Data can only be imported to a freshly installed Watson Machine Learning Accelerator cluster.

Importing data

To upload exported data to a new cluster and import it:
  1. Upload exported data:
    • If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
      cpd-cli export-import export upload -f cpd-exports-myexport1-data.tar -n ${PROJECT_CPD_INSTANCE} --profile=import-profile
  2. Check the export list:
    cpd-cli export-import export list --profile=import-profile
  3. Import the data. For example, to create an import named myimport1 from the export myexport1, run:
    • If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
      cpd-cli export-import import create myimport1 --from-export myexport1 -n ${PROJECT_CPD_INSTANCE} --profile=import-profile
  4. Verify the import status. For example, for the job myimport1, run:
    • If Watson Machine Learning Accelerator is installed in the same project (namespace) as the Cloud Pak for Data control plane:
      cpd-cli export-import import status myimport1 --profile=import-profile -n ${PROJECT_CPD_INSTANCE}
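The four steps above can be chained in one script. The following sketch wraps the same commands in a dry-run helper so that the sequence can be reviewed before anything is sent to the cluster. The names run and import_all, the DRY_RUN switch, and the default values are hypothetical; the tar file name pattern is inferred from the single example above and may differ for your export.

```shell
#!/bin/sh
# Placeholder defaults; override them in the environment as needed.
EXPORT_NAME="${EXPORT_NAME:-myexport1}"
IMPORT_NAME="${IMPORT_NAME:-myimport1}"
PROFILE="${PROFILE:-import-profile}"
NS="${PROJECT_CPD_INSTANCE:-cpd-instance}"

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Upload, list, import, then check status; stop at the first failing step.
import_all() {
  run cpd-cli export-import export upload -f "cpd-exports-${EXPORT_NAME}-data.tar" -n "$NS" --profile="$PROFILE" &&
  run cpd-cli export-import export list --profile="$PROFILE" &&
  run cpd-cli export-import import create "$IMPORT_NAME" --from-export "$EXPORT_NAME" -n "$NS" --profile="$PROFILE" &&
  run cpd-cli export-import import status "$IMPORT_NAME" --profile "$PROFILE" -n "$NS"
}
```

Running `DRY_RUN=1 import_all` after sourcing the script prints the four commands without executing them; running `import_all` alone executes them in order.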

Verify imported data

To verify that your data was imported, use any of the following methods:
  • Check that all pods are running.
  • Open the Watson Machine Learning Accelerator console and list history jobs.
  • Verify that your deployed models are running. If needed, restart them and check the status again to ensure they are running. See: View deployed models in Watson Machine Learning Accelerator
  • Submit a test job.
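The first check, that all pods are running, can be done with the OpenShift client. The sketch below lists every pod in the project whose status is neither Running nor Completed; it does not filter for Watson Machine Learning Accelerator pods specifically, and unhealthy_pods is a hypothetical helper name.

```shell
#!/bin/sh
# Read "oc get pods --no-headers" output on stdin (NAME READY STATUS RESTARTS AGE)
# and print the name and status of each pod that is not Running or Completed.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 " " $3 }'
}

# Query the cluster only when the oc client is available on this machine.
if command -v oc >/dev/null 2>&1; then
  oc get pods -n "${PROJECT_CPD_INSTANCE}" --no-headers | unhealthy_pods
fi
```

No output means that every pod in the project reports Running or Completed.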