Preparing for migration in IBM Cloud Pak for Data

Before you import any data to Cloud Pak for Data, complete a set of setup tasks.

Tasks to complete on the Cloud Pak for Data system before migrating

Installing the migration toolkit

Download and install the latest version of the migration toolkit for Cloud Pak for Data.

  1. Download the migration toolkit for Cloud Pak for Data from this support page:

    Migration from IBM InfoSphere Server to IBM Knowledge Catalog: Applying patches and toolkit to a new IBM Knowledge Catalog 4.8.x or 5.0.x installation (Part 2 of 2)

    This document is updated when a new version of the toolkit is released and also contains information about any prerequisite patches that you might need to install.

  2. Unpack the downloaded file and follow the installation instructions on the support page from which you downloaded the file. A typical unpacking sequence is sketched after this list.
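
The following is a minimal sketch of unpacking the toolkit and recording its location for later commands, such as get_dq_rule_connection.sh. The archive name and target directory are illustrative; use the actual file name from the support page and a directory that suits your environment.
# Illustrative archive name; substitute the file that you downloaded
mkdir -p /opt/migration-toolkit
tar -xzf iis-to-ikc-migration-toolkit.tgz -C /opt/migration-toolkit
# The toolkit location is referenced as ${TOOLKIT_PATH} later in this procedure
export TOOLKIT_PATH=/opt/migration-toolkit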

Installing the latest version of the cpd-cli command-line interface

Ensure that you have the export and import utility set up. The latest version of the cpd-cli command-line interface (CLI) and related modules must be installed. For more information, see Installing the Cloud Pak for Data command-line interface (cpd-cli).
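
After installation, you can confirm that the CLI is available on your PATH by printing its version:
cpd-cli version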

Setting up the export-import utility

The export-import utility must be set up for migration. Follow the sections in Migrating data between instances of Cloud Pak for Data to ensure that the utility is properly set up and initialized.
After the initialization is complete, run the following command to verify that the utility is set up for migration:
cpd-cli export-import list aux-modules --namespace=${NAMESPACE} --profile=${PROFILE_NAME} --arch=${CPU_ARCH}
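
This command assumes that the NAMESPACE, PROFILE_NAME, and CPU_ARCH environment variables are set. A sketch with illustrative values:
# OpenShift project where Cloud Pak for Data is installed
export NAMESPACE=cpd-instance
# cpd-cli profile that you created when you set up the utility
export PROFILE_NAME=cpd-profile
# CPU architecture of the cluster nodes, for example x86_64 or ppc64le
export CPU_ARCH=x86_64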

Setting environment variables

Set the following environment variables.
CP4D_HOST=<cp4d host>
CP4D_USERNAME=<default platform administrator user: cpadmin or admin>
CP4D_PASSWORD=<cp4d password>
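
A sketch with illustrative values; your host name and credentials will differ. On OpenShift, the Cloud Pak for Data host is typically the route host name for your instance:
CP4D_HOST=cpd-cpd-instance.apps.example.com
CP4D_USERNAME=cpadmin
CP4D_PASSWORD='<cp4d password>'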

Increasing the expiry time for tokens

If a large amount of data is to be migrated, increase the expiry time for tokens before you start the migration.
  1. Check whether a token expiry time is set by running the following command:
    oc get configmap product-configmap --namespace=${NAMESPACE} -o custom-columns=test:.data.TOKEN_EXPIRY_TIME | sed '1d'

    If the parameter TOKEN_EXPIRY_TIME is set, note the original setting to be able to reset the value after the migration is complete.

  2. Change the TOKEN_EXPIRY_TIME setting to a large value, such as 48 hours, by running the following command:
    oc patch configmap product-configmap --namespace=${NAMESPACE} --type=merge --patch="{\"data\": {\"TOKEN_EXPIRY_TIME\": \"48\"}}"
  3. Check whether a token refresh period is set by running the following command:
    oc get configmap product-configmap --namespace=${NAMESPACE} -o custom-columns=test:.data.TOKEN_REFRESH_PERIOD | sed '1d'

    If the parameter TOKEN_REFRESH_PERIOD is set, note the original setting to be able to reset the value after the migration is complete.

  4. Change the TOKEN_REFRESH_PERIOD setting to a large value, such as 48 hours, by running the following command:
    oc patch configmap product-configmap --namespace=${NAMESPACE} --type=merge --patch="{\"data\": {\"TOKEN_REFRESH_PERIOD\": \"48\"}}"
  5. Restart the usermgmt pods by running the following command:
    oc delete pods --namespace=${NAMESPACE} -l component=usermgmt
Remember to revert this change after the migration is complete.
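
To revert, restore the values that you noted in steps 1 and 3, or remove the keys if they were previously unset, and then restart the usermgmt pods again. A sketch for the case where both keys were previously unset:
oc patch configmap product-configmap --namespace=${NAMESPACE} --type=json --patch='[{"op": "remove", "path": "/data/TOKEN_EXPIRY_TIME"}, {"op": "remove", "path": "/data/TOKEN_REFRESH_PERIOD"}]'
oc delete pods --namespace=${NAMESPACE} -l component=usermgmt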

Increasing the ephemeral storage for the wkc-data-rules pod

To avoid eviction of the wkc-data-rules pod when migrating data quality rules, increase the pod's ephemeral storage. You can revert this setting after migration or continue working with the increased storage settings.
  1. Log in to the Red Hat® OpenShift® Container Platform as a user with administrator rights.
  2. Check whether a value for the ephemeral-storage parameter is set for the wkc-data-rules pod and whether the value is less than 2Gi.
    oc get deployment wkc-data-rules --namespace=${NAMESPACE} --output="jsonpath={.spec.template.spec.containers[*].resources.limits.ephemeral-storage}" && echo -e "\n"
  3. If the ephemeral-storage parameter value is less than 2Gi, run the following command to set the value to 2Gi, and then verify the change as shown after this list:
    oc patch wkc wkc-cr -n ${NAMESPACE} --type merge -p '{"spec":{"wkc_data_rules_resources":{"requests":{"cpu":"100m","memory":"800Mi","ephemeral-storage":"50Mi"},"limits":{"cpu":1,"memory":"2048Mi","ephemeral-storage": "2Gi" }}}}'
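
The operator can take a few minutes to reconcile the change. You can then re-run the check from step 2 to confirm that the limit reports 2Gi:
oc get deployment wkc-data-rules --namespace=${NAMESPACE} --output="jsonpath={.spec.template.spec.containers[*].resources.limits.ephemeral-storage}" && echo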

Providing connection information for storing rule output

Before you migrate, a platform connection must exist for the data source and location where the data rule output tables can be persisted.

Create a platform connection for storing rule output tables and save the connection name in the DQ_RULES_CONNECTION_NAME variable. To create such a connection, complete the steps described in Connecting to data sources at the platform level. The connection must be configured to use shared credentials.

Only Db2, Oracle, or Microsoft SQL Server data sources are supported.

The connection details must be made available as input to the export on the InfoSphere Information Server system.

If the InfoSphere Information Server engine tier and Cloud Pak for Data system can communicate over port 443, the information is later retrieved as part of the export process. For more information, see step 5 in Exporting data from the InfoSphere Information Server system.

If the InfoSphere Information Server engine tier and Cloud Pak for Data system cannot communicate over port 443, you must save the details of the connection for storing data rule output to a JSON file and transfer that file to the InfoSphere Information Server system. To create the dq_rule_connection.json file, run the following command:
${TOOLKIT_PATH}/get_dq_rule_connection.sh --url https://${CP4D_HOST} -u ${CP4D_USERNAME} -p ${CP4D_PASSWORD} -name ${DQ_RULES_CONNECTION_NAME}

Transfer this file to the InfoSphere Information Server system as part of the export process. For more information, see step 5 in Exporting data from the InfoSphere Information Server system.
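
For example, you might transfer the file with scp. The user name, host name, and target directory are illustrative; use the values for your InfoSphere Information Server engine tier:
scp dq_rule_connection.json isadmin@iis-engine.example.com:/tmp/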

What to do next

You are now ready to migrate user information from the InfoSphere Information Server system to Cloud Pak for Data.