Post-restore tasks after restoring an offline backup

For some services, additional tasks must be done after you restore Cloud Pak for Data from an offline backup.

Restoring services that do not support offline backup and restore

The following list shows the services that don't support offline backup and restore. If any of these services are installed in your Cloud Pak for Data deployment, do the appropriate steps to make them functional after a restore.

Db2® Data Gate
Db2 Data Gate synchronizes Db2 for z/OS® data in real time. After Cloud Pak for Data is restored, the data might be out of sync with Db2 for z/OS. It is recommended that you re-add the tables after the Cloud Pak for Data foundational services are restored.
EDB Postgres
The service must be reinstalled and then the data must be restored. For more information about restoring EDB Postgres, see Performing physical backup and restore for the EDB Postgres service.
MongoDB
The service must be deleted and reinstalled. Recreate the instance as a new instance, and then restore the data with MongoDB tools. For more information, see Installing the MongoDB service and Back Up and Restore with MongoDB Tools.
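For example, if the data was exported with mongodump as a compressed archive, a minimal restore sketch might look like the following command. The connection string and archive file name are placeholders for your new instance, and --gzip applies only if the archive was created with that option:
# Restore a mongodump archive into the newly created MongoDB instance
mongorestore --uri="mongodb://<user>:<password>@<mongodb-host>:<port>" --archive=<backup-archive> --gzip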
Watson™ Assistant
The service must be cleaned up and reinstalled. Complete the following steps:
  1. Before you backed up your Cloud Pak for Data deployment, you identified the name of the Watson Assistant installation (instance) in the ${PROJECT_CPD_INSTANCE} project. Locate that information and use it to complete the next step.
  2. Set the WA_INSTANCE_NAME variable:
    export WA_INSTANCE_NAME=<instance-name>
  3. Run the following commands, and then delete the resources that are listed in the output of each command (see the example after these steps):
    oc get wa ${WA_INSTANCE_NAME}
    oc get watsongateway,watsonassistantstore,watsonassistantdialog,watsonassistantui,watsonassistantclu,watsonassistantanalytics,watsonassistantintegrations,watsonassistantrecommends | grep /${WA_INSTANCE_NAME}
    oc get dataexhaust,dataexhausttenant,modeltraindynamicworkflow,miniocluster,redissentinel,formation.redis,cluster,elasticsearchcluster,rabbitmqcluster,kafka,etcdcluster | grep ${WA_INSTANCE_NAME}-
    oc get job,deploy,replicaset,pod,StatefulSet,configmap,persistentvolumeclaim,poddisruptionbudget,horizontalpodautoscaler,networkpolicies,cronjobs | grep ${WA_INSTANCE_NAME}-
  4. Reinstall the Watson Assistant instance.

    For more information, see Installing Watson Assistant.
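
The following commands are a sketch of one way to delete the resources that step 3 lists. It assumes that the resources can be deleted by using the type/name values that oc get -o name returns; review the output of each oc get command before you delete anything:
# Delete the Watson Assistant custom resource itself
oc delete wa ${WA_INSTANCE_NAME}
# Example for the dependent resources: pass the type/name values to oc delete
oc get watsongateway,watsonassistantstore,watsonassistantdialog,watsonassistantui,watsonassistantclu,watsonassistantanalytics,watsonassistantintegrations,watsonassistantrecommends -o name | grep "/${WA_INSTANCE_NAME}$" | xargs oc delete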

Watson Discovery
The service must be uninstalled and reinstalled, and then the data must be restored.

Watson Speech services
The service is functional and you can re-import data. For more information, see Importing and exporting data.

Restoring services that use Apache Spark to the same cluster or to a different cluster

When a service that uses Apache Spark is restored to the same cluster or to a different cluster, the volume pod (pod that starts with volumes-volume-name-deploy-xxxx) must be restarted. For example, this restart is needed when you are using Analytics Engine Powered by Apache Spark as your choice of Spark runtime for evaluating batch models with Watson OpenScale.

Do the following steps:

  1. Log in to Red Hat® OpenShift® Container Platform as a user with sufficient permissions to complete the task:
    oc login OpenShift_URL:port
  2. Change to the project where Analytics Engine Powered by Apache Spark was installed:
    oc project Project
  3. Look for pods that start with volumes:
    oc get pods | grep "^volumes"
  4. Delete the volume pod that matches the name of the volume where you uploaded the wos_env.zip file on the source cluster and restored it into the target cluster:
    oc delete pod volumes-volume-name-deploy-xxxx
  5. Check that the pod is re-created and that it is up and running:
    oc get pods | grep volumes-volume-name-deploy
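
If you prefer to script the restart, the following sketch combines steps 3 through 5. It assumes a hypothetical VOLUME_NAME variable that holds the name of the volume:
export VOLUME_NAME=<volume-name>
# Delete the matching volume pod; its deployment re-creates it automatically
oc get pods -o name | grep "volumes-${VOLUME_NAME}-deploy" | xargs oc delete
# Confirm that the replacement pod reaches the Running state
oc get pods | grep "volumes-${VOLUME_NAME}-deploy"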

Restoring Watson OpenScale to a different cluster

In addition to the steps described in Restoring services that use Apache Spark to the same cluster or to a different cluster, when you restore Watson OpenScale to a different cluster, the cpd-external-route key in the *-aios-service-secrets secret must be updated to point to the external route of the target cluster. This update ensures that the Watson OpenScale dashboard URL that is included in the notification email and the published metrics CSV is valid.

Do the following steps:

  1. Log in to Red Hat OpenShift Container Platform as a user with sufficient permissions to complete the task:
    oc login OpenShift_URL:port
  2. Change to the project where Watson OpenScale was installed:
    oc project Project
  3. Describe the mrm pod and look for the secretKeyRef.name set for the ICP_ROUTE environment variable:
    oc get pods | grep aios-mrm
    oc describe pod <pod-name>
    For example:
    oc describe pod aiopenscale-ibm-aios-mrm-87c9d8bfc-8fr6t
     - name: ICP_ROUTE
       valueFrom:
         secretKeyRef:
           key: cpd-external-route
           name: aiopenscale-ibm-aios-service-secrets
  4. Get the value of the cpd-external-route key from the secret:
    oc get secret secret-name -o jsonpath={.data.cpd-external-route} | base64 -d

    The output of this command is the decoded hostname of the source cluster. In the next step, you replace it with the base64-encoded hostname of the target cluster.

  5. Update the cpd-external-route key in the secret:
    oc edit secret secret-name

    This command opens a vi editor. Replace the cpd-external-route value with the base64-encoded value of the target cluster URL, and save the secret by exiting the vi editor. You can obtain the encoded value of the target cluster URL by using the base64 command (see the sketch after these steps) or by using the Base64 Encode and Decode website.

  6. Restart the mrm pod:
    oc delete pod mrm-pod-name
    Note: After the oc delete pod command runs, a replacement pod is created automatically. Make sure that the new pod is up and running.
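
The following sketch shows one way to produce the encoded value for step 5 by using the base64 command. The target cluster hostname is a placeholder:
# Encode the target cluster hostname without a trailing newline
echo -n "<target-cluster-hostname>" | base64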

The Spark connection (also called integrated_system in Watson OpenScale) that is created for Analytics Engine Powered by Apache Spark must be updated with the new apikey of the target cluster. For more information, see Step 2 in Configuring the batch processor in Watson OpenScale.

Verifying the Watson Machine Learning restore operation

After restoring from a backup, users might be unable to deploy new models or score existing models. To resolve this issue, wait until operator reconciliation completes after the restore operation. You can check the status of the operator with the following commands:
export PROJECT_WML=<wml-namespace>
kubectl describe WmlBase wml-cr -n ${PROJECT_WML} | grep "Wml Status" | awk '{print $3}'
After the backup and restore operations, before you use Watson Machine Learning, make sure that wml-cr is in the Completed state and that all Watson Machine Learning pods are in the Running state. Use the following command to check the pods:
oc get pods -n <wml-namespace> -l release=wml
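
If you want to wait for reconciliation in a script, the following sketch polls the same status field until it reports Completed. It reuses the PROJECT_WML variable from the previous command and assumes that the completed state is reported as the literal value Completed:
# Poll the WmlBase custom resource until the operator reports Completed
while [ "$(kubectl describe WmlBase wml-cr -n ${PROJECT_WML} | grep "Wml Status" | awk '{print $3}')" != "Completed" ]; do
  echo "Waiting for Watson Machine Learning operator reconciliation..."
  sleep 60
done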

Restoring owner references to Watson Machine Learning Accelerator resources

You must complete additional steps to restore owner references to all Watson Machine Learning Accelerator resources. For more information, see Backing up and restoring Watson Machine Learning Accelerator.

Restoring Data Replication

After Cloud Pak for Data is restored, do the following steps.

  1. If you restored Cloud Pak for Data to a different cluster, stop the replication on the source cluster to avoid having two streams of data flowing from the same data source to the same destination when the service is restarted on the restored cluster.
  2. Connect to the restored Cloud Pak for Data instance.
  3. Go to the restored replications and stop them.
  4. Restart the replications.