Completing post-restore tasks for a Watson OpenScale online backup

After you restore Watson™ OpenScale from an online backup, you must complete additional steps to verify that all of its features are restored.

Procedure

After you complete a nondisruptive restoration of a Watson OpenScale instance from an online backup, some of its features, such as scheduled or on-demand model evaluations, might not function properly. When the restoration finishes, complete the following steps to verify that Watson OpenScale runs correctly:
  1. Log in to Red Hat OpenShift Container Platform with the following command:
    oc login <OpenShift_URL>:<port>
  2. Scale down the Watson OpenScale micro-service deployments with the following command:
    instanceProjectName='cpd-instance'
    instanceCRName='aiopenscale'

    oc scale deployment -n ${instanceProjectName} -l "component in (aios-bias,aios-bkpi,aios-drift,aios-explainability,aios-fast,aios-feedback,aios-ml,aios-mrm,aios-notification,aios-scheduling)" --replicas=0

    If you did not install Cloud Pak for Data in the cpd-instance project or use aiopenscale as the name of the Watson OpenScale custom resource, specify accurate values in the instanceProjectName and instanceCRName fields.

  3. Scale up the Watson OpenScale etcd cluster StatefulSet object and verify that all of the pods are in the Ready state with the following command:
    oc scale sts -n ${instanceProjectName} -l component=aios-etcd --replicas=3
    oc wait pod -n ${instanceProjectName} -l component=aios-etcd --for=condition=Ready --timeout=900s
  4. Log in to the Operator pod with the following command:
    oc project ibm-common-services
    OPERATOR_POD_NAME=$(oc get pods | grep wos | awk '{print $1}')
    oc exec --tty --stdin "${OPERATOR_POD_NAME}" -- /bin/bash
  5. Set the values of the environment variables that are required to start restoration with the following command:
    instanceProjectName='cpd-instance'
    instanceCRName='aiopenscale'

    export ETCD_ENDPOINTS=https://${instanceCRName}-ibm-aios-etcd.${instanceProjectName}.svc.cluster.local:2379
    export ETCD_USER=root
    export ETCD_PASSWORD=$(kubectl get secret ${instanceCRName}-ibm-aios-etcd-secrets -n ${instanceProjectName} -o jsonpath='{.data.etcd-root-password}' | base64 -d)
    export ETCD_CACERT_BASE64=$(kubectl get secret internal-tls -n ${instanceProjectName} -o jsonpath='{.data.ca\.crt}')
    export AIOS_GATEWAY_URL=https://${instanceCRName}-ibm-aios-nginx-internal.${instanceProjectName}
    export AIOS_SERVICE_CREDENTIALS=$(kubectl get secret ibm-aios-icp4d-token -n ${instanceProjectName} -o jsonpath='{.data.token}' | base64 -d)
  6. Navigate to the files folder in the Operator by running the following command:
    cd roles/service/files
  7. Start the restoration by specifying the required arguments as shown in the following example:
    BACKUP_TIMESTAMP='2022-05-30T00:00:00.000Z'
    DATA_MART_IDS='00000000-0000-0000-0000-000000000000,00000000-0000-0000-0000-1652933046713104'
    ./wos_restore.sh -t "${BACKUP_TIMESTAMP}" --delta 30 -i "${DATA_MART_IDS}" -p

    You can specify the -h flag to display help for the restoration script. If you don't specify the -p flag, the restoration runs in preview mode only. You can also save the output of the restoration process to a text or log file.
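
One way to save the output is to pipe it through tee. The following is a minimal sketch, not part of the product: run_logged is a hypothetical helper name, the log file names are placeholders, and it assumes that tee is available in the Operator pod.

```shell
# Hypothetical wrapper: runs a command and copies its stdout and stderr
# to a log file while still showing the output on the terminal.
run_logged() {
  log_file="$1"
  shift
  # Note: the exit status of this pipeline is the status of tee,
  # not the status of the wrapped command.
  "$@" 2>&1 | tee "$log_file"
}

# Example usage (commented out; file names are placeholders):
#   Preview run, without -p:
#     run_logged wos_restore_preview.log ./wos_restore.sh -t "${BACKUP_TIMESTAMP}" --delta 30 -i "${DATA_MART_IDS}"
#   Actual run, with -p:
#     run_logged wos_restore.log ./wos_restore.sh -t "${BACKUP_TIMESTAMP}" --delta 30 -i "${DATA_MART_IDS}" -p
```

Running the preview first and reviewing its log before you repeat the command with -p gives you a record of both runs.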

    The BACKUP_TIMESTAMP attribute is the ISO-8601 timestamp of the Watson OpenScale persistent volumes that back up the Watson OpenScale etcd cluster. You must specify the timestamp in the YYYY-MM-DDTHH:MM:SS.sssZ format.
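
Because the script expects an exact timestamp format, it can help to check the value before you start the restoration. The following sketch is an illustration only: is_valid_backup_timestamp is a hypothetical helper, not part of wos_restore.sh, and it checks only the shape of the value, not whether a backup exists for that time.

```shell
# Hypothetical helper: succeeds only when the argument has the
# YYYY-MM-DDTHH:MM:SS.sssZ shape (month 01-12, day 01-31, hour 00-23).
is_valid_backup_timestamp() {
  pattern='^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])T([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]\.[0-9]{3}Z$'
  printf '%s\n' "$1" | grep -Eq "$pattern"
}

BACKUP_TIMESTAMP='2022-05-30T00:00:00.000Z'
if is_valid_backup_timestamp "$BACKUP_TIMESTAMP"; then
  echo "Timestamp format is valid: $BACKUP_TIMESTAMP"
else
  echo "Timestamp must use the YYYY-MM-DDTHH:MM:SS.sssZ format" >&2
fi
```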

    The DATA_MART_IDS attribute is a comma-separated list of the Watson OpenScale data mart identifiers to restore. The value of a data mart identifier is the Watson OpenScale service instance identifier with the 00000000-0000-0000-0000- prefix. The default Watson OpenScale service instance has the fixed data mart ID 00000000-0000-0000-0000-000000000000.

    You can use the following commands to get a list of Watson OpenScale service instance identifiers:
    cpdAuthToken=$(kubectl get secret ibm-aios-icp4d-token -n ${instanceProjectName} -o jsonpath='{.data.token}' | base64 -d)

    curl -s -k -H "Authorization: Bearer ${cpdAuthToken}" "https://internal-nginx-svc.${instanceProjectName}.svc:12443/zen-data/v3/service_instances?addon_type=aios&fetch_all_instances=true" | jq -r '.service_instances[] | [.id, .display_name] | @tsv'
    The command displays a list of Watson OpenScale service instance ID and name pairs, as shown in the following example:
    1655797537073567	inst2
    1655691348195375	openscale-defaultinstance

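    In this example, the default instance (openscale-defaultinstance) uses the fixed data mart ID and inst2 uses its instance identifier with the prefix, so the value of DATA_MART_IDS is:
    00000000-0000-0000-0000-000000000000,00000000-0000-0000-0000-1655797537073567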
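
The prefix rule can be sketched as a small shell function. This is an illustration under stated assumptions: build_data_mart_ids is a hypothetical helper, it always places the fixed ID of the default instance first, and it expects the identifiers of the non-default service instances as arguments.

```shell
# Values taken from the description of the DATA_MART_IDS attribute.
DATA_MART_PREFIX='00000000-0000-0000-0000-'
DEFAULT_DATA_MART_ID='00000000-0000-0000-0000-000000000000'

# Hypothetical helper: prints a comma-separated data mart ID list.
# Arguments: identifiers of the non-default service instances.
build_data_mart_ids() {
  # Start with the fixed ID of the default service instance.
  ids="$DEFAULT_DATA_MART_ID"
  for instance_id in "$@"; do
    ids="${ids},${DATA_MART_PREFIX}${instance_id}"
  done
  printf '%s\n' "$ids"
}

DATA_MART_IDS=$(build_data_mart_ids 1655797537073567)
echo "$DATA_MART_IDS"
```

If you do not want to restore the default instance, remove its ID from the resulting list before you pass the list to the -i flag.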

  8. Run the exit command to exit the Operator pod.
  9. After the restoration finishes, restart the aios-redis and aios-configuration service pods with the following commands:
    oc delete pod -n ${instanceProjectName} -l app.kubernetes.io/component=aios-redis
    oc delete pod -n ${instanceProjectName} -l app.kubernetes.io/component=aios-configuration
  10. Scale up the Watson OpenScale micro-service deployments with the following command:
    oc scale deployment -n ${instanceProjectName} -l "component in (aios-bias,aios-bkpi,aios-drift,aios-explainability,aios-fast,aios-feedback,aios-ml,aios-mrm,aios-notification,aios-scheduling)" --replicas=1
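
The scale-down and scale-up steps use the same label selector, so defining it once reduces the risk of a mismatch between the two commands. The following sketch assumes the cpd-instance project name from the earlier steps. WOS_COMPONENTS, WOS_SELECTOR, and scale_wos_deployments are hypothetical names introduced here for illustration.

```shell
instanceProjectName='cpd-instance'

# Component names copied from the scale-down and scale-up commands.
WOS_COMPONENTS='aios-bias,aios-bkpi,aios-drift,aios-explainability,aios-fast,aios-feedback,aios-ml,aios-mrm,aios-notification,aios-scheduling'
WOS_SELECTOR="component in (${WOS_COMPONENTS})"

# Hypothetical helper: scales all matching deployments to the given replica count.
scale_wos_deployments() {
  oc scale deployment -n "${instanceProjectName}" -l "${WOS_SELECTOR}" --replicas="$1"
}

# Scale down before the restoration:  scale_wos_deployments 0
# Scale up after the restoration:     scale_wos_deployments 1
```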