Offline backup and restore to a different cluster with the IBM Software Hub OADP utility
A Red Hat® OpenShift® Container Platform cluster administrator can create an offline backup and restore it to a different cluster with the IBM Software Hub OADP utility.
Before you begin
Do the following tasks before you back up and restore an IBM Software Hub deployment.
- Check whether the services that you are using support platform backup and restore by reviewing Services that support backup and restore. You can also run the following command:
  cpd-cli oadp service-registry check \
    --tenant-operator-namespace ${PROJECT_CPD_INST_OPERATORS} \
    --verbose \
    --log-level debug
  If a service is not supported, check whether one of the following alternatives is available:
  - A service might have its own backup and restore process. In such cases, see the links that are provided in the Notes column in Services that support backup and restore.
  - You might be able to migrate service data from one IBM Software Hub installation to another with the data export and import utility. Review Services that support cpd-cli export-import.
- Install the software that is needed to back up and restore IBM Software Hub with the OADP utility.
For more information, see Installing backup and restore software.
- Check that your IBM Software Hub deployment meets the following requirements:
- The minimum deployment profile of IBM Cloud Pak foundational services is Small. For more information about sizing IBM Cloud Pak foundational services, see Hardware requirements and recommendations for foundational services.
- All services are installed at the same IBM Software Hub release. You cannot back up and restore a deployment that is running service versions from different IBM Software Hub releases.
- The control plane is installed in a single project (namespace).
- The IBM Software Hub instance is installed in zero or more tethered projects.
- IBM Software Hub operators and the IBM Software Hub instance are in a good state.
Overview
You can create Restic backups on an S3-compatible object store. Restic is a file system copy technique that is used by OpenShift APIs for Data Protection (OADP) and is based on the Restic open source project. Under OADP, Restic backups can be stored only on S3-compatible object stores.
Backing up an IBM Software Hub deployment and restoring it to a different cluster involves the following high-level steps:
- Preparing to back up IBM Software Hub
- Creating an offline backup
- Preparing to restore IBM Software Hub
- Restoring IBM Software Hub to a different cluster
- Completing post-restore tasks
1. Preparing to back up IBM Software Hub
Complete the following prerequisite tasks before you create an offline backup. Some tasks are service-specific, and need to be done only when those services are installed.
1.1 Creating environment variables
Create the following environment variables so that you can copy commands from the documentation and run them without making any changes.
| Environment variable | Description |
|---|---|
| OC_LOGIN | Shortcut for the oc login command. |
| CPDM_OC_LOGIN | Shortcut for the cpd-cli manage login-to-ocp command. |
| PROJECT_CPD_INST_OPERATORS | The project where the IBM Software Hub instance operators are installed. |
| PROJECT_CPD_INST_OPERANDS | The project where the IBM Software Hub control plane and services are installed. |
| PROJECT_SCHEDULING_SERVICE | The project where the scheduling service is installed. This environment variable is needed only when the scheduling service is installed. |
| PROJECT_CPD_INSTANCE_TETHERED_LIST | The list of tethered projects. This environment variable is needed only when some services are installed in tethered projects. |
| PROJECT_CPD_INSTANCE_TETHERED | The tethered project where a service is installed. This environment variable is needed only when a service is installed in a tethered project. |
| OADP_PROJECT | The project (namespace) where OADP is installed. |
| TENANT_OFFLINE_BACKUP_NAME | The name that you want to use for the offline backup. |
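For example, you might export these variables in your shell before you run the commands in this procedure. The following is a minimal sketch: every value shown is a placeholder, and the login shortcuts assume a server URL and credentials that are valid for your cluster (check cpd-cli manage login-to-ocp --help for the exact options in your release).

```bash
# Placeholder values for illustration only; substitute the names used in your deployment.
export PROJECT_CPD_INST_OPERATORS=cpd-operators
export PROJECT_CPD_INST_OPERANDS=cpd-instance
export PROJECT_SCHEDULING_SERVICE=cpd-scheduler           # only if the scheduling service is installed
export PROJECT_CPD_INSTANCE_TETHERED_LIST=tethered-ns-1   # only if tethered projects are used
export OADP_PROJECT=oadp-operator
export TENANT_OFFLINE_BACKUP_NAME=cpd-offline-backup-1

# Shortcuts for the login commands (replace the server URL and credentials).
export OC_LOGIN="oc login https://api.cluster.example.com:6443 -u <cluster-admin-user> -p <password>"
export CPDM_OC_LOGIN="cpd-cli manage login-to-ocp --server=https://api.cluster.example.com:6443 --username=<cluster-admin-user> --password=<password>"
```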
1.2 Checking the version of OADP utility components
- Check that the OADP operator version is 1.4.x:
  oc get csv -A | grep "OADP Operator"
- Check that the cpd-cli oadp version is 5.1.0:
  cpd-cli oadp version
1.3 Optional: Estimating how much storage to allocate for backups
You can estimate the amount of storage that you need to allocate for backups.
To use this feature, you must install the cpdbr-agent in the Red Hat OpenShift cluster. The cpdbr-agent deploys the node agents to the cluster. The node agents must be run in privileged mode.
- Log in to Red Hat OpenShift Container Platform as a cluster administrator:
  ${OC_LOGIN}
  Remember: OC_LOGIN is an alias for the oc login command.
- Install the cpdbr-agent by running the following command:
  cpd-cli oadp install --component=cpdbr-agent --namespace=${OADP_PROJECT} --cpd-namespace=${PROJECT_CPD_INST_OPERANDS}
- Export the following environment variable:
  export CPDBR_ENABLE_FEATURES=volume-util
- Estimate how much storage you need to allocate to a backup by running the following command:
  cpd-cli oadp du-pv
1.4 Removing MongoDB-related ConfigMaps
Remove the MongoDB-related ConfigMaps by running the following commands:
  oc delete cm zen-cs-aux-br-cm
  oc delete cm zen-cs-aux-ckpt-cm
  oc delete cm zen-cs-aux-qu-cm
  oc delete cm zen-cs2-aux-ckpt-cm
1.5 Checking that the primary instance of every PostgreSQL cluster is in sync with its replicas
The replicas for Cloud Native PostgreSQL and EDB Postgres clusters occasionally get out of sync with the primary node. To check whether this problem exists and to fix the problem, see the troubleshooting topic PostgreSQL cluster replicas get out of sync.
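As a quick, read-only check before you follow that troubleshooting topic, you can list the PostgreSQL cluster resources in the instance project and review their reported status. This sketch assumes the EDB operator's clusters.postgresql.k8s.enterprisedb.io resource kind; <cluster-name> is a placeholder.

```bash
# List EDB Postgres / Cloud Native PostgreSQL clusters and their reported status.
oc get clusters.postgresql.k8s.enterprisedb.io -n ${PROJECT_CPD_INST_OPERANDS}

# Describe one cluster to review details such as the current primary and its instances.
oc describe clusters.postgresql.k8s.enterprisedb.io <cluster-name> -n ${PROJECT_CPD_INST_OPERANDS}
```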
1.6 Excluding external volumes from IBM Software Hub offline backups
You can exclude external Persistent Volume Claims (PVCs) in the IBM Software Hub instance project (namespace) from offline backups.
You might want to exclude PVCs that were manually created in the IBM Software Hub project (namespace) but are not needed by IBM Software Hub services. These volumes might be too large for a backup, or they might already be backed up by other means.
Optionally, you can choose to include PVC YAML definitions in the offline backup, and exclude only the contents of the volumes.
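Before you apply any labels, it can help to review the PVCs in the instance project so that you can identify candidates to exclude. The following read-only sketch only lists PVCs; it does not change anything.

```bash
# List PVCs in the IBM Software Hub instance project with their existing labels.
oc get pvc -n ${PROJECT_CPD_INST_OPERANDS} --show-labels
```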
- Log in to Red Hat OpenShift Container Platform as a cluster administrator.
  ${OC_LOGIN}
  Remember: OC_LOGIN is an alias for the oc login command.
- For backups that are created by using Container Storage Interface (CSI) snapshots, do one of the following:
  - To exclude a PVC YAML definition and the contents of the volume from a backup, label the PVC to exclude with the Velero exclude label:
    oc label pvc <pvc-name> velero.io/exclude-from-backup=true
  - To include a PVC YAML definition in a backup but exclude the contents of the volume, apply the following label to the PVC:
    oc label pvc <volume-name> icpdsupport/empty-on-backup=true
- To exclude both the PVC YAML definition and the contents of the volume from backups that are created by using Restic, do the following steps.
  - Label the PVC to exclude with the Velero exclude label:
    oc label pvc <pvc-name> velero.io/exclude-from-backup=true
  - Label any pods that mount the PVC with the exclude label. In the PVC describe output, look for pods in Mounted By. For each pod, add the label:
    oc describe pvc <pvc-name>
    oc label po <pod-name> velero.io/exclude-from-backup=true
- To include the PVC YAML definition but exclude the contents of the volume in backups that are created by using Restic, apply the following label to the PVC:
  oc label pvc <volume-name> icpdsupport/empty-on-backup=true
1.7 Updating the Common core services ConfigMap
5.1.0 You might need to update the cpd-ccs-maint-br-cm ConfigMap before you create a backup. Do the following steps:
- Check if any common core services download images pod is in a Running state:
  oc get po -l icpdsupport/addOnId=ccs,icpdsupport/module=ccs-common,app=download-images -n ${PROJECT_CPD_INST_OPERANDS}
- If the output of the command shows one or more pods in a Running state, edit the managed-resources section in the cpd-ccs-maint-br-cm ConfigMap to ignore the pod:
  aux-meta:
    managed-resources:
      - resource-kind: pod
        labels: icpdsupport/addOnId=ccs,icpdsupport/module=ccs-common,app=download-images
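If you want to review the ConfigMap before you change it, you can print it first and then open it for editing. A minimal sketch follows; the exact layout of the managed-resources section can vary by release, so adapt the edit to what you see in your ConfigMap.

```bash
# Review the current contents of the ConfigMap.
oc get cm cpd-ccs-maint-br-cm -n ${PROJECT_CPD_INST_OPERANDS} -o yaml

# Open the ConfigMap for editing.
oc edit cm cpd-ccs-maint-br-cm -n ${PROJECT_CPD_INST_OPERANDS}
```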
1.8 Deleting Analytics Engine powered by Apache Spark runtime deployments
If Analytics Engine powered by Apache Spark is installed, delete its runtime deployments by running the following command:
  oc get deploy -n ${PROJECT_CPD_INST_OPERANDS} | grep 'spark-master\|spark-worker' | awk '{print $1}' | xargs oc delete deploy -n ${PROJECT_CPD_INST_OPERANDS}
1.9 Stopping Data Refinery runtimes and jobs
- Log in to Red Hat OpenShift Container Platform as a cluster administrator.
  ${OC_LOGIN}
  Remember: OC_LOGIN is an alias for the oc login command.
- To stop all active Data Refinery runtimes and jobs, run the following commands:
  oc delete $(oc get deployment -l type=shaper -o name)
  oc delete $(oc get svc -l type=shaper -o name)
  oc delete $(oc get job -l type=shaper -o name)
  oc delete $(oc get secrets -l type=shaper -o name)
  oc delete $(oc get cronjobs -l type=shaper -o name)
  oc scale --replicas=0 deploy wdp-shaper wdp-dataprep
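To confirm that the Data Refinery runtime resources were removed and the deployments are scaled to zero, you can run read-only checks such as the following sketch, which assumes that the resources live in the IBM Software Hub instance project.

```bash
# Confirm that no shaper-labeled runtime resources remain.
oc get deploy,svc,job,cronjob,secret -l type=shaper -n ${PROJECT_CPD_INST_OPERANDS}

# Confirm that the Data Refinery deployments report 0 ready replicas.
oc get deploy wdp-shaper wdp-dataprep -n ${PROJECT_CPD_INST_OPERANDS}
```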
1.10 Preparing Db2
- Log in to Red Hat OpenShift Container Platform as a cluster administrator.
  ${OC_LOGIN}
  Remember: OC_LOGIN is an alias for the oc login command.
- Retrieve the names of the IBM Software Hub deployment's Db2U clusters:
  oc get db2ucluster -A -ojsonpath='{.items[?(@.spec.environment.dbType=="db2oltp")].metadata.name}'
- For each Db2U cluster, do the following substeps:
  - Export the Db2U cluster name:
    export DB2UCLUSTER=<db2ucluster_name>
  - Label the cluster:
    oc label db2ucluster ${DB2UCLUSTER} db2u/cpdbr=db2u --overwrite
  - Verify that the Db2U cluster now contains the new label:
    oc get db2ucluster ${DB2UCLUSTER} --show-labels
- For each Db2U cluster, if Q Replication is enabled, stop Q Replication by doing the following steps.
  - Get the Q Replication pod name:
    oc get po -n ${PROJECT_CPD_INST_OPERANDS} | grep ${DB2UCLUSTER} | grep qrep
  - Exec into the Q Replication pod:
    oc exec -it <qrep-pod-name> bash -n ${PROJECT_CPD_INST_OPERANDS}
  - Log in as the dsadm user:
    su - dsadm
  - 5.1.0-5.1.1 Stop the Q Replication monitoring process:
    nohup $BLUDR_HOME/scripts/bin/bludr-monitor-qrep-components-wrapper-utils.sh stop > /dev/null &
  - Stop Q Replication:
    $BLUDR_HOME/scripts/bin/bludr-stop.sh
    When the script has finished running, the following messages appear:
    Stopping bludr replication instance ...
    Stopping replication ...
    REPLICATION ENDED SAFELY
    Stopping BLUDR WLP server...
    Stopping replication REST server instance ...
    SERVER STATUS: INACTIVE
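After you finish labeling, you can optionally confirm that every Db2U cluster carries the backup label with a read-only check such as the following sketch.

```bash
# List all Db2U clusters, in any project, that carry the backup label.
oc get db2ucluster -A -l db2u/cpdbr=db2u --show-labels
```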
1.11 Preparing Db2 Warehouse
- Log in to Red Hat OpenShift Container Platform as a cluster administrator.
  ${OC_LOGIN}
  Remember: OC_LOGIN is an alias for the oc login command.
- Retrieve the names of the IBM Software Hub deployment's Db2U clusters:
  oc get db2ucluster -A -ojsonpath='{.items[?(@.spec.environment.dbType=="db2wh")].metadata.name}'
- For each Db2U cluster, do the following substeps:
  - Export the Db2U cluster name:
    export DB2UCLUSTER=<db2ucluster_name>
  - Label the cluster:
    oc label db2ucluster ${DB2UCLUSTER} db2u/cpdbr=db2u --overwrite
  - Verify that the Db2U cluster now contains the new label:
    oc get db2ucluster ${DB2UCLUSTER} --show-labels
- For each Db2U cluster, if Q Replication is enabled, stop Q Replication by doing the following steps.
  - Get the Q Replication pod name:
    oc get po -n ${PROJECT_CPD_INST_OPERANDS} | grep ${DB2UCLUSTER} | grep qrep
  - Exec into the Q Replication pod:
    oc exec -it <qrep-pod-name> bash -n ${PROJECT_CPD_INST_OPERANDS}
  - Log in as the dsadm user:
    su - dsadm
  - 5.1.0-5.1.1 Stop the Q Replication monitoring process:
    nohup $BLUDR_HOME/scripts/bin/bludr-monitor-qrep-components-wrapper-utils.sh stop > /dev/null &
  - Stop Q Replication:
    $BLUDR_HOME/scripts/bin/bludr-stop.sh
    When the script has finished running, the following messages appear:
    Stopping bludr replication instance ...
    Stopping replication ...
    REPLICATION ENDED SAFELY
    Stopping BLUDR WLP server...
    Stopping replication REST server instance ...
    SERVER STATUS: INACTIVE
1.12 Labeling the IBM Match 360 ConfigMap
Label the IBM Match 360 ConfigMap with the mdm label. Do the following steps:
- Get the ID of the IBM Match 360 instance:
  - From the IBM Software Hub home page, go to .
  - Click the link for the IBM Match 360 instance.
  - Copy the value after mdm- in the URL. For example, if the end of the URL is mdm-1234567891123456, the instance ID is 1234567891123456.
- Create the following environment variable:
  export INSTANCE_ID=<instance-id>
- Add the mdm label by running the following command:
  oc label cm mdm-operator-${INSTANCE_ID} icpdsupport/addOnId=mdm -n ${PROJECT_CPD_INST_OPERANDS}
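You can confirm that the label was applied with a read-only check such as the following sketch.

```bash
# Verify that the IBM Match 360 ConfigMap now carries the mdm label.
oc get cm mdm-operator-${INSTANCE_ID} -n ${PROJECT_CPD_INST_OPERANDS} --show-labels
```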
1.13 Updating the RStudio Server Runtimes backup and restore ConfigMap
5.1.2 and later Update the RStudio® Server Runtimes backup and restore ConfigMap by doing the following steps:
- Create the rstudio-br-patch.sh file.
  Note: Use only spaces (and not tabs) in the file.
  vi rstudio-br-patch.sh
  Add the following commands to the file:
  oc -n ${PROJECT_CPD_INST_OPERANDS} get cm cpd-rstudio-maint-aux-br-cm -o jsonpath='{.data.plan-meta}' > plan-meta.yaml
  sed -i '44d;48,50d' plan-meta.yaml
  sed -i '44i\ sequence: ' plan-meta.yaml
  sed -i '45i\ - group: rstudio-clusterroles ' plan-meta.yaml
  sed -i '46i\ - group: rstudio-crs ' plan-meta.yaml
  echo " sequence: []" >> plan-meta.yaml
  echo "data:" > plan-meta-patch.yaml
  echo " plan-meta: |" >> plan-meta-patch.yaml
  sed 's/^/ /' plan-meta.yaml >> plan-meta-patch.yaml
  oc patch -n ${PROJECT_CPD_INST_OPERANDS} cm cpd-rstudio-maint-aux-br-cm --type=merge --patch-file plan-meta-patch.yaml
- Put the RStudio Server Runtimes service in maintenance mode and wait until the RStudio Server Runtimes custom resources are in the InMaintenance state:
  oc patch -n ${PROJECT_CPD_INST_OPERANDS} rstudioaddon rstudio-cr --type=merge -p '{"spec": {"ignoreForMaintenance":true}}'
  oc -n ${PROJECT_CPD_INST_OPERANDS} get rstudio -w
- Run the rstudio-br-patch.sh file:
  bash rstudio-br-patch.sh
  When the script has finished running, the ConfigMap is updated, and you see the following message:
  configmap/cpd-rstudio-maint-aux-br-cm patched
- Remove the RStudio Server Runtimes service from maintenance mode:
  oc patch -n ${PROJECT_CPD_INST_OPERANDS} rstudioaddon rstudio-cr --type=merge -p '{"spec": {"ignoreForMaintenance":false}}'
  oc -n ${PROJECT_CPD_INST_OPERANDS} get rstudio -w
1.14 Stopping SPSS Modeler runtimes and jobs
- Log in to Red Hat OpenShift Container Platform as a cluster administrator.
  ${OC_LOGIN}
  Remember: OC_LOGIN is an alias for the oc login command.
- To stop all active SPSS Modeler runtimes and jobs, run the following command:
  oc delete rta -l type=service,job -l component=spss-modeler
- To check whether any SPSS Modeler runtime sessions are still running, run the following command:
  oc get pod -l type=spss-modeler
  When no pods are running, no output is produced for this command.
1.15 Backing up Watson Discovery data separately
Before you back up a cluster where the Watson Discovery service is installed, back up the Watson Discovery data separately by running the Watson Discovery backup script. For more information, see Backing up and restoring data.
1.16 Scaling down watsonx.ai deployments
5.1.0 If watsonx.ai™ is installed, manually scale down the following deployments.
- Log in to Red Hat OpenShift Container Platform as a cluster administrator.
  ${OC_LOGIN}
  Remember: OC_LOGIN is an alias for the oc login command.
- Run the following command:
  oc scale deploy caikit-runtime-stack-operator -n ${PROJECT_CPD_INST_OPERATORS} --replicas=0
1.17 Preparing watsonx Code Assistant for Z
5.1.2 and later: If watsonx Code Assistant™ for Z is installed, do the following steps:
- If watsonx Code Assistant for Z includes a GPU node, taint the worker node.
  - Find the GPU node:
    oc get node -L nvidia.com/gpu.replicas | grep -oP '.*[\d]$' | cut -f1 -d' '
  - For each node, make it not preferable to schedule on so that only GPU workloads go there:
    oc adm taint nodes workerX special=true:PreferNoSchedule
- Because the IBM large language model (LLM) is more than 75 GB, expand the minio-storage-pvc PVC size in the Velero project to 100 GB:
  oc patch pvc minio-storage-pvc -n velero --type='merge' -p '{"spec":{"resources":{"requests":{"storage":"100Gi"}}}}'
- Improve the startup performance of the catalog-api-jobs job by increasing the startup probe initial delay to 300s:
  oc patch deployment catalog-api-jobs -n ${PROJECT_CPD_INST_OPERANDS} --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/startupProbe/initialDelaySeconds", "value": 300}]'
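To confirm that the PVC expansion and the startup probe change took effect, you can run read-only checks such as the following sketch.

```bash
# Show the requested size of the MinIO PVC in the Velero project.
oc get pvc minio-storage-pvc -n velero -o jsonpath='{.spec.resources.requests.storage}{"\n"}'

# Show the startup probe initial delay on the catalog-api-jobs deployment.
oc get deployment catalog-api-jobs -n ${PROJECT_CPD_INST_OPERANDS} \
  -o jsonpath='{.spec.template.spec.containers[0].startupProbe.initialDelaySeconds}{"\n"}'
```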
1.18 Checking the status of installed services
Check that the status of each installed service is Completed. Do the following steps:
- Log the cpd-cli in to the Red Hat OpenShift Container Platform cluster:
  ${CPDM_OC_LOGIN}
  Remember: CPDM_OC_LOGIN is an alias for the cpd-cli manage login-to-ocp command.
- Run the following command to get the status of all services:
  cpd-cli manage get-cr-status \
    --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS}
2. Creating an offline backup
Create an offline backup of an IBM Software Hub deployment by doing the following tasks.
2.1 Setting the mode in which to create backups
You can run the IBM Software Hub OADP backup and restore utility in Kubernetes mode or in REST mode.
By default, the IBM Software Hub OADP backup and restore utility runs in Kubernetes mode. In this mode, you must log in to your Red Hat OpenShift cluster and you must have Kubernetes cluster administrator privileges to use the utility.
If you installed the IBM Software Hub OADP backup REST service, you can run the utility in REST mode to create backups. In REST mode, the utility runs as a REST client that communicates with a REST server. The REST service is configured to work with a specific IBM Software Hub instance. You do not have to log in to the cluster, and IBM Software Hub users with the Administrator role can run backup and checkpoint commands on their own IBM Software Hub instances, based on the specified control plane and any tethered projects.
Running the utility in REST mode is useful when you generally create backups only, or when backups take a long time to complete. For backups that take a long time to complete, running the utility in REST mode avoids the problem of the Red Hat OpenShift user session token expiring before the backup process completes. If the session token expires, you must log back in to the cluster and reset the utility.
- Log in to Red Hat OpenShift Container Platform as a cluster administrator:
  ${OC_LOGIN}
  Remember: OC_LOGIN is an alias for the oc login command.
- To create backups in REST mode, run the following command:
  cpd-cli oadp client config set runtime-mode=rest-client
- To change the IBM Software Hub OADP backup and restore utility back to Kubernetes mode, run the following command:
  cpd-cli oadp client config set runtime-mode=
Related topic: Unable to run an online backup or restore operation
2.2 Backing up the scheduling service
If the IBM Software Hub scheduling service is installed, create a backup of the service.
Backups that are created in IBM Cloud Pak for Data 5.0 cannot be restored in IBM Software Hub 5.1.0. You must take new backups in 5.1.0.
Check the Known issues and limitations for IBM Software Hub page for any workarounds that you might need to do before you create a backup.
- If you are running the backup and restore utility in Kubernetes mode, log in to Red Hat OpenShift Container Platform as a cluster administrator:
  ${OC_LOGIN}
  Remember: OC_LOGIN is an alias for the oc login command.
- Configure the OADP client to set the IBM Software Hub project to the scheduling service project:
  cpd-cli oadp client config set cpd-namespace=${PROJECT_SCHEDULING_SERVICE}
- Configure the OADP client to set the OADP project to the project where the OADP operator is installed:
  cpd-cli oadp client config set namespace=${OADP_PROJECT}
- Run service backup prechecks:
  - IBM Software Hub 5.1.0:
    cpd-cli oadp backup precheck \
      --include-namespaces=${PROJECT_SCHEDULING_SERVICE} \
      --log-level=debug \
      --verbose \
      --hook-kind=br
  - IBM Software Hub 5.1.1 and later:
    cpd-cli oadp backup precheck \
      --backup-type singleton \
      --include-namespaces=${PROJECT_SCHEDULING_SERVICE} \
      --log-level=debug \
      --verbose \
      --hook-kind=br
- Back up the IBM Software Hub scheduling service:
  - The cluster pulls images from the IBM Entitled Registry:
    - IBM Software Hub 5.1.0:
      cpd-cli oadp backup create ${PROJECT_SCHEDULING_SERVICE}-offline \
        --include-namespaces ${PROJECT_SCHEDULING_SERVICE} \
        --include-resources='operatorgroups,configmaps,catalogsources.operators.coreos.com,subscriptions.operators.coreos.com,customresourcedefinitions.apiextensions.k8s.io,scheduling.scheduler.spectrumcomputing.ibm.com' \
        --prehooks=true \
        --posthooks=true \
        --log-level=debug \
        --verbose \
        --hook-kind=br \
        --selector 'velero.io/exclude-from-backup notin (true)' \
        --image-prefix=registry.redhat.io/ubi9
    - IBM Software Hub 5.1.1 and later:
      cpd-cli oadp backup create ${PROJECT_SCHEDULING_SERVICE}-offline \
        --backup-type singleton \
        --include-namespaces ${PROJECT_SCHEDULING_SERVICE} \
        --include-resources='operatorgroups,configmaps,catalogsources.operators.coreos.com,subscriptions.operators.coreos.com,customresourcedefinitions.apiextensions.k8s.io,scheduling.scheduler.spectrumcomputing.ibm.com' \
        --prehooks=true \
        --posthooks=true \
        --log-level=debug \
        --verbose \
        --hook-kind=br \
        --selector 'velero.io/exclude-from-backup notin (true)' \
        --image-prefix=registry.redhat.io/ubi9
  - The cluster pulls images from a private container registry:
    - IBM Software Hub 5.1.0:
      cpd-cli oadp backup create ${PROJECT_SCHEDULING_SERVICE}-offline \
        --include-namespaces ${PROJECT_SCHEDULING_SERVICE} \
        --include-resources='operatorgroups,configmaps,catalogsources.operators.coreos.com,subscriptions.operators.coreos.com,customresourcedefinitions.apiextensions.k8s.io,scheduling.scheduler.spectrumcomputing.ibm.com' \
        --prehooks=true \
        --posthooks=true \
        --log-level=debug \
        --verbose \
        --hook-kind=br \
        --selector 'velero.io/exclude-from-backup notin (true)' \
        --image-prefix=PRIVATE_REGISTRY_LOCATION/ubi9
    - IBM Software Hub 5.1.1 and later:
      cpd-cli oadp backup create ${PROJECT_SCHEDULING_SERVICE}-offline \
        --backup-type singleton \
        --include-namespaces ${PROJECT_SCHEDULING_SERVICE} \
        --include-resources='operatorgroups,configmaps,catalogsources.operators.coreos.com,subscriptions.operators.coreos.com,customresourcedefinitions.apiextensions.k8s.io,scheduling.scheduler.spectrumcomputing.ibm.com' \
        --prehooks=true \
        --posthooks=true \
        --log-level=debug \
        --verbose \
        --hook-kind=br \
        --selector 'velero.io/exclude-from-backup notin (true)' \
        --image-prefix=PRIVATE_REGISTRY_LOCATION/ubi9
- Validate the backup:
  - IBM Software Hub 5.1.0:
    cpd-cli oadp backup validate \
      --include-namespaces=${PROJECT_SCHEDULING_SERVICE} \
      --backup-names ${PROJECT_SCHEDULING_SERVICE}-offline \
      --log-level trace \
      --verbose \
      --hook-kind=br
  - IBM Software Hub 5.1.1 and later:
    cpd-cli oadp backup validate \
      --backup-type singleton \
      --include-namespaces=${PROJECT_SCHEDULING_SERVICE} \
      --backup-names ${PROJECT_SCHEDULING_SERVICE}-offline \
      --log-level trace \
      --verbose \
      --hook-kind=br
2.3 Backing up an IBM Software Hub instance
Create an offline backup of each IBM Software Hub instance, or tenant, in your environment by doing the following steps.
- To create Restic backups, if IBM Software Hub is installed on NFS, NFS storage must be configured with no_root_squash.
- When backup commands are run, some pods remain in a Running state. These running pods do not affect the backup process, and you do not need to manually shut them down.
- For s390x clusters (IBM Z and LinuxONE), you must run the backup and restore commands from an x86_64 workstation.
- This section shows you how to create a backup by using the IBM Software Hub 5.1.0 command. You can still create a backup by using the IBM Cloud Pak for Data 5.0 backup commands instead. For details, see Creating an offline backup of IBM Cloud Pak for Data with the OADP utility in the IBM Cloud Pak for Data 5.0 documentation.
Check the Known issues and limitations for IBM Software Hub page for any workarounds that you might need to do before you create a backup.
- If you are running the backup and restore utility in Kubernetes mode, log in to Red Hat OpenShift Container Platform as a cluster administrator:
  ${OC_LOGIN}
  Remember: OC_LOGIN is an alias for the oc login command.
- 5.1.0 Ensure that the expected EDB Postgres replica PVCs are included in the backup:
  oc label pvc,pods -l k8s.enterprisedb.io/cluster,velero.io/exclude-from-backup=true velero.io/exclude-from-backup- -n ${PROJECT_CPD_INST_OPERANDS}
- Create a backup by running one of the following commands.
  - The cluster pulls images from the IBM Entitled Registry:
    cpd-cli oadp tenant-backup create ${TENANT_OFFLINE_BACKUP_NAME} \
      --namespace ${OADP_PROJECT} \
      --vol-mnt-pod-mem-request=1Gi \
      --vol-mnt-pod-mem-limit=4Gi \
      --tenant-operator-namespace ${PROJECT_CPD_INST_OPERATORS} \
      --mode offline \
      --image-prefix=registry.redhat.io/ubi9 \
      --log-level=debug \
      --verbose &> ${TENANT_OFFLINE_BACKUP_NAME}.log&
  - The cluster pulls images from a private container registry:
    cpd-cli oadp tenant-backup create ${TENANT_OFFLINE_BACKUP_NAME} \
      --namespace ${OADP_PROJECT} \
      --vol-mnt-pod-mem-request=1Gi \
      --vol-mnt-pod-mem-limit=4Gi \
      --tenant-operator-namespace ${PROJECT_CPD_INST_OPERATORS} \
      --mode offline \
      --image-prefix=PRIVATE_REGISTRY_LOCATION/ubi9 \
      --log-level=debug \
      --verbose &> ${TENANT_OFFLINE_BACKUP_NAME}.log&
  Note: If the backup fails during the volume backup stage, try increasing the --vol-mnt-pod-mem-limit option. You might need to increase this option when you have terabytes of data.
- Confirm that the tenant backup was created and has a Completed status:
  cpd-cli oadp tenant-backup list
- To view the detailed status of the backup, run the following command:
  cpd-cli oadp tenant-backup status ${TENANT_OFFLINE_BACKUP_NAME} \
    --details
  The command shows the following sub-backups:
  | Backup | Description |
  |---|---|
  | cpd-tenant-xxx | Backup that contains Kubernetes resources. |
  | cpd-tenant-vol-yyy | Backup that contains volume data. |
  Tip: If you need more information, the status details list sub-backups (of type group). You can view more information about these sub-backups by running the following command:
  cpd-cli oadp backup status <SUB_BACKUP_NAME> \
    --details
- To view logs of the tenant backup and all sub-backups, run the following command:
  cpd-cli oadp tenant-backup log ${TENANT_OFFLINE_BACKUP_NAME}
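Because the tenant-backup command runs in the background and redirects its output to a log file, you can follow progress from the same workstation. A minimal sketch follows; it assumes the watch utility is available on your workstation.

```bash
# Follow the log that the backgrounded backup command writes.
tail -f ${TENANT_OFFLINE_BACKUP_NAME}.log

# Periodically re-check the overall backup status.
watch -n 60 "cpd-cli oadp tenant-backup list"
```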
2.4 Doing post-backup tasks
For some services, you must do additional tasks after you create an offline backup.
- 5.1.2 and later If RStudio Server Runtimes is installed, remove the RStudio Server Runtimes service from maintenance mode:
  oc patch -n ${PROJECT_CPD_INST_OPERANDS} rstudioaddon rstudio-cr --type=merge -p '{"spec": {"ignoreForMaintenance":false}}'
  oc -n ${PROJECT_CPD_INST_OPERANDS} get rstudio -w
- 5.1.0 If Data Refinery is installed, restart the service:
  - Log in to Red Hat OpenShift Container Platform as a cluster administrator.
    ${OC_LOGIN}
    Remember: OC_LOGIN is an alias for the oc login command.
  - Run the following command. The value of <number_of_replicas> depends on the scaleConfig setting when Data Refinery was installed (1 for small, 3 for medium, and 4 for large).
    oc scale --replicas=<number_of_replicas> deploy wdp-shaper wdp-dataprep
- 5.1.0 If watsonx.ai is installed, manually scale up the following deployment:
  - Log in to Red Hat OpenShift Container Platform as a cluster administrator.
    ${OC_LOGIN}
    Remember: OC_LOGIN is an alias for the oc login command.
  - Wait for watsonxai-cr to reach the Completed state:
    oc get watsonxai -n ${PROJECT_CPD_INST_OPERANDS}
    Check that the command returns output such as in the following example:
    NAME           VERSION   RECONCILED   STATUS      AGE
    watsonxai-cr   9.1.0     9.1.0        Completed   4d5h
  - Scale up the following deployment:
    oc scale deploy caikit-runtime-stack-operator -n ${PROJECT_CPD_INST_OPERATORS} --replicas=1
3. Preparing to restore IBM Software Hub to a different cluster
Complete the following prerequisite tasks before you restore an offline backup. Some tasks are service-specific, and need to be done only when those services are installed.
3.1 Preparing the target cluster
Prepare the target cluster that you want to use to restore IBM Software Hub.
- Make sure that the target cluster meets the following requirements:
- The target cluster has the same storage classes as the source cluster.
- For environments that use a private container registry, such as air-gapped environments, the target cluster has the same image content source policy as the source cluster. For details on configuring the image content source policy, see Configuring an image content source policy for IBM Software Hub software images.
- The target cluster must be able to pull software images. For details, see Updating the global image pull secret for IBM Software Hub.
- The deployment environment of the target cluster is the same as the source cluster.
- The target cluster uses the same hardware architecture as the source cluster. For example, x86-64.
- The target cluster is on the same OpenShift version as the source cluster.
- The target cluster allows for the same node configuration as the source cluster. For example, if the source cluster uses a custom KubeletConfig, the target cluster must allow the same custom KubeletConfig.
- Moving between IBM Cloud and non-IBM Cloud deployment environments is not supported.
- If you are using node labels as the method for identifying nodes in the cluster, re-create the labels on the target cluster.
  Best practice: Use node labels instead of node lists when you are restoring an IBM Software Hub deployment to a different cluster, especially if you plan to enforce node pinning. Node labels enable node pinning with minimal disruption. To learn more, see Passing node information to IBM Software Hub.
- If your cluster pulls images from a private container registry or if your cluster is in a restricted network, push images that the OADP backup and restore utility needs to the private container registry so that users can run the restore commands against the cluster. For details, see 2. Moving images for backup and restore to a private container registry.
- Install the components that the OADP backup and restore utility uses.
  Tip: You need to use the same configuration information that you specified in the source cluster. For example, when you install OADP, use the same credentials and DataProtectionApplication configuration that was specified on the source cluster.
  - Log in to Red Hat OpenShift Container Platform as a cluster administrator.
    ${OC_LOGIN}
    Remember: OC_LOGIN is an alias for the oc login command.
  - Create the environment variables that the utility needs so that you can copy commands from the documentation and run them without making any changes.
  - Create the ${OADP_PROJECT} project where you want to install the OADP operator.
  - Annotate the ${OADP_PROJECT} project so that Restic pods can be scheduled on all nodes.
    oc annotate namespace ${OADP_PROJECT} openshift.io/node-selector=""
  - Install the cpdbr-tenant service role-based access controls (RBACs).
    Note: Run the cpdbr installation command in the IBM Software Hub operators project even though the project does not yet exist in the target cluster. Do not manually create the project on the target cluster. The project is created during the IBM Software Hub restore process.
    - The cluster pulls images from the IBM Entitled Registry:
      cpd-cli oadp install \
        --component=cpdbr-tenant \
        --namespace ${OADP_PROJECT} \
        --tenant-operator-namespace ${PROJECT_CPD_INST_OPERATORS} \
        --rbac-only \
        --log-level=debug \
        --verbose
    - The cluster pulls images from a private container registry:
      cpd-cli oadp install \
        --component=cpdbr-tenant \
        --namespace ${OADP_PROJECT} \
        --tenant-operator-namespace ${PROJECT_CPD_INST_OPERATORS} \
        --cpdbr-hooks-image-prefix=${PRIVATE_REGISTRY}/cpdbr-oadp:${VERSION} \
        --cpfs-image-prefix=${PRIVATE_REGISTRY} \
        --rbac-only \
        --log-level=debug \
        --verbose
  - Install the Red Hat OADP operator.
    - To install operators in a cluster with Internet access, see the Red Hat documentation Adding Operators to a cluster.
    - To install operators in a cluster that is in a restricted network, see the Red Hat documentation Using Operator Lifecycle Manager on restricted networks.
  - Create a secret in the ${OADP_PROJECT} project with the credentials of the S3-compatible object store that you are using to store the backups. Credentials must use alphanumeric characters and cannot contain special characters like the number sign (#).
    - Create a file named credentials-velero that contains the credentials for the object store:
      cat << EOF > credentials-velero
      [default]
      aws_access_key_id=${ACCESS_KEY_ID}
      aws_secret_access_key=${SECRET_ACCESS_KEY}
      EOF
    - Create the secret. The name of the secret must be cloud-credentials.
      oc create secret generic cloud-credentials \
        --namespace ${OADP_PROJECT} \
        --from-file cloud=./credentials-velero
  - Create the DataProtectionApplication (DPA) custom resource, and specify a name for the instance.
    Tip: You can create the DPA custom resource manually or by using the cpd-cli oadp dpa create command. However, if you use this command, you might need to edit the custom resource afterward to add options that are not available with the command. This step shows you how to manually create the custom resource.
    You might need to change some values:
    - spec.configuration.restic.memory specifies the Restic memory limit. You might need to increase the Restic memory limit if Restic volume backups fail or hang on a large volume, indicated by Restic pod containers restarting due to an OOMKilled Kubernetes error.
    - If the object store is Amazon S3, you can omit s3ForcePathStyle.
    - For object stores with a self-signed certificate, add backupLocations.velero.objectStorage.caCert and specify the base64 encoded certificate string as the value. For more information, see Use Self-Signed Certificate.
    Important:
    - spec.configuration.nodeAgent.timeout specifies the Restic timeout. The default is 1 hour. You might need to increase the Restic timeout if a Restic backup or restore fails, indicated by pod volume timeout errors in the Velero log.
    - If only Restic backups are needed, under spec.configuration.velero.defaultPlugins, remove csi.
    - The object storage information (backupLocations.velero.objectStorage) in the source and target cluster DPA configurations must be identical.
    Recommended DPA configuration:
    The following example shows the recommended DPA configuration.
    cat << EOF | oc apply -f -
    apiVersion: oadp.openshift.io/v1alpha1
    kind: DataProtectionApplication
    metadata:
      name: dpa-sample
    spec:
      configuration:
        velero:
          customPlugins:
          - image: ${CPDBR_VELERO_PLUGIN_IMAGE_LOCATION}
            name: cpdbr-velero-plugin
          defaultPlugins:
          - aws
          - openshift
          - csi
          podConfig:
            resourceAllocations:
              limits:
                cpu: "${VELERO_POD_CPU_LIMIT}"
                memory: 4Gi
              requests:
                cpu: 500m
                memory: 256Mi
          resourceTimeout: 60m
        nodeAgent:
          enable: true
          uploaderType: restic
          timeout: 72h
          podConfig:
            resourceAllocations:
              limits:
                cpu: "${NODE_AGENT_POD_CPU_LIMIT}"
                memory: 32Gi
              requests:
                cpu: 500m
                memory: 256Mi
            tolerations:
            - key: icp4data
              operator: Exists
              effect: NoSchedule
      backupImages: false
      backupLocations:
      - velero:
          provider: aws
          default: true
          objectStorage:
            bucket: ${BUCKET_NAME}
            prefix: ${BUCKET_PREFIX}
          config:
            region: ${REGION}
            s3ForcePathStyle: "true"
            s3Url: ${S3_URL}
          credential:
            name: cloud-credentials
            key: cloud
    EOF
    5.1.0-5.1.2 DPA configuration if watsonx™ Orchestrate is installed:
    If your IBM Software Hub deployment includes watsonx Orchestrate, add the appcon-plugin to the DPA configuration. To obtain the link to the appcon-plugin image, see Backing up and restoring your IBM App Connect resources and persistent volumes on Red Hat OpenShift.
    cat << EOF | oc apply -f -
    apiVersion: oadp.openshift.io/v1alpha1
    kind: DataProtectionApplication
    metadata:
      name: dpa-sample
    spec:
      configuration:
        velero:
          customPlugins:
          - image: ${CPDBR_VELERO_PLUGIN_IMAGE_LOCATION}
            name: cpdbr-velero-plugin
          - image: '<appcon-plugin-image-link>'
            name: appcon-plugin
          defaultPlugins:
          - aws
          - openshift
          - csi
          podConfig:
            resourceAllocations:
              limits:
                cpu: "${VELERO_POD_CPU_LIMIT}"
                memory: 4Gi
              requests:
                cpu: 500m
                memory: 256Mi
          resourceTimeout: 60m
        nodeAgent:
          enable: true
          uploaderType: restic
          timeout: 72h
          podConfig:
            resourceAllocations:
              limits:
                cpu: "${NODE_AGENT_POD_CPU_LIMIT}"
                memory: 32Gi
              requests:
                cpu: 500m
                memory: 256Mi
            tolerations:
            - key: icp4data
              operator: Exists
              effect: NoSchedule
      backupImages: false
      backupLocations:
      - velero:
          provider: aws
          default: true
          objectStorage:
            bucket: ${BUCKET_NAME}
            prefix: ${BUCKET_PREFIX}
          config:
            region: ${REGION}
            s3ForcePathStyle: "true"
            s3Url: ${S3_URL}
          credential:
            name: cloud-credentials
            key: cloud
    EOF
  - After you create the DPA, do the following checks.
    - Check that the velero pods are running in the ${OADP_PROJECT} project.
      oc get po -n ${OADP_PROJECT}
      The node-agent daemonset creates one node-agent pod for each worker node. For example:
      NAME                                                READY   STATUS    RESTARTS   AGE
      openshift-adp-controller-manager-678f6998bf-fnv8p   2/2     Running   0          55m
      node-agent-455wd                                    1/1     Running   0          49m
      node-agent-5g4n8                                    1/1     Running   0          49m
      node-agent-6z9v2                                    1/1     Running   0          49m
      node-agent-722x8                                    1/1     Running   0          49m
      node-agent-c8qh4                                    1/1     Running   0          49m
      node-agent-lcqqg                                    1/1     Running   0          49m
      node-agent-v6gbj                                    1/1     Running   0          49m
      node-agent-xb9j8                                    1/1     Running   0          49m
      node-agent-zjngp                                    1/1     Running   0          49m
      velero-7d847d5bb7-zm6vd                             1/1     Running   0          49m
    - Verify that the backup storage location PHASE is Available.
      cpd-cli oadp backup-location list
      Example output:
      NAME           PROVIDER   BUCKET           PREFIX             PHASE       LAST VALIDATED   ACCESS MODE
      dpa-sample-1   aws        ${BUCKET_NAME}   ${BUCKET_PREFIX}   Available   <timestamp>
- Log in to Red Hat OpenShift Container Platform as a cluster administrator.
- Install the jq JSON command-line utility.
- Configure the IBM Software Hub OADP backup and restore utility.
- Install Certificate manager and the IBM License Service. For details, see Installing shared cluster components for IBM Software Hub.
  Note: You must install the same version of Certificate manager and the IBM License Service that is installed on the source cluster.
- If IBM Knowledge Catalog Premium or IBM Knowledge Catalog Standard is installed, install Red Hat OpenShift AI.
3.2 Cleaning up the target cluster after a previous restore
If you previously restored an IBM Software Hub backup or a previous restore attempt was unsuccessful, delete the IBM Software Hub instance projects (namespaces) in the target cluster before you try another restore.
Resources in the IBM Software Hub instance are watched and managed by operators and controllers that run in other projects. To prevent corruption or out-of-sync operators and resources when you delete an IBM Software Hub instance, Kubernetes resources that have finalizers specified in metadata must be located, and those finalizers must be deleted before you can delete the IBM Software Hub instance.
- Log in to Red Hat OpenShift Container Platform as an instance administrator.
  ${OC_LOGIN}
  Remember: OC_LOGIN is an alias for the oc login command.
- Download the cpd-pre-restore-cleanup.sh script from https://github.com/IBM/cpd-cli/tree/master/cpdops/5.1.3.
- If the tenant operator project exists and has the common-service NamespaceScope custom resource that identifies all the tenant projects, run the following command:
  ./cpd-pre-restore-cleanup.sh --tenant-operator-namespace="${PROJECT_CPD_INST_OPERATORS}"
- If the tenant operator project does not exist or specific IBM Software Hub projects need to be deleted, run the following command. If the common-service NamespaceScope custom resource is not available and additional projects, such as tethered projects, need to be deleted, modify the list of comma-separated projects in the --additional-namespaces option as necessary.
  ./cpd-pre-restore-cleanup.sh --additional-namespaces="${PROJECT_CPD_INST_OPERATORS},${PROJECT_CPD_INST_OPERANDS}"
- If the IBM Software Hub scheduling service was installed, uninstall it. For details, see Uninstalling the scheduling service.
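After the cleanup completes, you can confirm that the IBM Software Hub projects are gone before you start the restore. The following read-only sketch should report that the projects are not found; add any tethered projects that apply to your deployment.

```bash
# Each command should return a "not found" error once cleanup is complete.
oc get namespace ${PROJECT_CPD_INST_OPERATORS}
oc get namespace ${PROJECT_CPD_INST_OPERANDS}
```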
4. Restoring IBM Software Hub to a different cluster
Restore an offline backup of an IBM Software Hub deployment to a different cluster by doing the following tasks.
4.1 Restoring the scheduling service
If the IBM Software Hub scheduling service is installed on the source cluster, restore the service on the target cluster by doing the following steps.
Check the Known issues and limitations for IBM Software Hub page for any workarounds that you might need to do before you restore a backup.
- Log in to Red Hat OpenShift Container Platform as a cluster administrator:
  ${OC_LOGIN}
  Remember: OC_LOGIN is an alias for the oc login command.
- Restore an offline backup by running one of the following commands.
  - The cluster pulls images from the IBM Entitled Registry:
    cpd-cli oadp restore create ${PROJECT_SCHEDULING_SERVICE}-restore \
      --from-backup=${PROJECT_SCHEDULING_SERVICE}-offline \
      --include-resources='operatorgroups,configmaps,catalogsources.operators.coreos.com,subscriptions.operators.coreos.com,customresourcedefinitions.apiextensions.k8s.io,scheduling.scheduler.spectrumcomputing.ibm.com' \
      --include-cluster-resources=true \
      --skip-hooks \
      --log-level=debug \
      --verbose \
      --image-prefix=registry.redhat.io/ubi9
  - The cluster pulls images from a private container registry:
    cpd-cli oadp restore create ${PROJECT_SCHEDULING_SERVICE}-restore \
      --from-backup=${PROJECT_SCHEDULING_SERVICE}-offline \
      --include-resources='operatorgroups,configmaps,catalogsources.operators.coreos.com,subscriptions.operators.coreos.com,customresourcedefinitions.apiextensions.k8s.io,scheduling.scheduler.spectrumcomputing.ibm.com' \
      --include-cluster-resources=true \
      --skip-hooks \
      --log-level=debug \
      --verbose \
      --image-prefix=${PRIVATE_REGISTRY_LOCATION}
4.2 Restoring an IBM Software Hub instance
Restore an IBM Software Hub instance to a different cluster by doing the following steps.
- You cannot restore a backup to a different project of the IBM Software Hub instance.
- If service-related custom resources are manually placed into maintenance mode before an online backup is created, those custom resources remain in the same state when the backup is restored. You must take these services out of maintenance mode manually after the restore.
- For s390x clusters (IBM Z and LinuxONE), you must run the backup and restore commands from an x86_64 workstation.
- If running a restore command produces a Failed or PartiallyFailed error, you must clean up the IBM Software Hub instance and restart the restore process.
Check the Known issues and limitations for IBM Software Hub page for any workarounds that you might need to do before you restore a backup.
- Log in to Red Hat OpenShift Container Platform as a cluster administrator:
  ${OC_LOGIN}
  Remember: OC_LOGIN is an alias for the oc login command.
- Restore IBM Software Hub by running one of the following commands.
  - The cluster pulls images from the IBM Entitled Registry:
    cpd-cli oadp tenant-restore create ${TENANT_OFFLINE_BACKUP_NAME}-restore \
      --from-tenant-backup ${TENANT_OFFLINE_BACKUP_NAME} \
      --image-prefix=registry.redhat.io/ubi9 \
      --verbose \
      --log-level=debug &> ${TENANT_OFFLINE_BACKUP_NAME}-restore.log&
  - The cluster pulls images from a private container registry:
    cpd-cli oadp tenant-restore create ${TENANT_OFFLINE_BACKUP_NAME}-restore \
      --from-tenant-backup ${TENANT_OFFLINE_BACKUP_NAME} \
      --image-prefix=${PRIVATE_REGISTRY_LOCATION}/ubi9 \
      --verbose \
      --log-level=debug &> ${TENANT_OFFLINE_BACKUP_NAME}-restore.log&
- Get the status of the installed components:
  - Log the cpd-cli in to the Red Hat OpenShift Container Platform cluster:
    ${CPDM_OC_LOGIN}
    Remember: CPDM_OC_LOGIN is an alias for the cpd-cli manage login-to-ocp command.
  - Run the appropriate command for your environment:
    Installations with tethered projects:
    cpd-cli manage get-cr-status \
      --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
      --tethered_instance_ns=${PROJECT_CPD_INSTANCE_TETHERED_LIST}
    Installations without tethered projects:
    cpd-cli manage get-cr-status \
      --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS}
  - Ensure that the status of all of the services is Completed or Succeeded.
- To view a list of restores, run the following command:
  cpd-cli oadp tenant-restore list
- To view the detailed status of the restore, run the following command:
  cpd-cli oadp tenant-restore status ${TENANT_OFFLINE_BACKUP_NAME}-restore \
    --details
  The command shows a varying number of sub-restores in the following form: cpd-tenant-r-xxx
  Tip: If you need more information, the status details list sub-restores (of type group). You can view more information about these sub-restores by running the following command:
  cpd-cli oadp restore status <SUB_RESTORE_NAME> \
    --details
- To view logs of the tenant restore, run the following command:
  cpd-cli oadp tenant-restore log ${TENANT_OFFLINE_BACKUP_NAME}-restore
5. Completing post-restore tasks
Complete additional tasks for the control plane and some services after you restore an IBM Software Hub deployment from an offline backup.
5.1 Applying cluster HTTP proxy settings or other RSI patches to the control plane
Apply cluster HTTP proxy settings or other RSI patches to the control plane by running the following command:
  cpd-cli manage apply-rsi-patches --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} -vvv
5.2 Patching Cognos Analytics instances
- Patch the content store and audit database ports in the Cognos Analytics service instance by running the following script:
  #!/usr/bin/env bash
  #-----------------------------------------------------------------------------
  #Licensed Materials - Property of IBM
  #IBM Cognos Products: ca
  #(C) Copyright IBM Corp. 2024
  #US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule
  #-----------------------------------------------------------------------------
  set -e
  #set -x

  function usage {
    echo $0: usage: $0 [-h] -t tethered_namespace -a audit_db_port_number -c cs_db_port_number [-v]
  }

  function help {
    usage
    echo "-h prints help to the console"
    echo "-t tethered namespace (required)"
    echo "-a Audit DB port number"
    echo "-c CS DB port number"
    echo "-v turn on verbose mode"
    echo ""
    exit 0
  }

  while getopts ":ht:a:c:v" opt; do
    case ${opt} in
      h)
        help
        ;;
      t)
        tethered_namespace=$OPTARG
        ;;
      a)
        audit_db_port_number=$OPTARG
        ;;
      c)
        cs_db_port_number=$OPTARG
        ;;
      v)
        verbose_flag="true"
        ;;
      ?)
        usage
        exit 0
        ;;
    esac
  done

  if [[ -z ${tethered_namespace} ]]; then
    echo "A tethered namespace must be provided"
    help
  fi

  echo "Get CAServiceInstance Name"
  cr_name=$(oc -n ${tethered_namespace} get caserviceinstance --no-headers -o custom-columns=NAME:.metadata.name)
  if [[ -z ${cr_name} ]]; then
    echo "Unable to find CAServiceInstance CR for namespace: ${tethered_namespace}"
    help
  fi

  if [[ ! -z ${cs_db_port_number} ]]; then
    echo "Updating CS Database Port Number in the Custom Resource ${cr_name}..."
    oc patch caserviceinstance ${cr_name} --type merge -p "{\"spec\":{\"cs\":{\"database_port\":\"${cs_db_port_number}\"}}}" -n ${tethered_namespace}
  fi

  if [[ ! -z ${audit_db_port_number} ]]; then
    echo "Updating Audit Database Port Number in the Custom Resource ${cr_name}..."
    oc patch caserviceinstance ${cr_name} --type merge -p "{\"spec\":{\"audit\":{\"database_port\":\"${audit_db_port_number}\" }}}" -n ${tethered_namespace}
  fi

  sleep 20
  check_status="Completed"
- Check the status of the Cognos Analytics reconcile action:
  for i in {1..240};do
    caStatus=$(oc get caserviceinstance ${cr_name} -o jsonpath="{.status.caStatus}" -n ${tethered_namespace})
    if [[ ${caStatus} == ${check_status} ]];then
      echo "ca ${check_status} Successfully"
      break
    elif [[ ${caStatus} == "Failed" ]];then
      echo "ca ${caStatus}!"
      exit 1
    fi
    echo "ca Status: ${caStatus}"
    sleep 30
  done
5.3 Restarting Data Replication replications
- Connect to the restored IBM Software Hub instance.
- Go to the restored replications and stop them.
- Restart the replications.
5.4 Restoring Db2
5.5 Restoring Db2 Warehouse
5.6 Restarting IBM Knowledge Catalog lineage pods
If IBM Knowledge Catalog is installed, restart the following lineage pods after the restore:
- wkc-data-lineage-service-xxx
- wdp-kg-ingestion-service-xxx
- Log in to Red Hat OpenShift Container Platform as a cluster administrator:
  ${OC_LOGIN}
  Remember: OC_LOGIN is an alias for the oc login command.
- Restart the wkc-data-lineage-service-xxx pod:
  oc delete -n ${PROJECT_CPD_INST_OPERANDS} "$(oc get pods -o name -n ${PROJECT_CPD_INST_OPERANDS} | grep wkc-data-lineage-service)"
- Restart the wdp-kg-ingestion-service-xxx pod:
  oc delete -n ${PROJECT_CPD_INST_OPERANDS} "$(oc get pods -o name -n ${PROJECT_CPD_INST_OPERANDS} | grep wdp-kg-ingestion-service)"
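You can confirm that replacement pods are created and reach the Running state with a read-only check such as the following sketch.

```bash
# Check that the lineage pods were re-created and are running.
oc get pods -n ${PROJECT_CPD_INST_OPERANDS} | grep -E 'wkc-data-lineage-service|wdp-kg-ingestion-service'
```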
5.7 Verifying the Watson Machine Learning restore operation
After restoring from a backup, users might be unable to deploy new models and score existing models. To resolve this issue, after the restore operation, wait until operator reconciliation completes.
- Log in to Red Hat OpenShift Container Platform as a user with sufficient permissions to complete the task.
  ${OC_LOGIN}
  Remember: OC_LOGIN is an alias for the oc login command.
- Check the status of the operator with the following commands:
  export PROJECT_WML=<wml-namespace>
  kubectl describe WmlBase wml-cr -n ${PROJECT_WML} | grep "Wml Status" | awk '{print $3}'
- After backup and restore operations, before using Watson Machine Learning, make sure that the wml-cr is in completed state and all the wml pods are in running state. Use this command to check that all wml pods are in running state:
  oc get pods -n <wml-namespace> -l release=wml
5.8 Retraining existing watsonx Assistant skills
After you restore the watsonx Assistant backup, you must retrain the existing skills. Retraining involves modifying a skill to trigger training. The training process for a skill typically requires less than 10 minutes to complete. For more information, see the Retraining your backend model section in the IBM Cloud documentation.
5.9 Starting the ibm-granite-20b-code-cobol-v1-predictor pod in the watsonx Code Assistant for Z service
5.1.2 and later If the ibm-granite-20b-code-cobol-v1-predictor pod is not running, start it.
- Check whether the ibm-granite-20b-code-cobol-v1-predictor pod is in a Running state by running the following command:
  oc get po -n ${PROJECT_CPD_INST_OPERANDS} | grep ibm-granite-20b-code-cobol-v1-predictor
- If the ibm-granite-20b-code-cobol-v1-predictor pod is not in a Running state, edit its deployment:
  oc edit deploy -n ${PROJECT_CPD_INST_OPERANDS} ibm-granite-20b-code-cobol-v1-predictor
- Under startupProbe, check whether initialDelaySeconds: 200 is missing. If it is missing, add it:
  ...
  startupProbe:
    failureThreshold: 200
    httpGet:
      path: /health
      port: http
      scheme: HTTP
    initialDelaySeconds: 200
    periodSeconds: 10
  ...
5.10 Restoring services that do not support offline backup and restore
The following list shows the services that don't support offline backup and restore. If any of these services are installed in your IBM Software Hub deployment, do the appropriate steps to make them functional after a restore.
- Data Gate
- Data Gate synchronizes Db2 for z/OS data in real time. After IBM Software Hub is restored, data might be out of sync from Db2 for z/OS. It is recommended that you re-add tables after IBM Software Hub foundational services are restored.
- MongoDB
- The service must be deleted and reinstalled. Recreate the instance as a new instance, and then restore the data with MongoDB tools. For more information, see Installing the MongoDB service and Back Up and Restore with MongoDB Tools.
- Watson Discovery
-
The service must be uninstalled, reinstalled, then the data restored.
- For more information about how to uninstall Watson Discovery, see Uninstalling Watson Discovery.
- For more information about how to reinstall Watson Discovery, see Installing Watson Discovery.
- For more information about how to restore the data, see Backing up and restoring data in IBM Software Hub.
- Watson Speech services
- The service is functional and you can re-import data. For more information, see Importing and exporting data.