Restore with Portworx DR fails with operand request and subscription timeout warnings
Restoring a Cloud Pak for Data deployment with Portworx Disaster Recovery times out during operand request and subscription installation.
Symptoms
After you run the post-restore script, the log file contains warnings like the following examples:
<time> Time: <timestamp> level=info - OperandRequest: zen-ca-operand-request - phase: Installing
<time> Time: <timestamp> level=warning - Create OperandRequest Timeout Warning
<time> Time: <timestamp> level=info - OperandRequest: ibm-iam-service - phase: Installing
<time> Time: <timestamp> level=warning - Create OperandRequest Timeout Warning
<time> Time: <timestamp> level=info - OperandRequest: ibm-iam-request - phase: Installing
<time> Time: <timestamp> level=warning - Create OperandRequest Timeout Warning
<time> Time: <timestamp> level=info - OperandRequest: zen-service - phase: Installing
<time> Time: <timestamp> level=warning - Create OperandRequest Timeout Warning
<time> Time: <timestamp> level=info - Subscription: manta-adl-operator - currentCSV: null - installedCSV: null
<time> Time: <timestamp> level=warning - Create Subscription Timeout Warning
<time> Time: <timestamp> level=info - Subscription: rabbitmq-operator - currentCSV: null - installedCSV: null
<time> Time: <timestamp> level=warning - Create Subscription Timeout Warning
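When the log is long, it can help to list only the resources that hit a timeout. The following is a minimal sketch, assuming the log lines follow the format shown above, where each warning is preceded by an info line that names the resource. The function name `timed_out_resources` and the log path are hypothetical and not part of the product.

```shell
# Sketch: print the names of resources whose creation timed out.
# Usage: timed_out_resources <kind> <log-file>
#   <kind> is OperandRequest or Subscription, as it appears in the log lines.
timed_out_resources() {
  awk -v kind="$1" '
    # Remember the resource name from the most recent info line for this kind.
    index($0, "level=info - " kind ": ") {
      name = $0
      sub(".*" kind ": ", "", name)   # drop everything up to and including "<kind>: "
      sub(/ .*/, "", name)            # keep only the first word, which is the name
    }
    # When the matching timeout warning follows, print the remembered name once.
    index($0, "Create " kind " Timeout Warning") {
      if (name != "") print name
      name = ""
    }
  ' "$2"
}
```

For example, `timed_out_resources Subscription /path/to/post-restore.log` prints one subscription name per line, where the path is a placeholder for your log file.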
Diagnosing the problem
Do the following steps:
- Download the logs from the Operator Lifecycle Manager (OLM) pod:
oc logs -n openshift-operator-lifecycle-manager $(oc get po -n openshift-operator-lifecycle-manager -l app=olm-operator -o NAME)
- Verify that the subscription was created:
oc get subscriptions.operators.coreos.com ibm-cpd-rstudio-operator-catalog-subscription -o json | jq '.status.currentCSV'
- If the previous step does not return a result, check whether a ClusterServiceVersion (CSV) that is related to that subscription was created by running the following command:
oc get csv -n ${PROJECT_CPD_INST_OPERATORS} | grep rstudio
If the command in the last step returns a CSV name, the problem is confirmed.
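The two checks above can be combined into one helper. This is a sketch only: the function name `check_stale_csv` is hypothetical, it assumes that the subscription lives in the same project as the CSV, and it reads the field with the `-o jsonpath` output option of `oc get` instead of `jq`, so an unset `currentCSV` shows up as an empty string.

```shell
# Sketch: confirm the condition described above for one subscription.
# Usage: check_stale_csv <subscription-name> <operators-project> <csv-name-pattern>
# Assumption: the subscription and the CSV are in the same project.
check_stale_csv() {
  sub_name="$1"
  ns="$2"
  csv_pattern="$3"

  # Empty when the subscription has no status.currentCSV.
  current_csv=$(oc get subscriptions.operators.coreos.com "$sub_name" -n "$ns" \
    -o jsonpath='{.status.currentCSV}' || true)

  # First CSV in the project whose name matches the pattern, if any.
  csv_name=$(oc get csv -n "$ns" -o name | grep "$csv_pattern" | head -n 1)

  if [ -z "$current_csv" ] && [ -n "$csv_name" ]; then
    echo "confirmed: $sub_name has no currentCSV but $csv_name exists"
    return 0
  fi
  echo "not confirmed: currentCSV='$current_csv' csv='$csv_name'"
  return 1
}
```

For the example in this topic, the call would be `check_stale_csv ibm-cpd-rstudio-operator-catalog-subscription ${PROJECT_CPD_INST_OPERATORS} rstudio`.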
Resolving the problem
Do the following steps:
- Manually recreate the CSV. For example:
oc get subscriptions.operators.coreos.com ibm-cpd-rstudio-operator-catalog-subscription -o json | jq 'del(.status) | del(.metadata.resourceVersion) | del(.metadata.creationTimestamp) | del(.metadata.uid) | del(.metadata.annotations["kubectl.kubernetes.io/last-applied-configuration"]) | del(.metadata.generation)' > rstudio-operator-catalog-sub.json
oc get csv ibm-cpd-rstudio.v9.1.0 -o json > csv-ibm-cpd-rstudio.v9.1.0.json
oc delete subscriptions.operators.coreos.com ibm-cpd-rstudio-operator-catalog-subscription
oc delete csv ibm-cpd-rstudio.v9.1.0
oc apply -f rstudio-operator-catalog-sub.json
- Without cleaning up the restored Cloud Pak for Data deployment, rerun the post-restore script:
CPDBR_TENANT_SVC_POD=`oc get po -n ${PROJECT_CPD_INST_OPERATORS} | grep cpdbr-tenant-service- | grep "Running" | awk '{ print $1 }'`
echo "cpdbr-tenant-service pod name=$CPDBR_TENANT_SVC_POD"
oc exec -it -n ${PROJECT_CPD_INST_OPERATORS} ${CPDBR_TENANT_SVC_POD} -- \
bash -c "/cpdbr-scripts/cpdbr/cpdbr-post-restore.sh ${PROJECT_CPD_INST_OPERATORS} 30m"
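Between re-creating the subscription and rerunning the post-restore script, it may help to confirm that the subscription now reports a `currentCSV`. The following polling helper is a sketch, not part of the documented procedure: the function name `wait_for_current_csv`, the default of 30 attempts, and the 10-second delay are all assumptions that you can adjust.

```shell
# Sketch: poll a subscription until status.currentCSV is set.
# Usage: wait_for_current_csv <subscription-name> <project> [attempts] [delay-seconds]
# The defaults (30 attempts, 10 seconds apart) are illustrative assumptions.
wait_for_current_csv() {
  sub_name="$1"
  ns="$2"
  attempts="${3:-30}"
  delay="${4:-10}"

  i=0
  while [ "$i" -lt "$attempts" ]; do
    # jsonpath prints an empty string while the field is still unset.
    current_csv=$(oc get subscriptions.operators.coreos.com "$sub_name" -n "$ns" \
      -o jsonpath='{.status.currentCSV}' 2>/dev/null || true)
    if [ -n "$current_csv" ]; then
      echo "$current_csv"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done

  echo "timed out waiting for currentCSV on $sub_name" >&2
  return 1
}
```

For example, `wait_for_current_csv ibm-cpd-rstudio-operator-catalog-subscription ${PROJECT_CPD_INST_OPERATORS}` prints the CSV name as soon as the subscription reports one, and returns a nonzero status if it never does.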