Upgrade troubleshooting
When you upgrade watsonx Orchestrate, you might encounter issues. The following troubleshooting topics can help you proceed.
Kafka issue
- Symptom
- Kafka resource fails verification checks and does not become ready.
- Root Cause
- The Kafka resource might be in a corrupted or inconsistent state, preventing proper initialization.
- Solution
- Delete and re-create Kafka resource
- Remove the existing Kafka resource to allow it to be re-created:
oc delete kafka wo-watson-orchestrate-kafkaibm
- Wait for automatic re-creation: The Kafka operator automatically re-creates the resource. Monitor the re-creation process:
oc get kafka -w
Wait until the Kafka resource shows as Ready.
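The delete-and-wait steps above can be sketched as a single helper. This is a minimal sketch, not an official procedure: the function name `recreate_kafka`, the optional namespace argument, the timeout, and the polling limits are assumptions made for illustration, and it assumes that the Kafka resource exposes a `Ready` condition that `oc wait` can observe.

```shell
# Sketch only: delete the Kafka resource, wait for the operator to re-create
# it, then block until it reports the Ready condition.
# Assumptions: the default name comes from the delete command above; the
# optional namespace, the timeout, and the polling limits are illustrative.
recreate_kafka() {
  local name="${1:-wo-watson-orchestrate-kafkaibm}"
  local ns="${2:-}"            # empty: use the current project
  local timeout="${3:-900s}"
  local tries=0

  # ${ns:+...} adds "-n <namespace>" only when a namespace was given.
  oc delete kafka "$name" ${ns:+-n "$ns"} || return 1

  # "oc wait" fails right away if the resource does not exist yet, so poll
  # until the operator has re-created it (at most 30 attempts).
  until oc get kafka "$name" ${ns:+-n "$ns"} >/dev/null 2>&1; do
    tries=$((tries + 1))
    [ "$tries" -ge 30 ] && return 1
    sleep "${POLL_SECONDS:-10}"
  done

  # Block until the resource is Ready or the timeout expires.
  oc wait --for=condition=Ready "kafka/$name" ${ns:+-n "$ns"} --timeout="$timeout"
}
```

Calling `recreate_kafka` with no arguments acts on `wo-watson-orchestrate-kafkaibm` in the current project. A nonzero exit status means that the resource was not re-created or did not become Ready within the timeout.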
Role-binding conflicts during upgrade
- Symptom
- The upgrade process becomes stuck at a specific component (wxdengine wo-milvus or the UAB component).
- Root Cause
- Existing role bindings or roles conflict with the upgrade process, preventing proper component initialization.
- Solution
- Delete the role binding if stuck at wxdengine wo-milvus: If the upgrade is stuck at the wxdengine wo-milvus component, delete the conflicting role binding:
oc delete rolebinding ibm-lakehouse-leader-election-rolebinding -n ${PROJECT_CPD_INST_OPERATORS}
- Delete the role if stuck at UAB component: If the upgrade is stuck at the UAB component, delete the conflicting role:
oc delete role ibm-uab-ads-operator-role -n ${PROJECT_CPD_INST_OPERATORS}
- Monitor upgrade progress: After you delete the conflicting resources, monitor the upgrade to confirm that it proceeds:
oc get pods -n ${PROJECT_CPD_INST_OPERATORS} -w
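The choice between the two deletions above can be sketched as a small helper that maps the stuck component to the conflicting resource. The function name `delete_upgrade_conflict` and the component labels `wo-milvus` and `uab` are hypothetical and exist only for this illustration. The resource names come from the commands above, and `--ignore-not-found` is added so that a repeated run does not fail.

```shell
# Sketch only: delete the role binding or role that blocks the upgrade,
# depending on which component is stuck.
# Assumption: "wo-milvus" and "uab" are labels invented for this helper.
delete_upgrade_conflict() {
  local component="$1"
  local ns="${PROJECT_CPD_INST_OPERATORS:?set PROJECT_CPD_INST_OPERATORS first}"

  case "$component" in
    wo-milvus)
      # Upgrade stuck at the wxdengine wo-milvus component.
      oc delete rolebinding ibm-lakehouse-leader-election-rolebinding \
        -n "$ns" --ignore-not-found
      ;;
    uab)
      # Upgrade stuck at the UAB component.
      oc delete role ibm-uab-ads-operator-role -n "$ns" --ignore-not-found
      ;;
    *)
      echo "unknown component: $component (expected wo-milvus or uab)" >&2
      return 2
      ;;
  esac
}
```

For example, `delete_upgrade_conflict uab` removes only the UAB operator role and leaves the role binding untouched.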
Image pull errors
- Symptom
- Pods fail to start with ImagePullBackOff or ErrImagePull errors.
- Root Cause
- Entitlement key secrets are missing, incorrect, or not properly configured in the required namespaces.
- Solution
- Verify Entitlement Key secrets
- Verify that the entitlement key secret is present in the namespace:
oc get secret ibm-entitlement-key -n ${PROJECT_CPD_INST_OPERATORS}
- Verify secret content: Check that the secret contains the correct entitlement key:
oc get secret ibm-entitlement-key -n ${PROJECT_CPD_INST_OPERATORS} -o yaml
- Re-create the secret if necessary: If the secret is missing or incorrect, re-create it with your valid entitlement key:
oc create secret docker-registry ibm-entitlement-key \
  --docker-server=cp.icr.io \
  --docker-username=cp \
  --docker-password=<your-entitlement-key> \
  -n ${PROJECT_CPD_INST_OPERATORS}
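The `-o yaml` output shows the secret data only in base64-encoded form, so decoding it can make the content check easier. The following is a minimal sketch under stated assumptions: the function names `secret_has_registry` and `check_entitlement_secret` are hypothetical, and it assumes that the secret is of type `kubernetes.io/dockerconfigjson`, which is what `oc create secret docker-registry` produces. It confirms only that an entry for `cp.icr.io` exists, not that the key itself is valid.

```shell
# Sketch only: confirm that the decoded pull secret names the expected registry.

# Reads decoded .dockerconfigjson content on stdin and succeeds if the
# registry appears as a quoted key.
secret_has_registry() {
  local registry="$1"
  grep -qF "\"${registry}\""
}

# Fetches the secret from the given namespace, decodes it, and runs the check.
check_entitlement_secret() {
  local ns="$1"
  oc get secret ibm-entitlement-key -n "$ns" \
    -o jsonpath='{.data.\.dockerconfigjson}' \
    | base64 --decode \
    | secret_has_registry cp.icr.io
}
```

For example, `check_entitlement_secret "${PROJECT_CPD_INST_OPERATORS}" && echo "registry entry found"` prints the message only when the decoded secret contains a `cp.icr.io` entry.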
Upgrade failure on cpd-cli
- Symptom
- The cpd-cli upgrade command fails with constraint satisfaction errors that are related to the events operator or Watson Assistant operator.
- Error message
- 'constraints not satisfiable: bundle ibm-watson-assistant-operator.v5.8.0 requires an operator with package: ibm-elasticsearch-operator and with version in range: >=1.1.1474 2.0.0, subscription ibm-watson-assistant-operator-subscription exists, subscription ibm-watson-assistant-operator-subscription requires @existing/cpd-operators//ibm-watson-assistant-operator.v5.8.0' reason: ConstraintsNotSatisfiable
- Root Cause
- Version conflicts between operator subscriptions prevent the upgrade from proceeding. Different upgrade paths require different cleanup steps.
- Solution
- Check and clean up events operator subscription
- Check the events operator subscription: Verify the current state of the events operator
subscription:
oc get subs ibm-events-operator -n ${PROJECT_CPD_INST_OPERATORS} -o yaml
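When you read a constraint error like the one above, it can help to extract the package that the failing bundle requires and then list the subscriptions and installed operator versions that mention it. The following is a rough sketch: the function names `required_package` and `show_operator_state` are hypothetical, the `sed` pattern assumes the error wording shown in this topic, and the `grep` filter is a plain text match, so a subscription whose name differs from its package name can be missed.

```shell
# Sketch only: pull the required package name out of an OLM constraint error.
# Reads the error text on stdin and prints the word that follows "package: ".
required_package() {
  sed -n 's/.*requires an operator with package: \([^ ]*\) .*/\1/p'
}

# Lists subscriptions and ClusterServiceVersions in the given namespace whose
# output line mentions the package (a text filter, not an exact lookup).
show_operator_state() {
  local package="$1"
  local ns="$2"
  oc get subs,csv -n "$ns" | grep -F "$package"
}
```

For example, piping the error message into `required_package` yields the package name, which can then be passed to `show_operator_state` together with `${PROJECT_CPD_INST_OPERATORS}`.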
Bootstrap job failure
- Symptom
- The bootstrap job fails with a backoff limit error, preventing watsonx Orchestrate from initializing properly.
- Root Cause
- The bootstrap job exceeded its retry limit because of persistent failures and requires manual intervention to reset.
- Solution
- Delete and re-create the bootstrap job
- Check bootstrap job status: Verify the bootstrap job status and check for backoff limit errors:
oc get job wo-watson-orchestrate-bootstrap-job -n ${PROJECT_CPD_INST_OPERANDS}
oc describe job wo-watson-orchestrate-bootstrap-job -n ${PROJECT_CPD_INST_OPERANDS}
- Delete the failed bootstrap job: If the job failed with a backoff limit error, delete it to allow re-creation:
oc delete job wo-watson-orchestrate-bootstrap-job -n ${PROJECT_CPD_INST_OPERANDS}
- Wait for automatic re-creation: The operator automatically re-creates the bootstrap job. Monitor its progress:
oc get job wo-watson-orchestrate-bootstrap-job -n ${PROJECT_CPD_INST_OPERANDS} -w
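The check-then-delete steps above can be combined so that the job is deleted only when it has actually hit its backoff limit. This is a minimal sketch: the function name `reset_bootstrap_job` is hypothetical, and it assumes that the failed job carries a `Failed` condition with the reason `BackoffLimitExceeded`, which is how Kubernetes normally reports an exhausted retry limit.

```shell
# Sketch only: delete the bootstrap job if, and only if, it failed because
# its backoff limit was exceeded.
reset_bootstrap_job() {
  local job="wo-watson-orchestrate-bootstrap-job"
  local ns="${PROJECT_CPD_INST_OPERANDS:?set PROJECT_CPD_INST_OPERANDS first}"
  local reason

  # Read the reason of the job's Failed condition; empty if there is none.
  reason="$(oc get job "$job" -n "$ns" \
    -o jsonpath='{.status.conditions[?(@.type=="Failed")].reason}')"

  if [ "$reason" = "BackoffLimitExceeded" ]; then
    # The operator is expected to re-create the job after deletion.
    oc delete job "$job" -n "$ns"
  else
    echo "job $job has not exceeded its backoff limit (reason: ${reason:-none}); leaving it in place"
  fi
}
```

After the function deletes the job, the `oc get job ... -w` command from the last step can be used to follow the re-created job.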