Known issues for watsonx Orchestrate

The following known issues and limitations apply to watsonx Orchestrate.


watsonx Orchestrate Kafka is not in the ready state during upgrade

Applies to: 5.3.0

Problem
The watsonx Orchestrate Kafka resource does not enter the verifying state during upgrade.
Solution
Delete the Kafka resource so the system automatically re-creates it.
  1. Delete the watsonx Orchestrate Kafka resource.
    oc delete kafka wo-watson-orchestrate-kafkaibm -n ${PROJECT_CPD_INST_OPERANDS}
  2. Verify that the re-created watsonx Orchestrate Kafka resource is in ready state.
    oc get kafka wo-watson-orchestrate-kafkaibm -n ${PROJECT_CPD_INST_OPERANDS}
    Expected output:
    NAME                             DESIRED KAFKA REPLICAS   DESIRED ZK REPLICAS   READY   METADATA STATE   WARNINGS
    wo-watson-orchestrate-kafkaibm   3                        3                     True    ZooKeeper        True
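If you want to script the check in step 2, one option is to parse the READY column of the oc get kafka output. The helper below is a hedged sketch: it runs against a captured copy of the expected output shown above, and on a live cluster you would feed it the actual command output instead.

```shell
# Hedged sketch: extract the READY column from `oc get kafka` output.
# The sample text mirrors the expected output in step 2; on a live cluster,
# capture the real output with:
#   out=$(oc get kafka wo-watson-orchestrate-kafkaibm -n ${PROJECT_CPD_INST_OPERANDS})
kafka_ready_column() {
  # Skip the header row and print the 4th column (READY) of the data row
  echo "$1" | awk 'NR > 1 { print $4 }'
}

sample_output='NAME                             DESIRED KAFKA REPLICAS   DESIRED ZK REPLICAS   READY   METADATA STATE   WARNINGS
wo-watson-orchestrate-kafkaibm   3                        3                     True    ZooKeeper        True'

ready=$(kafka_ready_column "$sample_output")
echo "READY=$ready"
```

A wrapper script can retry this check until the column reads True.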

watsonx Orchestrate bootstrap job fails to complete during upgrade or installation

Applies to: 5.3.0

Problem
During upgrade or installation, the watsonx Orchestrate bootstrap job might fail to complete because the wo-skill-sequencing pods run out of memory.
Solution
Increase the memory limit for the wo-skill-sequencing pods by applying an RSI patch, then restart the bootstrap job and validate the upgrade.

Ensure that the following prerequisites are met:

  • You are logged in to the OpenShift® cluster.
  • oc CLI and cpd-cli are installed and configured.
  • PROJECT_CPD_INST_OPERANDS is set to the watsonx Orchestrate operand namespace.
    export PROJECT_CPD_INST_OPERANDS=<cpd-instance-namespace>
  1. Create the RSI working directory.
    mkdir -p cpd-cli-workspace/olm-utils-workspace/work/rsi
  2. Create a patch file named skill-seq.json in the RSI directory with the following content.
    [
    {
    "op": "replace",
    "path": "/spec/containers/0/resources/limits/memory",
    "value": "3Gi"
    }
    ]
  3. Apply the RSI patch.
    cpd-cli manage create-rsi-patch \
    --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
    --patch_name=skill-seq-resource-limit \
    --patch_type=rsi_pod_spec \
    --patch_spec=cpd-cli-workspace/olm-utils-workspace/work/rsi/skill-seq.json \
    --spec_format=json \
    --include_labels=wo.watsonx.ibm.com/component:wo-skill-sequencing \
    --state=active
  4. Restart the bootstrap job.
    oc delete job wo-watson-orchestrate-bootstrap-job -n ${PROJECT_CPD_INST_OPERANDS}
  5. Validate the upgrade.
    cpd-cli manage get-cr-status \
    --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
    --components=watsonx_orchestrate

After the fix is applied, new wo-skill-sequencing pods start with a 3Gi memory limit, the bootstrap job completes successfully, and the watsonx Orchestrate upgrade proceeds normally.
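Before step 3, you can optionally sanity-check the patch file. This sketch writes skill-seq.json into a scratch directory (standing in for the RSI directory from step 1) and validates its structure with python3, which is assumed to be available on the workstation:

```shell
# Write the patch to a scratch directory; the procedure above uses
# cpd-cli-workspace/olm-utils-workspace/work/rsi instead.
RSI_DIR=$(mktemp -d)
cat > "$RSI_DIR/skill-seq.json" <<'EOF'
[
  {
    "op": "replace",
    "path": "/spec/containers/0/resources/limits/memory",
    "value": "3Gi"
  }
]
EOF

# Validate that the file is well-formed JSON and patches the intended field
result=$(python3 - "$RSI_DIR/skill-seq.json" <<'PY'
import json, sys
entry = json.load(open(sys.argv[1]))[0]
assert entry["op"] == "replace"
assert entry["path"].endswith("/limits/memory")
assert entry["value"] == "3Gi"
print("patch OK")
PY
)
echo "$result"
```

A malformed file fails this check before cpd-cli ever sees it, which is cheaper than debugging a rejected RSI patch.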

watsonx Assistant created by using the /assistants/watsonx API with service_instance_url fails to run in a multi-region active deployment cluster

Applies to: 5.3.0

Problem
When you add a watsonx Assistant that you created by using the /assistants/watsonx API and service_instance_url as a collaborator agent, the assistant fails to run.
Solution
  1. Use the following API endpoint to return the fully formatted service_instance_url for your assistant.
    https://<CPD_URL>/orchestrate/mfe_builder/api/v1/builder/assistants/watsonx/wxassistant/listfromwxa?crn_details=<CRN_VALUE>
    Note: This route uses the Builder MFE UI proxy and forwards the request to the builder_ui service.
  2. Use the service_instance_url from the previous step and include it in the request payload when you register the assistant by using the POST /orchestrate/api/v1/assistants/watsonx API.
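The two steps can be sketched as follows. The CPD_URL and CRN_VALUE values below are placeholders; substitute your CPD cluster host name and your assistant's CRN, and note that the final curl call (shown as a comment) assumes you already have a bearer token:

```shell
# Placeholders: substitute your CPD host name and your assistant's CRN
CPD_URL="cpd.example.com"
CRN_VALUE="crn:v1:example"

# Step 1: build the lookup URL that returns the formatted service_instance_url
LOOKUP_URL="https://${CPD_URL}/orchestrate/mfe_builder/api/v1/builder/assistants/watsonx/wxassistant/listfromwxa?crn_details=${CRN_VALUE}"
echo "$LOOKUP_URL"

# Step 2 (illustrative; requires a live cluster and a bearer token):
#   curl -sk -H "Authorization: Bearer ${TOKEN}" "$LOOKUP_URL"
# then include the returned service_instance_url in the payload of
#   POST /orchestrate/api/v1/assistants/watsonx
```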

UUIDs of domain agents and tools are not the same in two clusters

Applies to: 5.3.0

Problem
In a multi-region active deployment, the UUIDs of domain agents and tools that an Admin user sees in the Catalog (for example, in the browser Network tab) in cluster-A differ from the UUIDs in cluster-B.

Restoring the Milvus S3 backup requires a manual job after the cluster restore and stabilization

Applies to: 5.3.0

Problem
After the cluster restore and stabilization, you must restore the Milvus S3 bucket backups manually.

A deleted assistant cannot be re-created in Assistant Builder in a multi-region active deployment cluster

Applies to: 5.3.0

Problem
As a limitation, you cannot re-create a deleted assistant in Assistant Builder in a multi-region active deployment cluster.

Deleted instances cannot be re-created in a multi-region active deployment cluster

Applies to: 5.3.0

Problem
As a limitation, you cannot re-create an instance by using the same title as a previously deleted instance.

The UUID of an agent changes when you delete and re-import the same agent through the ADK

Applies to: 5.3.0

Problem
In a multi-region active deployment, if you, as an Admin, delete an imported agent in Agent builder and then import the same agent again into the same cluster through the ADK, the agent has a different UUID.

Search in the catalog does not work

Applies to: 5.3.0

Problem
When you search by keyword in the catalog, you do not get any search results. The search result page shows No results found.

Not applicable (N/A) message is not displayed for a specific image file

Applies to: 5.3.0

Problem
For a specific PNG file, you might see the message No key-value pairs were extracted from the document instead of the N/A message in the field.

Downtime when upgrading from 5.1.x or 5.2.x to 5.3.0

Applies to: 5.3.0

Problem
When you upgrade from 5.1.x or 5.2.x to 5.3, you might see a brief downtime of around 3–4 minutes at the beginning of the upgrade.

Jira Domain agent tools are not working

Applies to: 5.3.0

Problem
In the Catalog, when you find the Jira App, create a template, and run tools such as Get all Jira projects list, the Jira Domain agent tools do not work.
Solution
Pass values for all optional parameters to avoid null default values.

Audit event logging is configured with CPD and collected in pod logs, which is not compliant with FISMA

Applies to: 5.3.0

Problem
Audit event logging is configured with CPD and collected in pod logs, which is not compliant with FISMA. These events contain critical user and system actions, such as agent creation, tool updates, knowledge uploads, and behavioral changes. This gap limits accountability, troubleshooting, compliance tracking, and operational monitoring on CPD.
Solution
Configure audit logging on CPD. By default, it is disabled. To enable audit logging, run the following command:
oc patch wo wo --type=merge -p '{"spec": {"audit": {"enabled": true}}}' -n ${PROJECT_CPD_INST_OPERANDS}
Then wait for the wo custom resource to be ready:
oc get wo wo -n ${PROJECT_CPD_INST_OPERANDS}
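If you script this wait, a small retry helper can poll until the wo custom resource is ready. This is a generic, hedged sketch: check_ready is a stub standing in for a real readiness test against the oc get wo output, because the exact status fields depend on your cluster:

```shell
# Generic retry helper: run a check command until it succeeds or attempts
# run out. check_ready below is a STUB; on a live cluster, replace it with
# a real test against `oc get wo wo -n ${PROJECT_CPD_INST_OPERANDS}`.
wait_until() {
  attempts=$1; shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then return 0; fi
    i=$((i + 1))
    sleep 1   # in practice use a longer interval, such as 30s
  done
  return 1
}

check_ready() { true; }   # stub standing in for the real readiness test

if wait_until 5 check_ready; then status=ready; else status=timeout; fi
echo "wo CR status: $status"
```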

Unable to save Voice configuration if TTS has one voice

Applies to: 5.3.0

Problem
When you use the Speech Services on a CPD cluster, the Voice Config UI does not provide an option to finish and save the integration. This happens only when the Speech Service includes only one voice.

Error during mirroring images

Applies to: 5.3.0

Problem
You get an error while mirroring images with the IMAGE_GROUPS for the specific foundation models that watsonx Orchestrate requires.
Solution
Use watsonx_ai_ifm to mirror the required models.

Intermittent Nginx upstream timeout errors when processing API requests

Applies to: 5.3.0

Problem
You might get intermittent Nginx upstream timeout errors when you process API requests. After you add the global transaction ID to the uiproxy logs, you might not find an entry that is logged against the failed transaction in uiproxy or in the Archer server.

Unable to enter content in all text fields in the Salesforce connection window

Applies to: 5.3.0

Problem
In the Manage skills app screen, click the Add skills arrow > Configure prebuilt skills > the Salesforce tile > Connect App. You cannot enter text in any field.
Solution
Click outside the text fields, then click a text field to give it focus; you can then enter text in the fields.

Complete error message is not displayed after the assistant is deleted

Applies to: 5.3.0

Problem
When you register an assistant, open the chat page, delete the assistant in another tab, and then continue in the same chat thread, the complete error message is not displayed.

Issue when adding an LDAP user to a watsonx Orchestrate instance

Applies to: 5.3.0

Problem
On the MyInstances screen, when you add an LDAP user to a watsonx Orchestrate instance, the status displays the error [POST /grant/][500] addUserToServiceInstanceInternalServerError &{MessageCode: StatusCode:0 Exception: Message:}.
Solution
Add the user a second time; the second attempt succeeds.

UAB is not provisioning

Applies to: 5.3.0

Problem
As an admin, when you log in to a non-agentic tenant, go to Skill studio, and click the Projects and Skills tab, UAB is not provisioned and the Skill studio tab does not open.

Error when starting gen AI component in chat

Applies to: 5.3.0

Problem
After you create a project with a gen AI component, add a skill, and start the component in chat, you see a 400 error.

Domain agents and tools are unavailable in the 5.2.2 cluster

Applies to: 5.3.0

Problem
When an admin searches for a domain agent from the Discover page, the results do not include all domain agents and tools. However, when the admin selects Agents from the left navigation page, the full list of agents is displayed.
Solution
Increase the number of skill-server pods.

Custom or imported skills show digression status and do not return responses

Applies to: 5.3.0

Problem
When an admin logs in to a nonagentic tenant and runs custom or imported skills from the chat screen, the skills do not return any response.

Error occurs when clicking the Generate button in the LLM model of a gen AI project

Applies to: 5.3.0

Problem
In a UAB on-premises environment, when you create a gen AI project, add variables, choose an LLM model, specify a token count, and then click the Generate button, you see the error Wrong machine learning URL path POST.

The deploy-knative-eventing command fails with the error: no matching resources found

Applies to: 5.3.0

Problem
The cpd-cli manage deploy-knative-eventing command fails with the error no matching resources found after the message deployment.apps/kafka-controller condition met. This issue occurs because no pods with the label app=kafka-broker-dispatcher are present.
Solution
  1. Exec into the Docker container that runs olm-utils.
    docker exec -it olm-utils-play-v3 bash
  2. Check for the line that must be removed.
    cat /opt/ansible/bin/deploy-knative-eventing | grep kafka-broker-dispatcher
    Output:
    oc wait pods -n knative-eventing --selector app=kafka-broker-dispatcher --for condition=Ready --timeout=60s
  3. Remove the line.
    sed -i '/kafka-broker-dispatcher/d' /opt/ansible/bin/deploy-knative-eventing
  4. Verify whether the line is removed.
    cat /opt/ansible/bin/deploy-knative-eventing | grep kafka-broker-dispatcher
    The command returns no output, which confirms that the line is removed.
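If you want to preview the effect of steps 3 and 4 before editing the real playbook wrapper inside the container, you can rehearse them on a scratch file. The file content below is a minimal stand-in, not the real deploy-knative-eventing script:

```shell
# Rehearse the edit on a scratch file; this content is a minimal stand-in
# for the real /opt/ansible/bin/deploy-knative-eventing script.
workfile=$(mktemp)
cat > "$workfile" <<'EOF'
oc wait pods -n knative-eventing --selector app=kafka-controller --for condition=Ready --timeout=60s
oc wait pods -n knative-eventing --selector app=kafka-broker-dispatcher --for condition=Ready --timeout=60s
EOF

# Step 3: delete every line that mentions kafka-broker-dispatcher
sed -i '/kafka-broker-dispatcher/d' "$workfile"

# Step 4: confirm no matches remain (grep -c prints 0 when nothing matches)
matches=$(grep -c kafka-broker-dispatcher "$workfile" || true)
echo "remaining matches: $matches"
rm -f "$workfile"
```

The unrelated kafka-controller line survives the edit, which is the behavior you want from the sed expression.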

Upgrade is stuck at UAB component during upgrade from 5.2.0 to 5.2.1

Applies to: 5.3.0

Problem
When you upgrade from 5.2.0 to 5.2.1, the upgrade gets stuck at the UAB component.
Solution
Restart the ADS run-service pod.

OOTB apps fail to connect in air-gap cluster

Applies to: 5.3.0

Problem
In a non-agentic tenant, when you go to the Skill Catalog, open any OOTB app, and try to connect the application with credentials, the connection fails. You cannot connect OOTB apps in an air-gapped cluster.

Actions Quick start with templates is visible in agentic tenant

Applies to: 5.3.0

Problem
The Actions Quick start with templates is visible in the agentic tenant UI, where it should not appear.

Backup and restore utility is not included for Milvus

Applies to: 5.3.0

Problem
The watsonx Orchestrate S3 bucket data for Milvus is not included in the backup and restore utility. You must take the backup manually and restore it after the instance restore is completed by using a Kubernetes job.
Solution
To resolve the problem, run the following script before you take the backup:
oc project ${PROJECT_CPD_INST_OPERANDS}
cat <<EOF | kubectl apply -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: wo-watson-orchestrate-backup-milvus-s3-backup
  namespace: ${PROJECT_CPD_INST_OPERANDS}
  annotations:
    name: wo-watson-orchestrate-backup-milvus-s3-backup
    namespace: ${PROJECT_CPD_INST_OPERANDS}
  labels:
    app.kubernetes.io/component: backup
    app.kubernetes.io/instance: wo
    app.kubernetes.io/managed-by: ibm-watson-orchestrate-operator
    app.kubernetes.io/name: watson-orchestrate
    icpdsupport/addOnId: orchestrate
    icpdsupport/app: backup
    icpdsupport/module: backup-orchestrate
    icpdsupport/podSelector: backup
    wo.watsonx.ibm.com/application: watson-orchestrate
    wo.watsonx.ibm.com/component: backup
    wo.watsonx.ibm.com/cr-name: wo
    wo.watsonx.ibm.com/external-access: "true"
    wo.watsonx.ibm.com/operand-version: 6.0.0
spec:
  backoffLimit: 0
  template:
    metadata:
      annotations:
        productName: IBM watsonx Orchestrate
        productVersion: 5.2.0
      labels:
        app.kubernetes.io/component: backup
        app.kubernetes.io/instance: wo
        app.kubernetes.io/managed-by: ibm-watson-orchestrate-operator
        app.kubernetes.io/name: watson-orchestrate
        icpdsupport/addOnId: orchestrate
        icpdsupport/app: backup
        icpdsupport/module: backup-orchestrate
        icpdsupport/podSelector: backup
        wo.watsonx.ibm.com/application: watson-orchestrate
        wo.watsonx.ibm.com/component: backup
        wo.watsonx.ibm.com/cr-name: wo
        wo.watsonx.ibm.com/external-access: "true"
        wo.watsonx.ibm.com/operand-version: 6.0.0
    spec:
      serviceAccountName: wo-watson-orchestrate-backup-restore
      restartPolicy: Never
      containers:
        - name: backup
          image: cp.icr.io/cp/watsonx-orchestrate/ibm-watsonx-orchestrate-onprem-utils@sha256:f2ca697cdcea2f349f9b0304a3b28f19f5d3f917b57b9076bbae43052a8a9c20
          imagePullPolicy: Always
          command:
            - ./milvus-s3-br.sh
            - backup
          env:
            - name: JOB_NAME
              value: wo-watson-orchestrate-backup-milvus-s3-backup
            - name: JOB_NAMESPACE
              value: ${PROJECT_CPD_INST_OPERANDS}
          resources: {}
          volumeMounts:
            - name: milvus-secret
              mountPath: /secrets/wo-milvus-storage-bucket
            - name: milvus-configmap
              mountPath: /configmaps/wo-milvus-storage-bucket
            - name: s3-cert
              mountPath: /secrets/s3-cert
            - name: s3-backup-pvc
              mountPath: /tmp/s3-backup
      volumes:
        - name: milvus-secret
          secret:
            secretName: wo-milvus-storage-bucket
        - name: milvus-configmap
          configMap:
            name: wo-milvus-storage-bucket
        - name: s3-cert
          secret:
            secretName: noobaa-cert-watsonx-orchestrate
        - name: s3-backup-pvc
          persistentVolumeClaim:
            claimName: wo-watson-orchestrate-backup-s3
EOF

Run the following script after you restore:

oc project ${PROJECT_CPD_INST_OPERANDS}
cat <<EOF | kubectl apply -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: wo-watson-orchestrate-backup-milvus-s3-restore
  namespace: ${PROJECT_CPD_INST_OPERANDS}
  annotations:
    name: wo-watson-orchestrate-backup-milvus-s3-restore
    namespace: ${PROJECT_CPD_INST_OPERANDS}
  labels:
    app.kubernetes.io/component: backup
    app.kubernetes.io/instance: wo
    app.kubernetes.io/managed-by: ibm-watson-orchestrate-operator
    app.kubernetes.io/name: watson-orchestrate
    icpdsupport/addOnId: orchestrate
    icpdsupport/app: backup
    icpdsupport/module: backup-orchestrate
    icpdsupport/podSelector: backup
    wo.watsonx.ibm.com/application: watson-orchestrate
    wo.watsonx.ibm.com/component: backup
    wo.watsonx.ibm.com/cr-name: wo
    wo.watsonx.ibm.com/external-access: "true"
    wo.watsonx.ibm.com/operand-version: 6.0.0
spec:
  backoffLimit: 0
  template:
    metadata:
      annotations:
        productName: IBM watsonx Orchestrate
        productVersion: 5.2.0
      labels:
        app.kubernetes.io/component: backup
        app.kubernetes.io/instance: wo
        app.kubernetes.io/managed-by: ibm-watson-orchestrate-operator
        app.kubernetes.io/name: watson-orchestrate
        icpdsupport/addOnId: orchestrate
        icpdsupport/app: backup
        icpdsupport/module: backup-orchestrate
        icpdsupport/podSelector: backup
        wo.watsonx.ibm.com/application: watson-orchestrate
        wo.watsonx.ibm.com/component: backup
        wo.watsonx.ibm.com/cr-name: wo
        wo.watsonx.ibm.com/external-access: "true"
        wo.watsonx.ibm.com/operand-version: 6.0.0
    spec:
      serviceAccountName: wo-watson-orchestrate-backup-restore
      restartPolicy: Never
      containers:
        - name: restore
          image: cp.icr.io/cp/watsonx-orchestrate/ibm-watsonx-orchestrate-onprem-utils@sha256:f2ca697cdcea2f349f9b0304a3b28f19f5d3f917b57b9076bbae43052a8a9c20
          imagePullPolicy: Always
          command:
            - ./milvus-s3-br.sh
            - restore
          env:
            - name: JOB_NAME
              value: wo-watson-orchestrate-backup-milvus-s3-restore
            - name: JOB_NAMESPACE
              value: ${PROJECT_CPD_INST_OPERANDS}
          resources: {}
          volumeMounts:
            - name: milvus-secret
              mountPath: /secrets/wo-milvus-storage-bucket
            - name: milvus-configmap
              mountPath: /configmaps/wo-milvus-storage-bucket
            - name: s3-cert
              mountPath: /secrets/s3-cert
            - name: s3-backup-pvc
              mountPath: /tmp/s3-backup
      volumes:
        - name: milvus-secret
          secret:
            secretName: wo-milvus-storage-bucket
        - name: milvus-configmap
          configMap:
            name: wo-milvus-storage-bucket
        - name: s3-cert
          secret:
            secretName: noobaa-cert-watsonx-orchestrate
        - name: s3-backup-pvc
          persistentVolumeClaim:
            claimName: wo-watson-orchestrate-backup-s3

EOF
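After you create either job, you can confirm completion by reading .status.succeeded, a standard Kubernetes Job status field. On a live cluster you would fetch the JSON with oc get job <job-name> -o json; the sketch below parses a minimal sample document instead:

```shell
# Minimal sample standing in for the live Job JSON; on a cluster you would run:
#   oc get job wo-watson-orchestrate-backup-milvus-s3-backup \
#     -n ${PROJECT_CPD_INST_OPERANDS} -o json
sample_job='{"kind":"Job","status":{"succeeded":1}}'

# .status.succeeded counts the pods that completed successfully
succeeded=$(echo "$sample_job" | python3 -c 'import json,sys; print(json.load(sys.stdin)["status"].get("succeeded", 0))')
echo "succeeded pods: $succeeded"
```

With backoffLimit set to 0 in the manifests above, a failed job does not retry, so a succeeded count of 0 means you should inspect the job's pod logs.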

Document extractor and document classifier not available in document processing

Applies to: 5.3.0

Problem
When creating a workflow, if you select Document processor and then click Create a document extractor, a blank screen is displayed.

Published Assistants are not listed in Agent configuration

Applies to: 5.3.0

Problem
When you install watsonx Orchestrate on a cluster without GPUs and add your published assistant, which is built by using Assistant Builder, to Agent configuration, the published assistant is not listed.

Reset test run in Evaluate response settings fails with 500 error

Applies to: 5.3.0

Problem
When you reset and rerun the test from the Evaluate response settings page of AI assistant builder, it fails with a 500 Internal Server Error, and the UI hangs.
Solution
If the page hangs after the error, go to a different page and then return to the same page, or refresh the page.

The Preview page is not available when the watsonx Assistant is created by using the API

Applies to: 5.3.0

Problem
The Preview page is not accessible when watsonx Assistant is created by using the API.

Workflow skills response is not displaying in the chat UI after upgrade to Version 5.1.1

Applies to: 5.3.0

Problem
After you upgrade to Version 5.1.1 from Version 5.1.0, the workflow skills continue to run in Version 5.1.0, but the response is no longer displayed.
Solution
  1. Run the following command to export the namespace.
    export PROJECT_CPD_INST_OPERANDS=<namespace where watsonx Orchestrate is running>
  2. Restart the watsonx Orchestrate pods.
    oc rollout restart deployment/wo-digital-employee-server-deployment -n $PROJECT_CPD_INST_OPERANDS
    oc rollout restart deployment/wo-digital-employee-client-deployment -n $PROJECT_CPD_INST_OPERANDS
    oc rollout restart deployment/wo-openapi-provider-server -n $PROJECT_CPD_INST_OPERANDS
  3. After you restart the pods, allow around one to two hours for the system to stabilize and start functioning properly.
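The three restarts in step 2 can be driven from a loop. This sketch only assembles and prints the commands as a dry run, so you can inspect them before running anything against the cluster:

```shell
# Assemble the restart commands from step 2 without executing them
deployments="wo-digital-employee-server-deployment wo-digital-employee-client-deployment wo-openapi-provider-server"
cmds=""
for d in $deployments; do
  cmds="${cmds}oc rollout restart deployment/${d} -n \$PROJECT_CPD_INST_OPERANDS
"
done
printf '%s' "$cmds"
```

To execute for real, replace the printf with eval over each line, or simply paste the printed commands into your shell.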

The document attribute of the composite data type variable is not displayed

Applies to: 5.3.0

Problem
When you create a composite data type variable that includes a document attribute and use this variable in a workflow to define input, the file attachment icon might not be displayed in the AI chat UI after you publish and start the workflow.
Cause
This issue might occur in the AI chat when the composite data type is not recognized in the workflow.

Unable to see the utility bill extractor as a skill in chat output

Applies to: 5.3.0

Problem
When you create a workflow by adding the Utility bill extractor skill from the Skill Catalog, then start the skill in watsonx Orchestrate chat and complete the task activity, the output is not displayed in the chat and the conversation goes into digression.

Salesforce Account Engagement skill execution does not work

Applies to: 5.3.0

Problem
watsonx Orchestrate cannot start Salesforce Account Engagement app skills.

Red Hat® OpenShift Horizontal Pod Autoscaler (HPA) is not enabled for ZenServices

Applies to: 5.3.0

Problem
HPA is not enabled for ZenServices.
Cause
ZenServices is not able to recognize the server to enable HPA.
Solution
The administrator can enable autoscaling for ZenServices. After autoscaling is enabled, the output of oc get hpa looks similar to the following example:
NAME                                            REFERENCE                                                         TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
zen-lite-ibm-nginx-hpa                          Deployment/ibm-nginx                                              5%/280%   2         7         2          9h
zen-lite-usermgmt-hpa                           Deployment/usermgmt                                               0%/280%   2         7         2          9h
zen-lite-zen-audit-hpa                          Deployment/zen-audit                                              5%/298%   1         3         1          9h
zen-lite-zen-core-api-hpa                       Deployment/zen-core-api                                           2%/168%   2         7         2          9h
zen-lite-zen-core-hpa                           Deployment/zen-core                                               2%/420%   2         7         2          9h

Adding skills before synchronization results in an error

Applies to: 5.3.0

Problem
When you add skills before synchronization, you might see the following error:
Failed to complete this operation due to the following errors. Failed to upskill skill because
{'status': 'failed', 'status_code':500, 'message':'"next_action"', 'detailed_message': None, 'other_details': None, 'is_available': True,
'skill_set_orch':[]}. Skill has not bootstrapped.
Cause
Synchronization of applications and skills takes up to five minutes to complete after the service instance creation.
Solution
Wait up to five minutes for the synchronization of applications and skills to complete.

Unable to add input to the connection fields of Salesforce

Applies to: 5.3.0

Problem
You are unable to add input to the connection fields of Salesforce because the focus does not stay on the input fields.
Cause
This is a known user interface issue.
Solution
Click the border of an input field to keep the focus on the field.

OAuth 2.0 web authentication is not supported for new skills

Applies to: 5.3.0

Problem
OAuth 2.0 web authentication is not supported when you are adding new skills.
Cause
This issue is caused by browser redirection.
Solution
Use a different authentication method.