Known issues and limitations for watsonx Assistant

The following known issues and limitations apply to watsonx Assistant.

Kafka upgrade failure in Data Governor during 5.2.x to 5.4.0 upgrade

Applies to: 5.4.0

Problem
During the watsonx Assistant upgrade from Version 5.2.x to 5.4.0, Kafka attempts to upgrade from 3.8.0 to 4.2.0 but fails because the metadata remains at version 3.8-IV0. As a result, the Kafka broker and controller connections fail and the active controller cannot be identified, which leaves the upgrade in an inconsistent and incomplete state.
Solution
  1. Restart the Data Governor Kafka pods to force reconciliation.
    oc delete pods -l ibmevents.ibm.com/cluster=$INSTANCE-data-governor-kafka -n $NAMESPACE
    Where $INSTANCE is the watsonx Assistant instance name (for example, wa) and $NAMESPACE is the project where it runs (for example, cpd).
  2. Wait for the pods to restart. After the restart, the Kafka cluster completes the upgrade process and returns to a Ready state.
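The two steps above can be combined into one helper. The following is a minimal sketch, not an official procedure: restart_dg_kafka is a hypothetical function name, and it assumes that the Kafka CR is named $INSTANCE-data-governor-kafka (as the label selector suggests) and that it reports a Ready condition. The 10-minute timeout is an arbitrary choice.

```shell
# Sketch: restart the Data Governor Kafka pods, then wait for the Kafka CR.
# restart_dg_kafka is a hypothetical helper name, not part of any IBM tooling.
restart_dg_kafka() {
  instance="${1:?usage: restart_dg_kafka <instance> <namespace>}"
  namespace="${2:?usage: restart_dg_kafka <instance> <namespace>}"
  cluster="${instance}-data-governor-kafka"

  # Same command as step 1, with the variables quoted.
  oc delete pods -l "ibmevents.ibm.com/cluster=${cluster}" -n "$namespace" || return 1

  # Assumption: the Kafka CR exposes a Ready condition.
  # The timeout value is arbitrary; adjust it for your cluster.
  oc wait "kafka/${cluster}" --for=condition=Ready --timeout=600s -n "$namespace"
}
```

With the example values from step 1, the call would be restart_dg_kafka wa cpd.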

watsonx Assistant CR is stuck in InProgress state even though the status shows Completed

Applies to: 5.4.0

Problem
This issue occurs because the data stores repeatedly enter a reconcile loop.
Solution
When the verification shows 20/20, you can ignore the InProgress state because the pods are healthy and fully available.

Upgrade from 5.2.x to 5.4.0 fails with ValueError

Applies to: 5.4.0

Problem
When you upgrade watsonx Assistant from Version 5.2.x to Version 5.4.0, you get a ValueError when you run the install-component command because the required storage class parameters are not specified.
Solution
To fix the issue, do the following:
  1. Check whether the ValueError is caused by a missing blockStorageClass value.
    oc get wa wa -n cpd -o jsonpath='{.spec.cluster.blockStorageClass}{"\n"}'
    If the output is empty, blockStorageClass is missing from the watsonx Assistant CR.
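  2. Add the storage class parameter to the CR.
    oc patch wa wa -n ${PROJECT_CPD_INST_OPERATORS} --type=merge -p '{"spec":{"cluster":{"blockStorageClass":"<block storage class name>"}}}'
    Use the project that contains the watsonx Assistant CR, which is the same project that you queried in step 1.
    You don't need to rerun the install-component command because the change is reconciled automatically.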
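The check and the patch can be combined so that the CR is patched only when the value is actually missing. This is a minimal sketch under stated assumptions: ensure_block_storage_class is a hypothetical helper name, the CR is named wa as in the commands above, and both commands run against the single project that you pass in.

```shell
# Sketch: patch blockStorageClass only if it is missing from the CR.
# ensure_block_storage_class is a hypothetical helper name.
ensure_block_storage_class() {
  namespace="${1:?namespace required}"
  storage_class="${2:?storage class name required}"

  # Read the current value; an empty result means the field is not set.
  current=$(oc get wa wa -n "$namespace" \
    -o jsonpath='{.spec.cluster.blockStorageClass}') || return 1
  if [ -n "$current" ]; then
    echo "blockStorageClass is already set to ${current}; nothing to patch"
    return 0
  fi

  # Same merge patch as step 2, with the storage class name substituted.
  oc patch wa wa -n "$namespace" --type=merge \
    -p "{\"spec\":{\"cluster\":{\"blockStorageClass\":\"${storage_class}\"}}}"
}
```

Call it with the project and the block storage class name for your cluster, for example ensure_block_storage_class cpd my-block-sc, where my-block-sc is a placeholder.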

Embedded web chat gives a blank page with a CORS error

Applies to: 5.4.0

Problem
When you open the embedded web chat, the page is blank and the browser reports a CORS error.
Solution
  1. Set environment variables:
    export PROJECT_CPD_INSTANCE=<namespace where watsonx Assistant is running>
    export INSTANCE=<watsonx Assistant Instance Name>  # Normally "wa"
  2. Verify the values:
    echo $PROJECT_CPD_INSTANCE
    echo $INSTANCE
  3. Apply the patch using variables:
    
    cat <<EOF | oc apply -f -
    apiVersion: assistant.watson.ibm.com/v1
    kind: TemporaryPatch
    metadata:
      name: wa-gateway-cors-patch
      namespace: ${PROJECT_CPD_INSTANCE}
    spec:
      apiVersion: assistant.watson.ibm.com/v1
      kind: WatsonAssistant
      name: ${INSTANCE}
      patch:
        gateway:
          cr:
          - op: add
            path: /spec/cors
            value:
              disabled: false
              singleOrigin: "*"
      patchType: patchJson6902
    EOF
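Because the here-document in step 3 expands ${PROJECT_CPD_INSTANCE} and ${INSTANCE} in place, an unset variable would silently produce an empty namespace or name. A small guard can catch that before the patch is applied. This is a sketch only; require_vars is a hypothetical helper name.

```shell
# Sketch: fail early if any of the named variables is unset or empty.
# require_vars is a hypothetical helper name.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${${name}:-}"
    if [ -z "$value" ]; then
      echo "error: ${name} is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Intended use before step 3:
#   require_vars PROJECT_CPD_INSTANCE INSTANCE || exit 1
```

The function returns a non-zero status and names each missing variable on standard error, so it can gate the oc apply command.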

Analytics data is not populating after the upgrade due to a failure in the Data Governor Kafka upgrade

Applies to: 5.4.0

Problem
  • After the upgrade, new analytics data is not populated.
  • Installation of Data Governor (DG) Kafka might fail.
  • The Kafka Custom Resource shows the following status message:
    Message: Migration cannot be performed with Kafka version 3.9-IV0, metadata version 3.9-IV0, inter.broker.protocol.version 3.8-IV0, log.message.format.version 3.8-IV0. Please make sure the Kafka version, metadata version, inter.broker.protocol.version and log.message.format.version are all set to the same value, which must be equal to, or higher than 3.7.0
Cause
The KRaft migration and the Kafka upgrade conflicted or did not complete successfully. The migration requires the Kafka version, metadata version, and protocol versions to be aligned. The failure occurs when:
  • The Kafka ZooKeeper to KRaft migration runs at the same time as a Kafka version upgrade.
  • The Kafka version and protocol versions are mismatched.
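To see the mismatch at a glance, you can compare the version tokens in the status message. The following sketch is illustrative only: versions_aligned is a hypothetical helper name, and it assumes that every relevant version appears in the message in the x.y-IVn form shown above.

```shell
# Sketch: succeed only if all x.y-IVn version tokens in a message are identical.
# versions_aligned is a hypothetical helper name.
versions_aligned() {
  # Extract tokens such as 3.9-IV0, deduplicate them, and count what is left.
  distinct=$(printf '%s\n' "$1" \
    | grep -oE '[0-9]+\.[0-9]+-IV[0-9]+' \
    | sort -u | wc -l | tr -d ' ')
  [ "$distinct" -eq 1 ]
}

# Example with part of the status message from this topic:
#   versions_aligned "Kafka version 3.9-IV0, metadata version 3.9-IV0, inter.broker.protocol.version 3.8-IV0" \
#     || echo "versions are not aligned"
```

For the message in this topic, the check fails because both 3.9-IV0 and 3.8-IV0 are present.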
Solution
  1. Verify the Kafka CR status:
    oc get kafka
    Identify the Kafka instance that is associated with Data Governor.
  2. Delete the Kafka instance:
    oc delete kafka ${WA_NAME}-data-governor-kafka
    Replace ${WA_NAME} with your watsonx Assistant CR name.
  3. List all KafkaNodePools to identify the ones associated with your Kafka cluster:
    oc get kafkanodepool
  4. Delete all KafkaNodePools associated with the Kafka cluster (names may vary):
    oc delete kafkanodepool <kafkanodepool-name-1> <kafkanodepool-name-2>
    For example:
    oc delete kafkanodepool wa-da-0f50-ibm-b3bb-broker wa-da-0f50-ibm-b3bb-controller
  5. After all Kafka resources are deleted, delete the ibm-data-governor-operator pod in the operator namespace to trigger a fresh reconciliation:
    Note: Delete only the operator pod. Don't delete the catalog source.
    oc delete pod -n <operator-namespace> -l icpdsupport/app=data-governor-operator
    Replace <operator-namespace> with your operator namespace.
  6. Wait for Kafka to be re-created automatically:
    • On the next Data Governor reconcile, Kafka is re-created with KRaft enabled by default.
    • After the old ${WA_NAME}-data-governor-kafka instance is deleted, no manual configuration is needed for KRaft.
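After step 5, you might want to confirm that the Kafka CR was re-created rather than checking by hand. The following is a minimal polling sketch: wait_for_kafka is a hypothetical helper name, the retry count and delay are arbitrary defaults, and, like the commands above, it assumes that the current project is the one that contains the Kafka CR.

```shell
# Sketch: poll until the Data Governor Kafka CR exists again.
# wait_for_kafka is a hypothetical helper name.
# Arguments: <WA_NAME> [attempts, default 30] [delay in seconds, default 10]
wait_for_kafka() {
  name="${1:?watsonx Assistant CR name required}-data-governor-kafka"
  attempts="${2:-30}"
  delay="${3:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # oc get returns a non-zero status while the CR does not exist.
    if oc get kafka "$name" >/dev/null 2>&1; then
      echo "Kafka CR ${name} exists after ${i} retries"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "Kafka CR ${name} was not re-created in time" >&2
  return 1
}
```

This only confirms that the CR exists. Check the status with oc get kafka afterward to confirm that the cluster is ready.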

Preview page not available when the watsonx Assistant is created using the API

Applies to: 5.4.0

Problem
The Preview page is not accessible when watsonx Assistant is created using the API.

Preview page not available for Watson Discovery integration

Applies to: 5.4.0

Problem
The Preview page does not appear when you integrate Watson Discovery in watsonx Assistant.

Postgres pod goes to CrashLoopBackOff status after the upgrade

Applies to: 5.4.0

Problem
When you upgrade watsonx Assistant, one of the Postgres pods enters the CrashLoopBackOff state. This issue occurs because the pod's data is corrupted.
Solution
  1. Run the following command to find the watsonx Assistant Postgres pod in the CrashLoopBackOff state.
    oc get pods --no-headers | grep -Ev "Comp|0/0|1/1|2/2|3/3|4/4|5/5|6/6|7/7|8/8" | grep wa-postgres
    The output looks like this:
    wa-postgres-3 0/1 CrashLoopBackOff 115 (2m30s ago) 9h
  2. Run the following command to determine whether the Postgres pod is the primary pod:
    oc get cluster | grep wa-postgres
    The output looks like this:
    wa-postgres         2d20h   3           3       Cluster in healthy state   wa-postgres-1
    Where wa-postgres-1 is the primary pod.
    Tip: If the primary instance is in CrashLoopBackOff status, follow the steps in the "Postgres cluster in bad state" topic.
  3. Delete the non-primary pod and its PersistentVolumeClaim (PVC) so that a new pod is created and syncs with the primary pod.
    Important: Ensure that the EDB operator is running before you delete the pod and its PVC.
    Warning: Do not delete a primary pod because it can lead to database downtime and potential data loss.
    oc delete pod/wa-postgres-3 pvc/wa-postgres-3
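Because deleting the primary pod is risky, it can help to script the check from step 2 before running the delete in step 3. The following is a sketch under stated assumptions: postgres_primary and is_replica are hypothetical helper names, and the primary pod is assumed to be the last column of the oc get cluster output, as in the sample output above.

```shell
# Sketch: identify the primary and confirm that a pod is safe to delete.
# postgres_primary and is_replica are hypothetical helper names.

# Reads `oc get cluster` output on stdin and prints the last column
# of the row whose first column matches the cluster name in $1.
postgres_primary() {
  awk -v cluster="$1" '$1 == cluster { print $NF }'
}

# Succeeds only if a primary was found and the pod is not that primary.
is_replica() {
  pod="$1"
  primary="$2"
  [ -n "$primary" ] && [ "$pod" != "$primary" ]
}

# Intended use:
#   primary=$(oc get cluster | postgres_primary wa-postgres)
#   is_replica wa-postgres-3 "$primary" \
#     && oc delete pod/wa-postgres-3 pvc/wa-postgres-3
```

If no primary can be determined, is_replica fails, so the delete command is skipped.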