Known issues and limitations for Watson Discovery

The following known issues and limitations apply to the Watson Discovery service.

The Watson Discovery CR (wd) gets stuck with 23/24 status when applying patch1 for the zen component

Applies to: 5.3.1

Error

When you apply patch1 for the zen component to the base version 5.3.1, Watson Discovery CR (wd) gets stuck with 23/24 status.

oc get wd -n zen
NAME   VERSION   READY   READYREASON   UPDATING   UPDATINGREASON   DEPLOYED   VERIFIED   QUIESCE        DATASTOREQUIESCE   AGE
wd     5.3.1     False   InProgress    True       VerifyWait       24/24      23/24      NOT_QUIESCED   NOT_QUIESCED       13h
oc get wd -n zen -o yaml
...
      failedComponents: []
      unverifiedComponents:
      - wire
      verified: 23/24
Cause

The zen component updates its secret with patch1. Certain Watson Discovery secrets that depend on the zen secret are not updated, which causes the ranker pods to crash.

oc get secret -n zen --sort-by=.metadata.creationTimestamp | grep -E 'wd-discovery|zen-ca' | tail -r
zen-ca-cert-secret                                   kubernetes.io/tls                3      6h46m
wd-discovery-jks-secret                              Opaque                           3      10m
wd-discovery-cert-manager-tls                        kubernetes.io/tls                3      7h10m
...
wd-discovery-jks-secret                              Opaque                           3      8h
...
wd-discovery-cn-postgres16-ca                        Opaque                           2      9h
wd-discovery-cn-postgres16-app                       kubernetes.io/basic-auth         11     9h
wd-discovery-cn-postgres16-su                        kubernetes.io/basic-auth         2      9h
oc get pods -n zen | grep -E 'ranker-master|ranker-rest|training-crud'
wd-discovery-ranker-master-67c8f665cd-njxqn               0/1     Running            1 (51s ago)     2m16s
wd-discovery-ranker-master-67c8f665cd-pfggl               0/1     Running            1 (42s ago)     2m16s
wd-discovery-ranker-rest-f49f897f-qkhkn                   0/1     CrashLoopBackOff   2 (9s ago)      2m15s
wd-discovery-ranker-rest-f49f897f-qrd28                   0/1     CrashLoopBackOff   2 (6s ago)      2m15s
...
Solution
To resolve this issue, complete the following steps:
  1. Manually delete the old secrets so that they are regenerated.
    oc delete secret -n zen \
      wd-discovery-jks-secret \
      wd-discovery-cn-postgres16-ca \
      wd-discovery-cn-postgres16-wd \
      wd-discovery-cn-postgres16-replication
    secret "wd-discovery-jks-secret" deleted
    secret "wd-discovery-cn-postgres16-ca" deleted
    secret "wd-discovery-cn-postgres16-wd" deleted
    secret "wd-discovery-cn-postgres16-replication" deleted
    These secrets are automatically recreated after a while.
    oc get secret -n zen | grep 'wd-discovery-cn-postgres16'
    wd-discovery-cn-postgres16-app                       kubernetes.io/basic-auth         11     14h
    wd-discovery-cn-postgres16-ca                        Opaque                           2      45s
    wd-discovery-cn-postgres16-dockercfg-4tzt4           kubernetes.io/dockercfg          1      14h
    wd-discovery-cn-postgres16-replication               kubernetes.io/tls                2      45s
    wd-discovery-cn-postgres16-su                        kubernetes.io/basic-auth         2      14h
    wd-discovery-cn-postgres16-wd                        Opaque                           4      45s
  2. Restart the ranker pods.
    oc delete pod -n zen -l 'app.kubernetes.io/component in (ranker-master,ranker-rest,training-crud)'
    pod "wd-discovery-ranker-master-674d455cb9-7h75d" deleted
    pod "wd-discovery-ranker-master-674d455cb9-lwf6h" deleted
    pod "wd-discovery-ranker-rest-f7b9b979d-nkvdw" deleted
    pod "wd-discovery-ranker-rest-f7b9b979d-w668r" deleted
    pod "wd-discovery-training-crud-6577558dd4-wdstf" deleted
    

    Wait for the new ranker pods to start running.

    oc get pods -n zen | grep -E 'ranker-master|ranker-rest|training-crud'
    wd-discovery-ranker-master-674d455cb9-lgjm6                    1/1     Running     0               6m12s
    wd-discovery-ranker-master-674d455cb9-xwqfp                    1/1     Running     0               6m11s
    wd-discovery-ranker-rest-f7b9b979d-9fz87                       1/1     Running     0               5m26s
    wd-discovery-ranker-rest-f7b9b979d-bmjts                       1/1     Running     0               5m25s
    wd-discovery-training-crud-6577558dd4-qkgvv                    1/1     Running     0               3m33s
    wd-discovery-training-crud-6577558dd4-zlc9k                    1/1     Running     0               6m25s

    The wd CR also becomes ready after a while.

    oc get wd -n zen
    NAME   VERSION   READY   READYREASON   UPDATING   UPDATINGREASON   DEPLOYED   VERIFIED   QUIESCE        DATASTOREQUIESCE   AGE
    wd     5.3.1     True    Stable        False      Stable           24/24      24/24      NOT_QUIESCED   NOT_QUIESCED       14h
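
The two waits in the preceding steps can be scripted. The following is a minimal sketch, not part of the product: wd_ready and wait_for_wd are hypothetical helper names, the retry defaults are arbitrary, and the column positions assume the oc get wd table layout shown above (READY in column 3, DEPLOYED in column 7, VERIFIED in column 8).

```shell
# Hypothetical helpers (not part of oc or Watson Discovery).

# wd_ready: reads `oc get wd` table output on stdin. Succeeds only when at
# least one data row exists and every data row has READY=True and
# VERIFIED equal to DEPLOYED (for example 24/24 and 24/24).
wd_ready() {
  awk '$1 != "NAME" && NF > 0 {
         rows++
         if ($3 != "True" || $7 != $8) bad++
       }
       END {
         if (rows > 0 && bad == 0) exit 0
         exit 1
       }'
}

# wait_for_wd NAMESPACE [TRIES] [PAUSE_SECONDS]: polls until wd_ready
# succeeds or the attempts run out. The defaults are assumptions.
wait_for_wd() {
  wd_ns=$1
  wd_tries=${2:-60}
  wd_pause=${3:-30}
  wd_i=0
  while [ "$wd_i" -lt "$wd_tries" ]; do
    if oc get wd -n "$wd_ns" | wd_ready; then
      return 0
    fi
    wd_i=$((wd_i + 1))
    sleep "$wd_pause"
  done
  return 1
}
```

For example, wait_for_wd zen returns 0 as soon as the wd CR reports True with matching DEPLOYED and VERIFIED counts, and returns 1 if that does not happen within the allowed attempts.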

Default noobaa resources might cause the OOMKilled error

Applies to: 5.3.0

Error

The default noobaa resource settings can lead to an OOMKilled error due to insufficient memory. The error is most likely during a Watson Discovery installation or upgrade, because these operations access the noobaa storage heavily.

If you are using the noobaa backing store, run the following commands to verify and patch its resources. For other types of backing stores, refer to their respective documentation to adjust resource sizes.

oc get pods -n openshift-storage  
noobaa-default-backing-store-noobaa-pod-2a4a1886   0/1     CrashLoopBackOff   104 (62s ago)   32h
oc get pods -n openshift-storage noobaa-default-backing-store-noobaa-pod-2a4a1886 -o yaml
...
status:
...
    lastState:
      terminated:
        containerID: ...
        exitCode: 137
        finishedAt: "2026-01-10T23:44:45Z"
        reason: OOMKilled
        startedAt: "2026-01-10T23:41:42Z"
    name: noobaa-agent
    ready: false
    restartCount: 102
    started: false
    state:
      waiting:
        message: back-off 1m20s restarting failed container=noobaa-agent pod=noobaa-default-backing-store-noobaa-pod-2a4a1886_openshift-storage(68abc3f6-25d9-4ac6-87fa-c39f80f6e1af)
        reason: CrashLoopBackOff
As a result, the Watson Discovery pods that check noobaa contents might encounter download or access errors.
oc logs wd-discovery-orchestrator-setup-r9brl -c verify-resources
Verifying common-zen-wd/mt/__built-in-tenant__/fileResource/701db916-fc83-57ab-0000-000000000010.zip
    Check if common-zen-wd exists
    ...
    Check if object exists

Read timeout on endpoint URL: "https://s3.openshift-storage.svc:443/common-zen-wd?list-type=2&prefix=mt%2F__built-in-tenant__%2FfileResource%2F701db916-fc83-57ab-0000-000000000010.zip&delimiter=%2F&encoding-type=url"
    Object does not exist
    Retry after 60 seconds
Cause

The default memory requests and limits for the noobaa resources are too low.

Solution
Increase memory size by patching the following resources.
oc patch -n openshift-storage backingStore/noobaa-default-backing-store --type merge --patch '{
    "spec": {
        "pvPool": {
            "resources": {
                "requests": {
                    "memory": "1Gi"
                },
                "limits": {
                    "memory": "1Gi"
                }
            }
        }
    }
}'

oc patch -n openshift-storage storagecluster ocs-storagecluster --type merge --patch '{
    "spec": {
        "resources": {
            "noobaa-endpoint": {
                "limits": {
                    "memory": "4Gi"
                },
                "requests": {
                    "memory": "4Gi"
                }
            }
        }
    }
}'
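
After the patched pods are recreated, it can help to confirm that no noobaa container is still being terminated for lack of memory. The following is a minimal sketch under stated assumptions: list_termination_reasons and oomkilled_pods are hypothetical helper names, and the jsonpath expression relies on the standard pod status field lastState.terminated.reason that appears in the YAML output above.

```shell
# Hypothetical helpers (not part of oc or noobaa).

# list_termination_reasons NAMESPACE: prints one line per pod in the form
# "<pod name><TAB><last termination reasons of its containers>".
list_termination_reasons() {
  oc get pods -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}'
}

# oomkilled_pods: reads the lines produced above on stdin and prints the
# names of pods where any container was last terminated as OOMKilled.
oomkilled_pods() {
  awk -F '\t' '$2 ~ /(^| )OOMKilled( |$)/ { print $1 }'
}
```

Running list_termination_reasons openshift-storage | oomkilled_pods prints nothing when no container in the namespace was last terminated with the OOMKilled reason. The lastState field describes the previous termination of a container, so a pod that was not recreated after the patch can still be listed.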

The Watson Discovery operator pod goes to CrashLoopBackOff status when a tethered project exists

Applies to: 5.3.0

Fixed in: 5.3.1

Error

When you install Watson Discovery in an environment with a tethered namespace, the operator pod goes to a CrashLoopBackOff state.

oc get pod -l icpdsupport/addOnId=discovery,icpdsupport/app=operator -n <operator_namespace>
wd-discovery-operator-7df77755d4-647dp                           0/1     CrashLoopBackOff   21 (3m1s ago)    100m
Solution
Apply the following patch to the Watson Discovery operator deployment. Replace all <operator_namespace> and <operand_namespace> with the appropriate values for your environment:
oc patch deployment wd-discovery-operator -n <operator_namespace> --type='strategic' -p='{"spec":{"template":{"spec":{"containers":[{"name":"manager","env":[{"name":"WATCH_NAMESPACE","value":"<operator_namespace>,<operand_namespace>","valueFrom":null}]}]}}}}'
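
To avoid editing the placeholders inside the JSON by hand, the patch body can be generated from two values. The following is a minimal sketch: watch_namespace_patch is a hypothetical helper name, and the JSON it prints is the same patch as in the preceding command with the two namespaces substituted.

```shell
# Hypothetical helper (not part of oc or Watson Discovery).

# watch_namespace_patch OPERATOR_NS OPERAND_NS: prints the strategic merge
# patch that sets WATCH_NAMESPACE to "<OPERATOR_NS>,<OPERAND_NS>" on the
# "manager" container and clears any existing valueFrom.
watch_namespace_patch() {
  printf '{"spec":{"template":{"spec":{"containers":[{"name":"manager","env":[{"name":"WATCH_NAMESPACE","value":"%s,%s","valueFrom":null}]}]}}}}\n' "$1" "$2"
}

# Example use, assuming OPERATOR_NS and OPERAND_NS are set in your shell:
#   oc patch deployment wd-discovery-operator -n "$OPERATOR_NS" \
#     --type=strategic -p "$(watch_namespace_patch "$OPERATOR_NS" "$OPERAND_NS")"
```

The helper does no escaping, so it is only suitable for plain namespace names.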

The Watson Discovery operator pod goes to CrashLoopBackOff status in GitOps with Argo CD install

Applies to: 5.3.0

Error

When you install Watson Discovery in GitOps with Argo CD, the operator pod enters the CrashLoopBackOff state.

oc get pod -l icpdsupport/addOnId=discovery,icpdsupport/app=operator -n ${operatorNS}
wd-discovery-operator-7df77755d4-647dp                           0/1     CrashLoopBackOff   21 (3m1s ago)    100m
Solution
To resolve this issue, complete the following steps:
  1. Apply the following patch to the namespaced Watson Discovery application.
    oc patch application.argoproj.io "watson-discovery${appSuffix}" -n ${argocdNS} --type=merge -p '{"spec":{"source":{"helm":{"valuesObject":{"watsonDiscovery":{"operator":{"watchNamespaces":["'"${operatorNS}"'","'"${instanceNS}"'"]}}}}}}}'
  2. Sync the Watson Discovery application in the Argo CD UI or run the following command:
    argocd app sync "watson-discovery${appSuffix}"
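
After the sync, you can confirm that the operator pod has recovered by filtering the pod list for the CrashLoopBackOff status. The following is a minimal sketch: crashlooping_pods is a hypothetical helper name, and it assumes the default oc get pod table layout in which STATUS is the third column.

```shell
# Hypothetical helper (not part of oc or Argo CD).

# crashlooping_pods: reads `oc get pod` table output on stdin and prints
# the names of pods whose STATUS column is CrashLoopBackOff.
crashlooping_pods() {
  awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Example use, with the same label selector as in the Error section:
#   oc get pod -l icpdsupport/addOnId=discovery,icpdsupport/app=operator \
#     -n "${operatorNS}" --no-headers | crashlooping_pods
```

An empty result means that no matching pod is currently in CrashLoopBackOff.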

Error is displayed after applying the temporary patch

Applies to: 5.3.0

Error

After applying the temporary patch, an error is displayed.

oc get temporarypatches.oppy.ibm.com
NAME              READY   READYREASON   UPDATING   UPDATINGREASON   DEPLOYED   VERIFIED   AGE
temporary-patch   False   Errored       False      Errored          1/1        1/1        4s
After a while, the status of the Watson Discovery CR also shows an error.
oc get wd
NAME   VERSION   READY   READYREASON   UPDATING   UPDATINGREASON   DEPLOYED   VERIFIED   QUIESCE        DATASTOREQUIESCE   AGE
wd     5.2.1     False   ConfigError   False      Errored          24/24      24/24      NOT_QUIESCED   NOT_QUIESCED       11h
Cause
This issue occurs when the Watson Discovery operator deletes the spec field of the temporary patch.
Solution
Apply the same patch again to fill the spec field. Then, delete the Watson Discovery operator pod to restart it.
oc delete pod -l icpdsupport/addOnId=discovery,icpdsupport/app=operator -n ${PROJECT_CPD_INST_OPERATORS}

During shutdown the DATASTOREQUIESCE field does not update

Applies to: 5.3.0
Error
After you run the cpd-cli manage shutdown command, the DATASTOREQUIESCE state in the Watson Discovery resource is stuck in QUIESCING.
The shutdown command completes successfully. However, when you check the status of the WatsonDiscovery wd custom resource (oc get WatsonDiscovery wd -n "${PROJECT_CPD_INST_OPERANDS}"), the command returns:
NAME   VERSION   READY   READYREASON   UPDATING   UPDATINGREASON   DEPLOYED   VERIFIED   QUIESCE    DATASTOREQUIESCE   AGE
wd     4.7.3     True    Stable        False      Stable           24/24      24/24      QUIESCED   QUIESCING       16h
Cause
Because of the way that Postgres is quiesced, the Postgres pods are still running in the background. As a result, the metadata is not updated in the Watson Discovery resource.
Solution
There is no fix for this issue. However, the DATASTOREQUIESCE state being stuck in QUIESCING does not affect the Watson Discovery operator.

Upgrade fails due to existing Elasticsearch 6.x indices

Applies to: 5.3.0

Error
If the existing Elasticsearch cluster has indices created with Elasticsearch 6.x, then upgrading Watson Discovery to Version 5.0.0 and later fails.
> oc get wd wd
NAME   VERSION   READY   READYREASON   UPDATING   UPDATINGREASON   DEPLOYED   VERIFIED   QUIESCE        DATASTOREQUIESCE   AGE
wd     4.8.0     False   InProgress    True       VerifyWait       2/24       1/24       NOT_QUIESCED   NOT_QUIESCED       63m
Cause
When upgrading to Version 5.0.0 and later, Watson Discovery checks whether the Elasticsearch cluster contains indices that were created with a deprecated version.
Solution
To determine whether existing Elasticsearch 6.x indices are the cause of the upgrade failure, check the log of the wd-discovery-es-detect-index pod by using the following command:
> oc logs -l app=es-detect-index --tail=-1
If an Elasticsearch 6.x index is found, the following content is displayed in the log:
> oc logs -l app=es-detect-index --tail=-1
Checking connection to Elastic endpoint
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   569  100   569    0     0  28450      0 --:--:-- --:-{:-- --:--:--     0
  "name" : "wd-ibm-elasticsearch-es-server-client-0",
  "cluster_name" : "es-cluster",
  "cluster_uuid" : "XHm71iR_REu0VzbM16BRgg",
  "version" : {
    "number" : "7.10.2-SNAPSHOT",
    "build_flavor" : "oss",
    "build_type" : "tar",
    "build_hash" : "747e1cc71def077253878a59143c1f785afa92b9",
    "build_date" : "2023-10-22T21:59:42.077083382Z",
    "build_snapshot" : true,
    "lucene_version" : "8.7.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
-:-- --:--:-- 28450
Retrieve list of indexes
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   357  100   357    0     0   2811      0 --:--:-- --:--:-- --:--:--  2811
Checking for ElasticSearch 6 index
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2582  100  2582    0     0  95629      0 --:--:-- --:--:-- --:--:-- 95629
ElasticSearch 6 index found. Failing job

To upgrade, you must reindex all Elasticsearch 6.x indices to Elasticsearch 7.x indices by running a script.

To reindex from Elasticsearch 6.x to Elasticsearch 7.x, complete the following steps:
  1. Go to the watson-developer-cloud/doc-tutorial-downloads GitHub repository and download the reindex_es6_indices.sh script.
  2. Make the script executable.
    > chmod +x ./reindex_es6_indices.sh
  3. Copy the script from your local directory to the wd-ibm-elasticsearch-es-server-data-0 pod of the cluster.
    > oc cp -c elasticsearch ./reindex_es6_indices.sh wd-ibm-elasticsearch-es-server-data-0:/tmp/ 
  4. Use the exec command for the wd-ibm-elasticsearch-es-server-data-0 pod and run the script to reindex.
    > oc exec -c elasticsearch  wd-ibm-elasticsearch-es-server-data-0 -- bash -c "/tmp/reindex_es6_indices.sh"
    After reindexing is successful, the following content is displayed in the log:
    > oc exec -c elasticsearch  wd-ibm-elasticsearch-es-server-data-0 -- bash -c "/tmp/reindex_es6_indices.sh"
    Checking status of ElasticSearch
    Getting index list
    Total number of indices: 245
    [1 / 245] ElasticSearch 6 index found: 6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations
    ----------------------------
    Updating index - 6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations ...
    Generating new settings
    Removing unnecessary settings
    Getting mappings
    Remove existing index : 6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations_new
    Removing index 6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations_new
    {"acknowledged":true}
    Creating new index 6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations_new
    {"acknowledged":true,"shards_acknowledged":true,"index":"6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations_new"}
    Executing reindex index to 6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations_new
    Reindex task ID: MF8B0SsSSXWZwPYnS4wxCQ:225874
    Reindexed: 6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations_new
    Removing index 6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations
    {"acknowledged":true}
    Setting index 6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations_new to read-only
    {"acknowledged":true}
    Renaming index from 6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations_new to 6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations
    {"acknowledged":true,"shards_acknowledged":true,"index":"6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations"}
    Unsetting index 6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations_new to read-only
    {"acknowledged":true}
    Unsetting index 6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations to read-only
    {"acknowledged":true}
    Removing index 6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations_new
    {"acknowledged":true}
    ----------------------------
    [2 / 245] ElasticSearch 6 index found: ecadd1ee-d025-845b-0000-017b2c281668_notice
    ...
    Completed!
    After the Elasticsearch 6.x indices are reindexed to Elasticsearch 7.x indices, the upgrade should continue and finish successfully.
    > oc get wd
    NAME   VERSION   READY   READYREASON   UPDATING   UPDATINGREASON   DEPLOYED   VERIFIED   QUIESCE        DATASTOREQUIESCE   AGE
    wd     4.8.0     True    Stable        False      Stable           24/24      24/24      NOT_QUIESCED   NOT_QUIESCED       82m
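
Because a failed run ends with an instruction not to run the script again, it can be useful to save the script output and check it before you continue. The following is a minimal sketch: reindex_succeeded is a hypothetical helper name, the file name reindex.log is an assumption, and the marker strings are taken from the sample outputs in this topic.

```shell
# Hypothetical helper (not part of the reindex script).

# reindex_succeeded: reads the saved output of reindex_es6_indices.sh on
# stdin. Fails if a failure marker is present; succeeds only when the
# "Completed!" line is present and no failure marker was found.
reindex_succeeded() {
  reindex_log=$(cat)
  case $reindex_log in
    *"Failed to reindex"*|*"Error: Please contact support"*) return 1 ;;
  esac
  case $reindex_log in
    *"Completed!"*) return 0 ;;
  esac
  return 1
}

# Example use (reindex.log is an assumed file name):
#   oc exec -c elasticsearch wd-ibm-elasticsearch-es-server-data-0 -- \
#     bash -c "/tmp/reindex_es6_indices.sh" 2>&1 | tee reindex.log
#   reindex_succeeded < reindex.log && echo "reindex completed"
```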
    
Contact IBM Support if the Elasticsearch cluster or reindexing to Elasticsearch 7.x fails, such as in the following cases:
  • When checking the logs of the wd-discovery-es-detect-index pod, if indices other than Elasticsearch 6.x or Elasticsearch 7.x are found, the following content is displayed in the log:
    > oc logs -l app=es-detect-index --tail=-1
    Checking connection to Elastic endpoint
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100   569  100   569    0     0  28450      0 --:--:-- --:-{:-- --:--:--     0
      "name" : "wd-ibm-elasticsearch-es-server-client-0",
      "cluster_name" : "es-cluster",
      "cluster_uuid" : "XHm71iR_REu0VzbM16BRgg",
      "version" : {
        "number" : "7.10.2-SNAPSHOT",
        "build_flavor" : "oss",
        "build_type" : "tar",
        "build_hash" : "747e1cc71def077253878a59143c1f785afa92b9",
        "build_date" : "2023-10-22T21:59:42.077083382Z",
        "build_snapshot" : true,
        "lucene_version" : "8.7.0",
        "minimum_wire_compatibility_version" : "6.8.0",
        "minimum_index_compatibility_version" : "6.0.0-beta1"
      },
      "tagline" : "You Know, for Search"
    }
    -:-- --:--:-- 28450
    Retrieve list of indexes
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100   357  100   357    0     0   2811      0 --:--:-- --:--:-- --:--:--  2811
    Checking for ElasticSearch 6 index
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100  2582  100  2582    0     0  95629      0 --:--:-- --:--:-- --:--:-- 95629
    Unidentified index found. Please verify
  • When checking the logs of the wd-discovery-es-detect-index pod, if a connection to the Elasticsearch cluster is not established, the following content is displayed in the log:
    > oc logs -l app=es-detect-index --tail=-1
    Checking connection to Elastic endpoint
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
      0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to wd-ibm-elasticsearch-srv.zen port 443: Connection refused
    Unable to connect. Please check Elastic
  • When reindexing starts, but is unsuccessful, the following content is displayed in the log:
    > oc exec -c elasticsearch  wd-ibm-elasticsearch-es-server-data-0 -- bash -c "/tmp/reindex_es6_indices.sh"
    Checking status of ElasticSearch
    Getting index list
    Total number of indices: 247
    [1 / 247] ElasticSearch 6 index found: 6604fc6b-c82c-4a7e-8062-fec9e74cc88f_curations
    ...
    [49 / 247] ElasticSearch 6 index found: ecadd1ee-d025-845b-0000-017a3722ef7f
    ----------------------------
    Updating index - ecadd1ee-d025-845b-0000-017a3722ef7f ...
    Generating new settings
    Removing unnecessary settings
    Getting mappings
    Remove existing index : ecadd1ee-d025-845b-0000-017a3722ef7f_new
    Removing index ecadd1ee-d025-845b-0000-017a3722ef7f_new
    {"acknowledged":true}
    Creating new index ecadd1ee-d025-845b-0000-017a3722ef7f_new
    {"acknowledged":true,"shards_acknowledged":true,"index":"ecadd1ee-d025-845b-0000-017a3722ef7f_new"}
    Executing reindex index to ecadd1ee-d025-845b-0000-017a3722ef7f_new
    Reindex task ID: MF8B0SsSSXWZwPYnS4wxCQ:182680
    In Progress: reindex from [ecadd1ee-d025-845b-0000-017a3722ef7f] to [ecadd1ee-d025-845b-0000-017a3722ef7f_new][_doc]
    In Progress: reindex from [ecadd1ee-d025-845b-0000-017a3722ef7f] to [ecadd1ee-d025-845b-0000-017a3722ef7f_new][_doc]
    Failed to reindex: ecadd1ee-d025-845b-0000-017a3722ef7f_new
    {
      "took": 299943,
      "timed_out": false,
      "total": 110237,
      "updated": 0,
      "created": 48998,
      "deleted": 0,
      "batches": 49,
      "version_conflicts": 0,
      "noops": 0,
      "retries": {
        "bulk": 0,
        "search": 0
      },
      "throttled": "0s",
      "throttled_millis": 0,
      "requests_per_second": -1.0,
      "throttled_until": "0s",
      "throttled_until_millis": 0,
      "failures": [
        {
          "index": "ecadd1ee-d025-845b-0000-017a3722ef7f_new",
          "type": "_doc",
          "id": "bc670579c33c9d2644dceef7ac94c249b96c568a9e79b0d1e6bbe2349ae371f9",
          "cause": {
            "type": "mapper_parsing_exception",
            "reason": "failed to parse",
            "caused_by": {
              "type": "stream_constraints_exception",
              "reason": "String length (5046272) exceeds the maximum length (5000000)"
            }
          },
          "status": 400
        },
        {
          "index": "ecadd1ee-d025-845b-0000-017a3722ef7f_new",
          "type": "_doc",
          "id": "8f4a27f149a93fead6852695290cc079635ea8a1d190616adcb8bfdafba09450",
          "cause": {
            "type": "mapper_parsing_exception",
            "reason": "failed to parse",
            "caused_by": {
              "type": "stream_constraints_exception",
              "reason": "String length (5046272) exceeds the maximum length (5000000)"
            }
          },
          "status": 400
        }
      ]
    }
    Error: Please contact support. Do not run this scripts again.
    command terminated with exit code 1
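
The outcomes of the wd-discovery-es-detect-index log that are described in this topic can be told apart by their final message. The following is a minimal sketch: classify_detect_log is a hypothetical helper name, the result labels are made up for illustration, and only the three messages shown in this topic are recognized.

```shell
# Hypothetical helper (not part of Watson Discovery).

# classify_detect_log: reads the es-detect-index log on stdin and prints
#   reindex           an Elasticsearch 6.x index was found
#   contact-support   an unidentified index was found, or the connection failed
#   no-known-failure  none of the three documented messages is present
classify_detect_log() {
  detect_log=$(cat)
  case $detect_log in
    *"ElasticSearch 6 index found"*) echo reindex ;;
    *"Unidentified index found"*)    echo contact-support ;;
    *"Unable to connect"*)           echo contact-support ;;
    *)                               echo no-known-failure ;;
  esac
}

# Example use:
#   oc logs -l app=es-detect-index --tail=-1 2>&1 | classify_detect_log
```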

UpgradeError is shown after resizing PVC

Applies to: 5.3.0

Error
After you edit the custom resource to change the size of a persistent volume claim for a data store, an error is shown.
Cause
You cannot change the persistent volume claim size of a component by updating the custom resource. Instead, you must change the size of the PVC on the persistent volume claim node after it is created.
Solution
To prevent the error, undo the changes that were made to the YAML file. For more information about the steps to follow to change the persistent volume claim size successfully, see Scaling an existing persistent volume claim size.

Disruption of service after upgrading, restarting, or scaling by updating scaleConfig

Applies to: 5.3.0

Error
After upgrading, restarting, or scaling Watson Discovery by updating the scaleConfig parameter, the Elasticsearch component might become non-functional, resulting in disruption of service and data loss.
Cause
The Elasticsearch component uses a quorum of pods to ensure availability when it completes search operations. However, each pod in the quorum must recognize the same pod as the leader of the quorum. The system can run into issues when more than one leader pod is identified.
Solution
To determine if confusion about the quorum leader pod is the cause of the issue, complete the following steps:
  1. Log in to the cluster, and then set the namespace to the project where the Discovery resources are installed.
  2. Check each Elasticsearch pod with the role of master to see which pod it identifies as the quorum leader.
    oc get pod -l icpdsupport/addOnId=discovery,app=elastic,role=master,tenant=wd \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'  | while read -r i; do echo "$i"; oc exec "$i" \
    -c elasticsearch -- bash -c 'curl -ksS "localhost:19200/_cat/master?v"'; echo; done
    
    Each pod must list the same pod as the leader.
    For example, in the following result, two different leaders are identified. Pods 1 and 2 identify pod 2 as the leader. However, pod 0 identifies itself as the leader.
    wd-ibm-elasticsearch-es-server-master-0
    id                     host      ip        node
    7q0kyXJkSJirUMTDPIuOHA 127.0.0.1 127.0.0.1 wd-ibm-elasticsearch-es-server-master-0
    
    wd-ibm-elasticsearch-es-server-master-1
    id                     host      ip        node
    L0mqDts7Rh6HiB0aQ4LLtg 127.0.0.1 127.0.0.1 wd-ibm-elasticsearch-es-server-master-2
    
    wd-ibm-elasticsearch-es-server-master-2
    id                     host      ip        node
    L0mqDts7Rh6HiB0aQ4LLtg 127.0.0.1 127.0.0.1 wd-ibm-elasticsearch-es-server-master-2

If you find that more than one pod is identified as the leader, contact IBM Support.
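
The comparison of the per-pod results can also be automated. The following is a minimal sketch: distinct_leaders and leader_count are hypothetical helper names, and the parsing assumes the output layout of the example above, where each data row has four columns and the leader name is in the node column.

```shell
# Hypothetical helpers (not part of oc or Elasticsearch).

# distinct_leaders: reads the combined output of the check in step 2 on
# stdin and prints each distinct leader name once. Data rows are the
# four-column lines that are not the "id host ip node" header.
distinct_leaders() {
  awk 'NF == 4 && $1 != "id" { print $4 }' | sort -u
}

# leader_count: prints the number of distinct leaders. A value greater
# than 1 means that the pods disagree about the quorum leader.
leader_count() {
  distinct_leaders | wc -l | tr -d ' '
}
```

For the example output above, distinct_leaders lists wd-ibm-elasticsearch-es-server-master-0 and wd-ibm-elasticsearch-es-server-master-2, so leader_count prints 2.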

Limitations

The following limitations apply to the Watson Discovery service:
  • Formulas that are embedded as images, especially those containing division bars (horizontal fractions) or other complex notations, are not reliably recognized or extracted by Watson Discovery. As a result, these formulas might be omitted, misinterpreted, or rendered incorrectly in the extracted output. This limitation stems from how the SDU pipeline handles embedded images, and currently affects all versions of Watson Discovery that use SDU.
  • The service supports single-zone deployments; it does not support multi-zone deployments.
  • You cannot upgrade the Watson Discovery service by using the service-instance upgrade command from the Cloud Pak for Data command-line interface.
  • You cannot use the Cloud Pak for Data OpenShift® APIs for Data Protection (OADP) backup and restore utility to do an offline backup and restore of the Watson Discovery service. Online backup and restore with OADP is available.