Limitations and known issues in Watson Discovery
The following limitations and known issues apply to the Watson Discovery service.
Refresh 9 of Cloud Pak for Data Version 4.0
Operand version: 4.0.9
- Discovery generates a partial failure status message for the Cloud Pak for Data OADP backup and restore utility.
  - Error: When you check the status of the OADP backup utility after using it to back up a cluster where Discovery is installed, a Phase: PartiallyFailed message is displayed. One or more Discovery components are included in the Failed list.
  - Cause: Discovery cannot be backed up and restored by using the OADP backup and restore utility. When the Discovery service is present and an administrator backs up an entire Cloud Pak for Data instance, a status message is displayed that indicates a partial failure. This status is displayed because the persistent volume claims (PVCs) for Discovery are not backed up. However, the message does not impact the backup of the rest of the services.
  - Solution: No action is required to resolve the status message. You can remove the persistent volume claims that are associated with the Discovery service separately. After using the scripts to back up your Discovery service data, you can follow the step that is documented in the uninstall instructions for the Discovery service to delete the PVCs. For more information about how to remove the PVCs that are associated with Discovery, see Uninstalling the Discovery service.
This issue exists in refresh versions 8 through 9.
- Machine configuration pool is stuck because it cannot evict a pod
  - Error: During an action that requires nodes to be drained, such as an upgrade, the machine configuration pool reports that scheduling is disabled for one of the worker nodes. If you debug further, you learn that one node is unscheduled because the system failed to drain node.
  - Cause: Watson Discovery uses a single instance of EDB PostgreSQL on Starter installations. If you have a Starter deployment type and did not define a maintenance window, this error can occur because the primary PostgreSQL pod is protected by a PodDisruptionBudget configuration setting. As a result, the pod cannot be evicted automatically during upgrade.
  - Solution: Set the PostgreSQL pod to maintenance mode before you upgrade the service.
    Note: The service is unavailable while the pod is in maintenance mode.
    Complete the following steps:
    - Make sure that the PostgreSQL pods on the PostgreSQL cluster are operational and healthy. To check, enter the following command:
      oc cnp status wd-discovery-cn-postgres
      Also check that the PostgreSQL pods are running and do not keep restarting when you enter the following command:
      oc get pods | grep wd-discovery-cn-postgres
    - Set the PostgreSQL service in maintenance mode with the following command:
      oc patch WatsonDiscovery wd --type=merge \
        --patch='{"spec":{"postgres":{"quiesce":{"enabled": true}}}}'
    - Verify that the PostgreSQL service is in maintenance mode. The following command must return true.
      oc get cluster wd-discovery-cn-postgres \
        -o jsonpath='{.spec.nodeMaintenanceWindow.inProgress}{"\n"}'
    - Perform the action that requires the nodes to be drained, one at a time, such as an upgrade. If, while draining, a node cannot evict a PostgreSQL pod that is running on it, delete the respective PostgreSQL pod by using the following command:
      oc delete pod wd-discovery-cn-postgres-<n>
      Check that the deleted PostgreSQL pod is re-created on another worker node that is up and running. If the new PostgreSQL pod is created on a node that is yet to be drained, you might need to delete the pod again when the time comes for that node to be drained.
    - Revert the change to the PostgreSQL service to take it out of maintenance mode.
      oc patch WatsonDiscovery wd --type=merge \
        --patch='{"spec":{"postgres":{"quiesce":{"enabled": false}}}}'
    - Verify that the PostgreSQL service is out of maintenance mode. The following command must return false.
      oc get cluster wd-discovery-cn-postgres \
        -o jsonpath='{.spec.nodeMaintenanceWindow.inProgress}{"\n"}'
This issue exists in all refresh versions.
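The maintenance-mode toggle from the preceding steps can be wrapped in a small shell helper so that the same patch is used for both directions. The following is a minimal sketch, not an IBM-provided script. The function names quiesce_patch and set_postgres_quiesce are hypothetical, and the sketch assumes that oc is already logged in to the cluster and that the current project is the one where Discovery is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; only the oc commands come from the documented steps.

# Print the merge patch that turns PostgreSQL maintenance mode on or off.
quiesce_patch() {
  case "$1" in
    true|false)
      printf '{"spec":{"postgres":{"quiesce":{"enabled": %s}}}}\n' "$1"
      ;;
    *)
      echo "usage: quiesce_patch true|false" >&2
      return 1
      ;;
  esac
}

# Apply the patch, then print the current value of the maintenance-window flag.
set_postgres_quiesce() {
  local patch
  patch="$(quiesce_patch "$1")" || return 1
  oc patch WatsonDiscovery wd --type=merge --patch="$patch" || return 1
  oc get cluster wd-discovery-cn-postgres \
    -o jsonpath='{.spec.nodeMaintenanceWindow.inProgress}{"\n"}'
}

# Example (requires a logged-in oc session):
#   set_postgres_quiesce true    # before draining nodes
#   set_postgres_quiesce false   # after the drain is complete
```

The flag that is printed right after the patch might not yet reflect the change if the operator has not reconciled, so it is reasonable to rerun the oc get cluster command before you continue.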
- Error when quiescing the data stores
  - Error: When quiescing the data stores, the following error is displayed:
    The task includes an option with an undefined variable. The error was: `ScheduleName` is undefined.
  - Cause: The operator cannot change or set the value of the schedulerName field because, although it exists in the etcdcluster and Statefulset/wd-discovery-etcd pods, it is not explicitly defined.
  - Solution: Apply a patch that defines the schedulerName field so that the operator can set or change the value of the field successfully.
    Important: When you apply this patch, any running etcd pods are restarted.
    Run the following command to apply the patch:
    oc patch etcdclusters wd-discovery-etcd --type merge \
      -p '{"spec":{"schedulerName":"default-scheduler"}}'
This issue exists in all refresh versions.
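Because applying the patch restarts running etcd pods, it can be useful to check first whether the field is already set. The following is a minimal sketch under stated assumptions: the function name needs_scheduler_patch is hypothetical, and the sketch assumes that a jsonpath query for .spec.schedulerName returns an empty value when the field is not defined on the etcdclusters resource.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the resource name comes from the documented patch command.

# Exit status: 0 = schedulerName is not set (the patch is needed),
#              1 = schedulerName is already set,
#              2 = the query itself failed (for example, oc is not logged in).
needs_scheduler_patch() {
  local name
  name="$(oc get etcdclusters wd-discovery-etcd \
    -o jsonpath='{.spec.schedulerName}')" || return 2
  [ -z "$name" ]
}

# Example (requires a logged-in oc session):
#   if needs_scheduler_patch; then echo "apply the schedulerName patch"; fi
```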
Refresh 8 of Cloud Pak for Data Version 4.0
Operand version: 4.0.8
- The wd-discovery-multi-tenant-migration job fails if anyone besides a system administrator performs the migration.
  - Error: When you upgrade to version 4.0.8 with a user ID other than admin, the migration job fails.
  - Cause: The migration script assumes that the script is run by a user with the admin user ID.
  - Solution: Apply a patch that allows the migration to be successful. Complete the following steps:
    - From the Cloud Pak for Data web client, get the user ID of the owner of the instance that you want to upgrade.
    - Download the wd-migration-uid-patch.zip patch file from the Watson Developer Cloud GitHub repository.
    - Extract the wd-migration-uid-patch.yaml file from the archive file, and then open it in a text editor.
    - Replace the <user_id> variable with the user ID of the owner of the instance that you want to upgrade.
    - Run the following command in a terminal that is logged in to the cluster:
      oc create -f wd-migration-uid-patch.yaml
    - Delete the previous migration job by using the following command:
      oc delete job wd-discovery-multi-tenant-migration
This issue exists in refresh versions 6 through 8.
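The placeholder replacement in the patch file can also be done from the command line instead of a text editor. The following is a minimal sketch: the function name fill_user_id, the output file name, and the sample ID are hypothetical, and the contents of the real patch file are not shown here. The sketch only assumes that the file contains the literal text <user_id> and that the user ID consists of letters, digits, underscores, or hyphens.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace every <user_id> placeholder read from stdin
# with the given ID and write the result to stdout.
fill_user_id() {
  case "$1" in
    ''|*[!0-9A-Za-z_-]*)
      # Reject empty IDs and characters that could break the sed expression.
      echo "usage: fill_user_id <id>" >&2
      return 1
      ;;
  esac
  sed "s/<user_id>/$1/g"
}

# Example (the ID and the output file name are placeholders):
#   fill_user_id 1000330999 < wd-migration-uid-patch.yaml > wd-migration-uid-patch.filled.yaml
#   oc create -f wd-migration-uid-patch.filled.yaml
```

Writing to a separate file keeps the downloaded patch unchanged, which makes it easier to repeat the step for another instance owner.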
- Discovery generates a partial failure status message for the Cloud Pak for Data OADP backup and restore utility.
  - Error: When you check the status of the OADP backup utility after using it to back up a cluster where Discovery is installed, a Phase: PartiallyFailed message is displayed. One or more Discovery components are included in the Failed list.
  - Cause: Discovery cannot be backed up and restored by using the OADP backup and restore utility. When the Discovery service is present and an administrator backs up an entire Cloud Pak for Data instance, a status message is displayed that indicates a partial failure. This status is displayed because the persistent volume claims (PVCs) for Discovery are not backed up. However, the message does not impact the backup of the rest of the services.
  - Solution: No action is required to resolve the status message. You can remove the persistent volume claims that are associated with the Discovery service separately. After using the scripts to back up your Discovery service data, you can follow the step that is documented in the uninstall instructions for the Discovery service to delete the PVCs. For more information about how to remove the PVCs that are associated with Discovery, see Uninstalling the Discovery service.
This issue exists in refresh versions 8 through 9.
Refresh 7 of Cloud Pak for Data Version 4.0
Operand version: 4.0.7
- The Deployed status of resources fluctuates after the 4.0.7 upgrade is completed.
  - Error: When you check the status by submitting the oc get WatsonDiscovery command, the ready status of the resources toggles between showing 23/23 and 20/23 components as being ready for use.
  - Cause: The readiness state of the resources is not reported consistently after a migration.
  - Solution: To manually refresh the status information, run the following commands in a terminal that is logged in to the cluster:
    # Creates a proxy server between localhost and the Kubernetes API server and runs in the background
    oc proxy &
    # Clear the status from the WatsonDiscovery Operand (<namespace> must be set to the namespace where Discovery is installed)
    curl -ksS -X PATCH -H "Accept: application/json, */*" \
      -H "Content-Type: application/merge-patch+json" \
      http://127.0.0.1:8001/apis/discovery.watson.ibm.com/v1/namespaces/<namespace>/watsondiscoveries/wd/status \
      --data '{"status": null}'
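To decide whether the status has settled, the fraction that the oc get WatsonDiscovery command reports can be compared programmatically. The following is a minimal sketch: the function name all_components_ready is hypothetical, and the sketch deliberately takes the fraction as an argument instead of parsing the command output, because the column layout of that output is not described here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only when a value such as "23/23" shows that
# every component is ready. Returns 1 for values such as "20/23" and 2 when
# the argument is not in the <ready>/<total> form.
all_components_ready() {
  case "$1" in
    */*) ;;
    *)
      echo "expected a value such as 23/23" >&2
      return 2
      ;;
  esac
  local ready="${1%%/*}" total="${1##*/}"
  [ -n "$ready" ] && [ "$ready" = "$total" ]
}

# Example:
#   if all_components_ready "20/23"; then echo "stable"; else echo "not settled"; fi
```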
- Discovery generates an error in the Cloud Pak for Data OADP backup and restore utility.
  - Error: The utility does not complete successfully and the following message is written to the log:
    preBackupViaConfigHookRule on backupconfig/watson-discovery in namespace cpd (status=error).
  - Cause: Discovery cannot be backed up and restored by using the OADP backup and restore utility. When the Discovery service is present and an administrator attempts to back up an entire Cloud Pak for Data instance, Discovery prevents the utility from completing successfully.
  - Solution: Apply a patch that stops Discovery from preventing the utility from completing successfully. To apply the patch, complete the following steps:
    - Download the wd-aux-br-patch.zip file from the Watson Developer Cloud GitHub repository.
    - Extract the wd-aux-br-patch.yaml file from the ZIP file.
    - Run the following command in a terminal that is logged in to the cluster:
      oc create -f wd-aux-br-patch.yaml
This issue exists in refresh versions 2 through 7.
- The wd-discovery-multi-tenant-migration job fails if anyone besides a system administrator performs the migration.
  - Error: When you upgrade to version 4.0.8 with a user ID other than admin, the migration job fails.
  - Cause: The migration script assumes that the script is run by a user with the admin user ID.
  - Solution: Apply a patch that allows the migration to be successful. Complete the following steps:
    - From the Cloud Pak for Data web client, get the user ID of the owner of the instance that you want to upgrade.
    - Download the wd-migration-uid-patch.zip patch file from the Watson Developer Cloud GitHub repository.
    - Extract the wd-migration-uid-patch.yaml file from the archive file, and then open it in a text editor.
    - Replace the <user_id> variable with the user ID of the owner of the instance that you want to upgrade.
    - Run the following command in a terminal that is logged in to the cluster:
      oc create -f wd-migration-uid-patch.yaml
    - Delete the previous migration job by using the following command:
      oc delete job wd-discovery-multi-tenant-migration
Refresh 6 of Cloud Pak for Data Version 4.0
Operand version: 4.0.6
- Upgrade to 4.0.6 fails if no Discovery instance is provisioned in the existing cluster before you begin the upgrade process.
  - Error: The 4.0.6 upgrade process assumes that a Watson Discovery instance is provisioned in the existing cluster. For example, if you are upgrading from 4.0.5 to 4.0.6, you must have an instance provisioned in the 4.0.5 cluster before you begin the migration.
  - Cause: The current code returns an error when no instance exists because it cannot find a document index to migrate.
  - Solution: Verify that an instance of Watson Discovery has been provisioned in the existing Cloud Pak for Data cluster before you start the upgrade to 4.0.6. If you tried to upgrade to 4.0.6, but no instances were provisioned and the migration failed, remove the existing installation and install 4.0.6 from scratch.
- The Deployed status of resources fluctuates after the 4.0.7 upgrade is completed.
  - Error: When you check the status by submitting the oc get WatsonDiscovery command, the ready status of the resources toggles between showing 23/23 and 20/23 components as being ready for use.
  - Cause: The readiness state of the resources is not reported consistently after a migration.
  - Solution: Typically, the instance is ready for use despite the ready state instability. The ready state settles after approximately 5 hours. You can wait for the readiness state to consistently show 23/23, or you can manually refresh the status information by running the following commands in a terminal that is logged in to the cluster:
    # Creates a proxy server between localhost and the Kubernetes API server and runs in the background
    oc proxy &
    # Clear the status from the WatsonDiscovery Operand (<namespace> must be set to the namespace where Discovery is installed)
    curl -ksS -X PATCH -H "Accept: application/json, */*" \
      -H "Content-Type: application/merge-patch+json" \
      http://127.0.0.1:8001/apis/discovery.watson.ibm.com/v1/namespaces/<namespace>/watsondiscoveries/wd/status \
      --data '{"status": null}'
- The wd-discovery-multi-tenant-migration job fails if anyone besides a system administrator performs the migration.
  - Error: When you upgrade to version 4.0.8 with a user ID other than admin, the migration job fails.
  - Cause: The migration script assumes that the script is run by a user with the admin user ID.
  - Solution: Apply a patch that allows the migration to be successful. Complete the following steps:
    - From the Cloud Pak for Data web client, get the user ID of the owner of the instance that you want to upgrade.
    - Download the wd-migration-uid-patch.zip patch file from the Watson Developer Cloud GitHub repository.
    - Extract the wd-migration-uid-patch.yaml file from the archive file, and then open it in a text editor.
    - Replace the <user_id> variable with the user ID of the owner of the instance that you want to upgrade.
    - Run the following command in a terminal that is logged in to the cluster:
      oc create -f wd-migration-uid-patch.yaml
    - Delete the previous migration job by using the following command:
      oc delete job wd-discovery-multi-tenant-migration
For more information about earlier releases, see Known issues in the product documentation.