Upgrading a Watson Query instance that uses the ocs-storagecluster-cephfs storage class delays the upgrade

Important: IBM Cloud Pak® for Data Version 4.8 will reach end of support (EOS) on 31 July, 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.

Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.8 reaches end of support. For more information, see Upgrading from IBM Cloud Pak for Data Version 4.8 to IBM Software Hub Version 5.1.

When you upgrade an instance of Watson Query that uses the ocs-storagecluster-cephfs storage class, the upgrade is delayed for an extended period.

Symptoms

The service instance upgrade might be delayed for a long time if the instance was provisioned with the ocs-storagecluster-cephfs storage class.
Note: You won't encounter this issue if you provision a Watson Query instance using the ocs-storagecluster-ceph-rbd storage class for Node storage.

To ensure that you are using the appropriate storage requirements and storage class types when provisioning future Watson Query instances, see Storage Requirements. For more information on creating a service instance, see Creating a service instance for Watson Query. For more information on upgrading the service instance, see Upgrading Watson Query.

Diagnosing the problem

This problem applies only if the service instance upgrade meets all three of the following conditions. Verify each condition before you attempt to resolve the problem. If any condition is not met, discontinue this task.
  1. Check whether the Watson Query instance is using the ocs-storagecluster-cephfs storage class by running the following command:
    oc get pvc | grep -i bigsql-c-db2u-dv-db2u | grep -i ocs-storagecluster-cephfs

    The command should return a result similar to the following example:

    [root@localhost ~]# oc get pvc | grep -i bigsql-c-db2u-dv-db2u | grep -i ocs-storagecluster-cephfs
    bigsql-c-db2u-dv-db2u-0                            Bound    pvc-bbb27dea-9487-4b59-ad4a-acc41256aa5c   50Gi       RWO            ocs-storagecluster-cephfs     66m
    bigsql-c-db2u-dv-db2u-1                            Bound    pvc-4ddf1b8e-9553-4897-800d-af333a6797e7   50Gi       RWO            ocs-storagecluster-cephfs     66m
    
    Note: If the command does not return any results, discontinue this task because the Watson Query instance is not using the ocs-storagecluster-cephfs storage class.
  2. Check whether the Watson Query instance upgrade time exceeds 45 minutes after you run the cpd-cli service-instance upgrade command.
  3. Check whether the following command takes more than one minute to return the successful connection message:
    oc exec -it c-db2u-dv-db2u-0 -- su - db2inst1 -c "time db2 connect to bigsql"

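    The command returns output similar to the following example. The real value is the elapsed connection time. In this example, the connection took about one second, which is well under the one-minute threshold: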

    [root@localhost ~]# oc exec -it c-db2u-dv-db2u-0 -- su - db2inst1 -c "time db2 connect to bigsql"
    Defaulted container "db2u" out of: db2u, instdb (init), init-labels (init), init-kernel (init)
    Database Connection Information
    Database server = DB2/LINUXX8664 11.5.8.0
    SQL authorization ID = DB2INST1
    Local database alias = BIGSQL
    real 0m1.054s
    user 0m0.009s
    sys 0m0.028s
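
The first and third checks can be partly scripted. The following is a minimal sketch, not part of the product: the helper names uses_cephfs, real_to_seconds, and connect_is_slow are hypothetical, and the sketch assumes that the storage class is the sixth column of the oc get pvc output and that the timing line has the form shown in the example above (real 0m1.054s).

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get pvc` output on stdin and succeeds if a
# bigsql-c-db2u-dv-db2u volume claim uses the ocs-storagecluster-cephfs class.
# Assumes the storage class is the sixth whitespace-separated column.
uses_cephfs() {
  awk '$1 ~ /bigsql-c-db2u-dv-db2u/ && $6 == "ocs-storagecluster-cephfs" { found = 1 }
       END { exit !found }'
}

# Hypothetical helper: converts a timing line such as "real 0m1.054s"
# into whole seconds (the fractional part is truncated).
real_to_seconds() {
  printf '%s\n' "$1" |
    awk '{ split($2, t, "m"); sub(/s$/, "", t[2]); printf "%d\n", t[1] * 60 + t[2] }'
}

# Hypothetical helper: succeeds if the elapsed time is longer than 60 seconds.
connect_is_slow() {
  [ "$(real_to_seconds "$1")" -gt 60 ]
}

# Example use against a live cluster (requires an authenticated oc session):
#   oc get pvc | uses_cephfs && echo "condition 1 is met"
#   out=$(oc exec c-db2u-dv-db2u-0 -- su - db2inst1 -c "time db2 connect to bigsql" 2>&1)
#   connect_is_slow "$(printf '%s\n' "$out" | grep '^real')" && echo "condition 3 is met"
```

The second condition (an upgrade that runs longer than 45 minutes) still has to be checked manually against the time at which you ran the cpd-cli service-instance upgrade command.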

Resolving the problem

Restart the Watson Query head and worker pods by completing the following steps:
  1. Scale down the head and worker pods by running the following command:
    oc scale sts c-db2u-dv-db2u --replicas=0

    Wait for Kubernetes to terminate all the c-db2u-dv-db2u-xxx pods.

  2. Scale the head and worker pods back up by running the following command:
    oc scale sts c-db2u-dv-db2u --replicas=2

After you restart the head and worker pods, the Watson Query upgrade should resume automatically.
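
The restart steps above can be combined into one script. The following is a sketch under stated assumptions: the function names count_sts_pods and restart_watson_query are hypothetical, the 10-second polling interval is arbitrary, and the default of 2 replicas is the value used in the steps above. It assumes an authenticated oc session in the project where Watson Query is installed.

```shell
#!/bin/sh
# Name of the Watson Query stateful set, as used in the steps above.
STS=c-db2u-dv-db2u

# Hypothetical helper: reads `oc get pods --no-headers` output on stdin and
# prints how many pod names start with "<stateful set name>-".
count_sts_pods() {
  awk -v p="$STS" 'index($1, p "-") == 1 { n++ } END { print n + 0 }'
}

# Hypothetical wrapper for the two scale commands. The optional first argument
# is the replica count to restore; it defaults to 2, as in the steps above.
restart_watson_query() {
  replicas="${1:-2}"
  oc scale sts "$STS" --replicas=0
  # Poll until Kubernetes has terminated every pod of the stateful set.
  while [ "$(oc get pods --no-headers | count_sts_pods)" -gt 0 ]; do
    sleep 10
  done
  oc scale sts "$STS" --replicas="$replicas"
}

# To run the restart, call the function:
#   restart_watson_query
```

Waiting for the pod count to reach zero before scaling back up mirrors the instruction to wait for all c-db2u-dv-db2u-xxx pods to terminate.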