Known issues for Db2 and Db2 Warehouse

The following limitations and known issues apply to the Db2® and Db2 Warehouse services.

Upgrade is stuck at UPGRADE_IN_PROGRESS

If the Db2 or Db2 Warehouse upgrade procedure is stuck at UPGRADE_IN_PROGRESS but the Db2uCluster custom resource (CR) reports a Ready state, perform the following workaround.

Symptoms
Find the deployment ID of the Db2 or Db2 Warehouse environment by navigating to the Details section of the deployment on the Databases page.
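Alternatively, the deployment ID also appears in the name of the Db2uCluster CR; the following is a minimal sketch of listing the CRs from the command line, assuming kubectl access to the Cloud Pak for Data instance project:
kubectl get db2ucluster -n ${PROJECT_CPD_INSTANCE}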
Run the following command to inspect the upgrade logs for the selected database:
kubectl logs -n ${PROJECT_CPD_INSTANCE} -c detect-vrmf-change c-${DEPLOYMENT_ID}-11.5.7.0-cn7-to-11.5.8.0-cn2-xxxxx
If you see a log similar to the following example, perform the workaround procedure:
grep: /db2u/tmp/env: No such file or directory
grep: /db2u/tmp/env: No such file or directory
awk: fatal: cannot open file `/.metadata/BASE_DB2_ENGINE/spec' for reading (No such file or directory)
Current Version: 11.5.7.0
New Version:
Workaround
  1. Run the following command inside the db2u-0 pod to prepare the database environment for the upgrade:
    kubectl exec -it -n ${PROJECT_CPD_INSTANCE} c-${DEPLOYMENT_ID}-db2u-0 -- bash -c 'sudo /db2u/scripts/detect_db2_vrmf_change.sh -file'
    
  2. Run the following command to run the upgrade for the Db2 service:
    kubectl exec -it -n ${PROJECT_CPD_INSTANCE} c-${DEPLOYMENT_ID}-db2u-0 -- /db2u/db2u_engn_update_or_upgrade_job.sh
    
Verification
Run the following command to retrieve the currently installed version of Db2 and the new version:
kubectl exec -it -n ${PROJECT_CPD_INSTANCE} c-${DEPLOYMENT_ID}-db2u-0 -- bash -c 'sudo /db2u/scripts/detect_db2_vrmf_change.sh -file'
Ensure that the Current Version matches the New Version in the output of the command, as in the following example:
Current Version: 11.5.8.0
New Version: 11.5.8.0

Db2 load utility failure after backup and restore procedure

Users might be unable to run the Db2 load utility after a backup and restore procedure because of changes in user permissions. The following workaround corrects the ownership and permission issues.

Workaround
  1. Run the exec command to access the head node of your instance:
    oc exec -it c-<deployment_id>-db2u-0 -- bash
  2. Run the following command to remove the load/admin directory:
    rm -r /mnt/blumeta0/db2/load/admin
  3. Run the following command to assign the correct owner of the Db2 load utility:
    sudo chown db2uadm:db2iadm1 /mnt/blumeta0/db2/load
  4. Run the following command to set the setgid bit on the directory so that the Db2 load utility can write to it:
    sudo chmod g+s /mnt/blumeta0/db2/load
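To confirm that the ownership and setgid bit were applied, you can list the directory; a minimal check, with the expected owner and group taken from the steps above:
ls -ld /mnt/blumeta0/db2/load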

Connect statement on Db2 Warehouse MPP hangs after manage_snapshots --action suspend

After running the manage_snapshots --action suspend command to suspend Db2 write operations, the db2_all db2 connect to dbname or manage_snapshots --action resume commands might hang.

The manage_snapshots --action suspend and manage_snapshots --action resume commands can be executed explicitly while performing a snapshot backup with Db2 Warehouse container commands or as part of Backing up and restoring an entire deployment.

The db2_all db2 connect to dbname command is executed in the manage_snapshots --action resume script.
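For reference, the suspend action is invoked in the same way as the resume command shown in the symptoms; a minimal sketch, with an illustrative pod name:
oc exec -it c-<deployment_id>-db2u-0 -- manage_snapshots --action suspend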

Symptoms

The manage_snapshots --action resume command hangs at connect to BLUDB:

oc exec -it c-db2wh-crd-mpp-2x2-separate-db2u-0 -- manage_snapshots --action resume
Defaulted container "db2u" out of: db2u, init-labels (init), init-kernel (init)

connect to BLUDB
Workaround
  1. Locate the catalog node pod:
    oc get po -l name=dashmpp-head-0
  2. Run the exec command as the instance user (db2inst1) to access an interactive shell inside the container:
    oc exec -it c-db2wh-1639609960917075-db2u-0 -- su - db2inst1
  3. Issue a db2 connect command to the database:
    db2 connect to BLUDB
  4. If the command hangs, repeat steps 1-3 in another terminal. When the connect command is successful, issue the manage_snapshots --action resume command:
    manage_snapshots --action resume
  5. When this command completes, the other hanging connections should be resolved.

Db2 Warehouse MPP pod stuck in initializing state during upgrade on Linux on IBM Z

When you upgrade the Db2 Warehouse service in a massively parallel processing (MPP) configuration on Linux® on IBM® Z from Cloud Pak for Data 3.5.x to 4.0.x, or from 3.5.x to 4.5.x, one of the pods might remain in the initializing state, preventing the upgrade from completing.

The following example shows two pods, one in Running state and the other stuck in initializing (Init) state. This is the typical scenario for this issue.

c-db2wh-1631829603234900-db2u-0                                   1/1     Running     1          18h
c-db2wh-1631829603234900-db2u-1                                   0/1     Init:0/2    103        17h

To fix the problem, run the following command to delete the pod that is in Running state:

oc delete pod c-db2wh-1631829603234900-db2u-0

After you run the command, the upgrade process continues and completes, as the following example shows:

c-db2wh-1631829603234900-11.5.5.1-cn3-to-11.5.6.0-cn3-6mnsv       0/1     Completed   0          5m35s
c-db2wh-1631829603234900-db2u-0                                   1/1     Running     0          7m42s
c-db2wh-1631829603234900-db2u-1                                   1/1     Running     0          19h

Issues when creating a Db2 connection with Cloud Pak for Data credentials

When you create a Db2 connection in the web console, an error can occur if you check the Cloud Pak for Data authentication box. To work around this issue, enter your Cloud Pak for Data credentials in the User name and Password fields, and do not check the Cloud Pak for Data authentication box.

Db2 Warehouse and Db2 pods not redeployed after node shutdown

When a Red Hat® OpenShift® node enters an Unreachable state, Kubernetes evicts all pods on that node except for pods that are controlled by StatefulSets. For those StatefulSet-controlled pods, Kubernetes waits for up to five minutes by default (defined by the pod-eviction-timeout option of the Kubernetes controller manager) before evicting them, so that the node has a chance to recover.
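If the node does not recover, one common manual intervention (an assumption here, not taken from this document) is to confirm the node state and then force-delete the stuck pod so that the StatefulSet controller can reschedule it on a healthy node; the pod name is illustrative:
oc get nodes
oc delete pod c-<deployment_id>-db2u-0 --force --grace-period=0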

Db2 post restore hook fails during restore operation 1

Symptoms
The backup log indicates the following message:
...
time=2022-06-06T11:00:28.035568Z level=info msg=   status: partially_succeeded
time=2022-06-06T11:00:28.035572Z level=info msg=   nOpResults: 70
time=2022-06-06T11:00:28.035585Z level=info msg=   postRestoreViaConfigHookRule on restoreconfig/analyticsengine-br in namespace wkc (status=succeeded)
time=2022-06-06T11:00:28.035589Z level=info msg=   postRestoreViaConfigHookRule on restoreconfig/lite in namespace wkc (status=succeeded)
time=2022-06-06T11:00:28.035593Z level=info msg=   postRestoreViaConfigHookRule on restoreconfig/db2u in namespace wkc (status=error)
...
time=2022-06-06T11:00:28.035601Z level=info msg=   postRestoreViaConfigHookJob on restoreconfig/wkc in namespace wkc (status=timedout)
...
Either db2u pod c-db2oltp-iis-db2u or c-db2oltp-wkc-db2u does not progress beyond:
....
+ db2licm_cmd=/mnt/blumeta0/home/db2inst1/sqllib/adm/db2licm
+ /mnt/blumeta0/home/db2inst1/sqllib/adm/db2licm -a /db2u/license/db2u-lic
Resolution

Delete the affected db2u pods and then check that the pods are up and running.

oc get pod | grep -E "c-db2oltp-iis-db2u|c-db2oltp-wkc-db2u"
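To delete an affected pod, a minimal sketch, where <pod-name> is one of the pod names returned by the previous command:
oc delete pod <pod-name>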

Run the post restore hook again.

cpd-cli oadp restore posthooks --include-namespaces wkc --log-level=debug --verbose

Db2 post restore hook fails during restore operation 2

Symptoms
The restore log indicates the following message:
...
* ERROR: Database could not be activated
Failed to restart write resume and/or active database
...
Resolution

Delete the affected db2u pods and then check that the pods are up and running.

oc get pod | grep -E "c-db2oltp-iis-db2u|c-db2oltp-wkc-db2u"

Run the post restore hook again.

cpd-cli oadp restore posthooks --include-namespaces wkc --log-level=debug --verbose

Db2u-operator pod errors when you upgrade from 3.5.x to 4.5.x or from 4.0.x to 4.5.x (specialized)

During the upgrade process from 3.5.x to 4.5.x or from 4.0.x to 4.5.x (specialized) by using cpd-cli, missing permissions or version errors might be returned. For more information, see Db2u-operator pod errors when you upgrade from 3.5.x to 4.5.x or from 4.0.x to 4.5.x (specialized) in Db2.

Add "NO CACHE ORDER" when creating IBMQREP_APPLYCMD control tables

Problem
In a Cloud Pak for Data version 4.6 environment, the Q Apply program fails to start with the error message: Error attempting to stop apply queue.
Cause
The NO CACHE ORDER clause is missing from the identity column definition in the IBMQREP_APPLYCMD control table.
Solution
Add the required INCREMENT BY 1 NO CACHE ORDER clause to the identity column definition in the IBMQREP_APPLYCMD control table. Example:
CREATE TABLE "QASN1".IBMQREP_APPLYCMD                             
 ( CMD_ID              BIGINT       NOT NULL                      
      GENERATED ALWAYS AS IDENTITY                                
      (START WITH 1,  INCREMENT BY 1 NO CACHE ORDER ),              
   CMD_INPUT_TIME      TIMESTAMP(0) NOT NULL WITH DEFAULT  ,      
   CMD_ORIGINATOR      VARCHAR(30)  NOT NULL WITH DEFAULT 'USER', 
   CMD_TEXT            VARCHAR(1024)    NOT NULL,                 
   CMD_STATE           CHAR(1)      NOT NULL WITH DEFAULT 'P',    
   CMD_RESULT_MSG      VARCHAR(1024)   ,                          
   CMD_STATE_TIME      TIMESTAMP(6) NOT NULL WITH DEFAULT  ,      
   CMD_RESULT_RC       INTEGER                                    
 ) VOLATILE CARDINALITY  IN AASN1DB.APPTSR;
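If the control table already exists, it might be possible to change the identity column in place instead of re-creating the table; the following is a hedged sketch using standard Db2 identity-column alteration syntax, not verified against the Q Replication tooling:
ALTER TABLE "QASN1".IBMQREP_APPLYCMD
  ALTER COLUMN CMD_ID SET NO CACHE SET ORDER;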

Replication set creation fails with a SQL0204N error

Applies to: 4.6

Problem
The creation of a replication set fails with a red icon and an Error status, and the server log files in /mnt/blumeta0/bludr/logs/asnclp show a SQL0204N error.
ALTER TABLE "GDM_STG "."ACCRL_DEFERL" ADD CONSTRAINT "ACCRL_DEFERL_CNTRL_FK" FOREIGN KEY ("RUN_ID", "RUN_TYP_CD", "GEO_CD") REFERENCES "GDM_STG "."ACCRL_DEFERL_CNTRL" ("RUN_ID", "RUN_TYP_CD", "GEO_CD") ON DELETE RESTRICT ON UPDATE NO ACTION NOT ENFORCED TRUSTED ENABLE QUERY OPTIMIZATION
DB21034E  The command was processed as an SQL statement because it was not a 
valid Command Line Processor command.  During SQL processing it returned:
SQL0204N  "GDM_STG.ACCRL_DEFERL_CNTRL" is an undefined name.  SQLSTATE=42704
Cause
If the tables have a foreign key relationship and the parent/child relationship is not detected, the child table might be created on the target side before its parent table, which causes the error.
Solution
When creating the replication set, select parent tables first. Do not select child tables. After the replication set is created, edit the replication set and select the child tables to create the table subscriptions.
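To identify which tables are parents and which are children before you build the replication set, you can query the Db2 catalog on the source database; a minimal sketch, assuming the GDM_STG schema from the example above:
SELECT TABSCHEMA, TABNAME, REFTABSCHEMA AS PARENT_SCHEMA, REFTABNAME AS PARENT_TABLE
  FROM SYSCAT.REFERENCES
  WHERE TABSCHEMA = 'GDM_STG'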

The qrep-containers do not work for non-zen namespaces

Applies to: 4.6.2 and later

Problem
The qrep-containers are failing for namespaces other than zen.
Cause
The server.env variables DR_ENV and DR_ENV_DB2U are incorrectly being set to standalone because the QREP containers cannot detect Cloud Pak for Data deployments that do not use the default zen namespace.
Solution
For each QREP container, enter the following commands to manually update the server.env variables DR_ENV, DR_ENV_DB2U, and LOCAL_DR_ENV_TYPE to the correct environment type:
oc exec -it <qrep-container> -- bash
DR_ENV=DB2U-<database_type>
sed -i --follow-symlinks '/DR_ENV/d' ${BLUDR_WLP_INSTALL_PATH}/wlp/usr/servers/bludr/server.env
echo "DR_ENV=$DR_ENV" >> ${BLUDR_WLP_INSTALL_PATH}/wlp/usr/servers/bludr/server.env
sed -i --follow-symlinks '/DR_ENV_DB2U/d' ${BLUDR_WLP_INSTALL_PATH}/wlp/usr/servers/bludr/server.env
echo "DR_ENV_DB2U=$DR_ENV" >> ${BLUDR_WLP_INSTALL_PATH}/wlp/usr/servers/bludr/server.env
sed -i --follow-symlinks '/LOCAL_DR_ENV_TYPE/d' ${BLUDR_WLP_INSTALL_PATH}/wlp/usr/servers/bludr/server.env
echo "LOCAL_DR_ENV_TYPE=$DR_ENV" >> ${BLUDR_WLP_INSTALL_PATH}/wlp/usr/servers/bludr/server.env
db2 connect to bludb user qrepdbadm using $(cat /secrets/qrepdbadmpwd/password )
db2 "update ASN.IBMQREP_RESTAPI_PROPERTIES SET PROP_VALUE_CHAR = 'DB2U-<database_type>' WHERE PROP_KEY='LOCAL_DR_ENV_TYPE'"
/opt/ibm/bludr/scripts/bin/bludr-restart.sh

where DB2U-<database_type> is your database type, for example, DB2U-DB2OLTP or DB2U-DB2WH.
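To confirm that the update took effect, you can inspect server.env inside the container; a minimal check using the same path as the commands above:
grep -E 'DR_ENV|DR_ENV_DB2U|LOCAL_DR_ENV_TYPE' ${BLUDR_WLP_INSTALL_PATH}/wlp/usr/servers/bludr/server.env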