Upgrading Data Virtualization from Version 4.0 to Version 4.5

Important: IBM® Cloud Pak for Data Version 4.5 will reach end of support (EOS) on 31 July 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.

Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.5 reaches end of support. For more information, see Upgrading IBM Software Hub in the IBM Software Hub Version 5.1 documentation.

A project administrator can upgrade Data Virtualization from Version 4.0 to Version 4.5.

Important: To complete this task, you must be running Data Virtualization Version 1.7.2 or later. (Version 1.7.2 was released with Refresh 2 of Cloud Pak for Data Version 4.0.)
What permissions do you need to complete this task?
The permissions that you need depend on which tasks you must complete:
  • To update the Data Virtualization operators, you must have the appropriate permissions to create operators and you must be an administrator of the project where the Cloud Pak for Data operators are installed. This project is identified by the ${PROJECT_CPD_OPS} environment variable.
  • To upgrade Data Virtualization, you must be an administrator of the project where Data Virtualization is installed. This project is identified by the ${PROJECT_CPD_INSTANCE} environment variable.
When do you need to complete this task?
If you didn't upgrade Data Virtualization when you upgraded the platform, you can complete this task to upgrade your existing Data Virtualization installation.

If you want to upgrade all of the Cloud Pak for Data components at the same time, follow the process in Upgrading the platform and services instead.

Important: All of the Cloud Pak for Data components in a deployment must be installed at the same release.

Information you need to complete this task

Review the following information before you upgrade Data Virtualization:

Environment variables
The commands in this task use environment variables so that you can run the commands exactly as written.
  • If you don't have the script that defines the environment variables, see Setting up installation environment variables.
  • To use the environment variables from the script, you must source the environment variables before you run the commands in this task, for example:
    source ./cpd_vars.sh
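  • If you are writing the script yourself, the following minimal sketch shows the variables that the commands in this task rely on. Every value is a placeholder; use the values for your environment, and see Setting up installation environment variables for the authoritative template.
    #!/usr/bin/env bash
    # Placeholder values -- replace each one with the value for your environment.
    export OCP_URL=<openshift-api-server-url>
    export OCP_USERNAME=<openshift-username>
    export OCP_PASSWORD=<openshift-password>
    export PROJECT_CPD_OPS=<operators-project>          # Project where the Cloud Pak for Data operators are installed
    export PROJECT_CPFS_OPS=ibm-common-services         # Project where the foundational services operators are installed
    export PROJECT_CPD_INSTANCE=<cpd-instance-project>  # Project where the control plane and Data Virtualization are installed
    export VERSION=<4.5.x-release>                      # Cloud Pak for Data release that you are upgrading to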
Installation location
Data Virtualization is installed in the same project (namespace) as the Cloud Pak for Data control plane. This project is identified by the ${PROJECT_CPD_INSTANCE} environment variable.
Common core services
Data Virtualization requires the Cloud Pak for Data common core services.

If the common core services are not at the required version for the release, the common core services will be automatically upgraded when you upgrade Data Virtualization. This increases the amount of time the upgrade takes to complete.

Storage requirements
You don't need to specify storage when you upgrade Data Virtualization.

Before you begin

This task assumes that the following prerequisites are met:

  • The cluster meets the minimum requirements for Data Virtualization. If this task is not complete, see System requirements.
  • The workstation from which you will run the upgrade is set up as a client workstation and includes the following command-line interfaces:
      • Cloud Pak for Data CLI: cpd-cli
      • OpenShift® CLI: oc
    If this task is not complete, see Setting up a client workstation.
  • The Cloud Pak for Data control plane is upgraded. If this task is not complete, see Upgrading the platform and services.
  • For environments that use a private container registry, such as air-gapped environments, the Data Virtualization software images are mirrored to the private container registry. If this task is not complete, see Mirroring images to a private container registry.
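
To confirm that the command-line interfaces are available on the workstation, you can check their versions. For example (output format varies by release):

    cpd-cli version
    oc version --client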

Prerequisite services

Before you upgrade Data Virtualization, ensure that the following services are upgraded and running:

  • Db2® Data Management Console: Make sure that a Db2 Data Management Console instance has been upgraded and provisioned. For more information, see Upgrading Db2 Data Management Console.
  • Cloud Pak for Data common core services: If the common core services are not installed on your Cloud Pak for Data cluster, Data Virtualization installs them. If they are already installed, Data Virtualization upgrades them to the required version.

Procedure

Complete the following tasks to upgrade Data Virtualization:

  1. Logging in to the cluster.
  2. Updating the operator.
  3. Upgrading the service.
  4. Validating the upgrade.
  5. Completing additional steps in specialized installations.
  6. Upgrading the service instance.
  7. Upgrading remote connectors.
  8. What to do next.

Logging in to the cluster

To run cpd-cli manage commands, you must log in to the cluster.

To log in to the cluster:

  1. Run the cpd-cli manage login-to-ocp command to log in to the cluster as a user with sufficient permissions to complete this task. For example:
    cpd-cli manage login-to-ocp \
    --username=${OCP_USERNAME} \
    --password=${OCP_PASSWORD} \
    --server=${OCP_URL}
    Tip: The login-to-ocp command takes the same input as the oc login command. Run oc login --help for details.
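    To confirm that the login succeeded, you can check your identity and current project:
    oc whoami
    oc project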

Updating the operator

The Data Virtualization operator simplifies the process of managing the Data Virtualization service on Red Hat® OpenShift Container Platform.

To upgrade Data Virtualization, ensure that all of the Operator Lifecycle Manager (OLM) objects in the ${PROJECT_CPD_OPS} project, such as the catalog sources and subscriptions, are upgraded to the appropriate release. All of the OLM objects must be at the same release.

Who needs to complete this task?
You must be a cluster administrator (or a user with the appropriate permissions to install operators) to create the OLM objects.
When do you need to complete this task?
Complete this task only if the OLM artifacts have not been updated for the current release using the cpd-cli manage apply-olm command with the --upgrade=true option.

You do not need to run this command separately for each service that you plan to upgrade. If you complete this task and the OLM artifacts already exist on the cluster, the cpd-cli recreates the OLM objects for all of the existing components in the ${PROJECT_CPD_OPS} project.

To update the operator:

  1. Update the OLM objects:
    cpd-cli manage apply-olm \
    --release=${VERSION} \
    --cpd_operator_ns=${PROJECT_CPD_OPS} \
    --upgrade=true
    • If the command succeeds, it returns [SUCCESS]... The apply-olm command ran successfully.
    • If the command fails, it returns [ERROR] and includes information about the cause of the failure.
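
If you want to inspect the results, you can list the OLM objects that the command manages in the operators project. For example:

    oc -n ${PROJECT_CPD_OPS} get subscriptions.operators.coreos.com
    oc -n ${PROJECT_CPD_OPS} get clusterserviceversions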

What to do next: Upgrade the Data Virtualization service.

Upgrading the service

After the Data Virtualization operator is updated, you can upgrade Data Virtualization.

Who needs to complete this task?
You must be an administrator of the project where Data Virtualization is installed.
When do you need to complete this task?
Complete this task for each instance of Data Virtualization that is associated with an instance of Cloud Pak for Data Version 4.5.

To upgrade the service:

  1. Update the custom resource for Data Virtualization.
    cpd-cli manage apply-cr \
    --components=dv \
    --release=${VERSION} \
    --cpd_instance_ns=${PROJECT_CPD_INSTANCE} \
    --license_acceptance=true \
    --upgrade=true
  2. 4.5.2 This step is required only if you are upgrading to IBM Cloud Pak for Data 4.5.2 or earlier.

    Restart the zen-watcher pod to ensure that the DvService metadata updates are processed.

    oc delete pod -n ${PROJECT_CPD_INSTANCE} -l component=zen-watcher
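
    After the pod is deleted, you can optionally watch for the replacement zen-watcher pod to reach the Running state:

    oc get pods -n ${PROJECT_CPD_INSTANCE} -l component=zen-watcher -w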

Validating the upgrade

Data Virtualization is upgraded when the apply-cr command returns [SUCCESS]... The apply-cr command ran successfully.

However, you can optionally run the cpd-cli manage get-cr-status command if you want to confirm that the custom resource status is Completed:

cpd-cli manage get-cr-status \
--cpd_instance_ns=${PROJECT_CPD_INSTANCE} \
--components=dv
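
If you prefer to inspect the custom resource directly with oc, you can list the DvService objects in the instance project. This is only a spot check; the apply-cr and get-cr-status commands remain the authoritative signals:

oc -n ${PROJECT_CPD_INSTANCE} get DvService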

Completing additional steps in specialized installations

The following additional steps are required when you upgrade a specialized 4.0.x installation of Data Virtualization.

4.5.0 These steps are required only if you are upgrading to IBM Cloud Pak for Data 4.5.0.

For specialized installations in Cloud Pak for Data 4.5, the new Db2U operator resides in the ibm-common-services project (namespace), which is identified by the ${PROJECT_CPFS_OPS} environment variable. As a result, you must confirm that the old Db2U operator no longer exists in the Cloud Pak for Data operators project, which is identified by the ${PROJECT_CPD_OPS} environment variable.

You do not need to complete the following steps for express installations, where all operators already exist in the ibm-common-services project.

  1. Run the following command to confirm that the db2u-operator is running in the ibm-common-services namespace.
    oc -n ${PROJECT_CPFS_OPS} get po | grep db2u
  2. Check whether the old v1 db2u-operator is still present in the Cloud Pak for Data operators project.
    oc -n ${PROJECT_CPD_OPS} get csv | grep db2u-operator.v1
    oc -n ${PROJECT_CPD_OPS} get po | grep db2u-operator
    oc -n ${PROJECT_CPD_OPS} get sub | grep db2u-operator
  3. If any of these old resources are present, run the following commands to remove them.
    oc get ClusterServiceVersion -n ${PROJECT_CPD_OPS} | grep db2u-operator.v1
    ## Note the CSV name returned and then issue a delete on this CSV resource:
    oc -n ${PROJECT_CPD_OPS} delete clusterserviceversion <CSV name>
    oc -n ${PROJECT_CPD_OPS} delete Subscription ibm-db2u-operator
    Important: Make sure that you delete the old v1 db2u-operator and not the new one.
  4. Run the following command to restart the new db2u-operator.
    oc -n ${PROJECT_CPD_OPS} delete po -l control-plane=db2u-operator-manager
  5. To troubleshoot issues, see Db2u-operator pod errors when you upgrade from 3.5.x to 4.5.0 or from 4.0.x to 4.5.0.
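
After the restart in step 4, you can watch for the replacement operator pod to reach the Running state before you continue:

oc -n ${PROJECT_CPD_OPS} get po -l control-plane=db2u-operator-manager -w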

Upgrading the service instance

After you upgrade Data Virtualization, the service instance that is associated with the installation must also be upgraded. This task must be completed by a Cloud Pak for Data administrator or a service instance administrator.

  1. Change to the project where Data Virtualization pods are installed.
    oc project ${PROJECT_CPD_INSTANCE}
  2. Refresh the SSL certificate if it has expired. For more information, see Refreshing the SSL certificate used by Data Virtualization after the Cloud Pak for Data self-signed certificate is updated.
  3. Check whether archive logs and Db2 diagnostic logs are filling up the Data Virtualization head pod persistent volume. Delete old archive logs if necessary. For more information, see Persistent volume on Data Virtualization head pod becomes full.

    Run the following commands to determine whether Db2 diagnostic logs are filling up the persistent volume, and delete old logs as needed.

    a. oc -n ${PROJECT_CPD_INSTANCE} rsh c-db2u-dv-db2u-0 bash
    b. su - db2inst1
    c. du -hs /mnt/blumeta0/db2/databases/db2inst1/NODE0000
    d. du -hs $DIAGPATH/NODE0000

    If the result of step 3.c indicates that the NODE0000 directory is large, review its contents to check whether there are old core dump files that you can delete. The /mnt directory must have more free space than the size of the database that is shown in step 3.c.

  4. Upgrade the instance by updating its version field:
    oc patch bigsql db2u-dv --patch '{"spec": {"version": "1.8.3"}}' --type=merge

    This step triggers the instance upgrade, which will take some time to complete. First, the upgrade runs a backup. If this backup fails, the version in .spec.version is reset to the previous version. This process can be monitored in the logs of the Db2U operator manager pod. The value in .status.version is not updated to match .spec.version until the upgrade process has completed successfully.

  5. Run the following command to verify that the version now reads 1.8.3.
    oc get bigsql db2u-dv -o jsonpath='{.status.version}{"\n"}'

    You must wait until the instance upgrade completes before you proceed to the next step. For a command-line way to wait, see the polling sketch after this procedure.

  6. 4.5.1 This step is required only if you are upgrading to IBM Cloud Pak for Data 4.5.1. Set the Db2 registry variable DB2_EXTENDED_OPTIMIZATION to BI_INFER_CC ON,NICKNAME_PARALLEL -100 and restart Big SQL.
    oc rsh c-db2u-dv-db2u-0 bash
    su - db2inst1
    db2set -im DB2_EXTENDED_OPTIMIZATION="BI_INFER_CC ON,NICKNAME_PARALLEL -100"
    bigsql stop
    bigsql start
  7. 4.5.2 These steps are required only if you are upgrading to IBM Cloud Pak for Data 4.5.2 or earlier. Update the Data Virtualization caching pods.
    1. Edit the c-db2u-dv-dvcaching deployment.
      oc edit deployment c-db2u-dv-dvcaching
    2. In the initContainers/env section, search for the following entries, remove the keys and corresponding values, and then save and exit.
      • - name: DB2U_API_KEY_FILE
      • - name: DB2U_API_CERT_FILE
    3. Wait for the c-db2u-dv-dvcaching pod to restart. The pod changes from 0/1 Init to 0/1 Running and eventually to 1/1 Running.
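
If you want to wait for the instance upgrade in step 5 from the command line instead of re-running the check manually, the following minimal sketch polls the status version until it matches the target from step 4. If the initial backup fails, .spec.version is reset to the previous version and this loop does not exit, so also monitor the Db2U operator manager pod logs as described in step 4:

    # Poll .status.version until it reports the upgraded version.
    target=1.8.3
    while [ "$(oc -n ${PROJECT_CPD_INSTANCE} get bigsql db2u-dv -o jsonpath='{.status.version}')" != "$target" ]; do
        echo "Instance upgrade in progress; checking again in 60 seconds..."
        sleep 60
    done
    echo "Instance upgrade to $target is complete."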

Upgrading remote connectors

You can upgrade remote connectors by using the UPDATEREMOTECONNECTOR stored procedure. You can run this procedure by using the SQL editor or the Db2 console on the cluster.

  • To update all remote connectors, run the following stored procedure.
    call dvsys.updateremoteconnector('',?,?)
  • If you need to upgrade a set of remote connectors, pass in a comma-separated list.
    call dvsys.updateremoteconnector('<REMOTE_CONNECTOR_NODES>',?,?)

    You can obtain the <REMOTE_CONNECTOR_NODES> by running the following command.

    select node_name from dvsys.listnodes where AGENT_CLASS='R'
  • If you notice that remote connectors do not appear in the user interface after the upgrade, run the following stored procedure on the head pod.
    CALL DVSYS.DEFINEGATEWAYS('<hostname>:<port>')

    Where <hostname> is the hostname of the remote connector and <port> is the port number used by the remote connector to connect to Data Virtualization. After you run this stored procedure, the remote connector appears in the user interface and when you run dvsys.listnodes.

    See also Defining gateway configuration to access isolated remote connectors.

  • To troubleshoot issues, see Updating remote connectors might fail with a Java™ exception after you upgrade Data Virtualization.
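
If the SQL editor and the Db2 console are not available to you, one way to run the stored procedure is from the Data Virtualization head pod with the Db2 command line processor, following the same access pattern that is used elsewhere in this task. A minimal sketch:

    # Open a shell on the head pod and switch to the Db2 instance owner.
    oc rsh c-db2u-dv-db2u-0 bash
    su - db2inst1
    # Connect to the database and update all remote connectors.
    db2 connect to bigsql
    db2 "call dvsys.updateremoteconnector('',?,?)"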

What to do next

  1. Run the following steps from the Data Virtualization head pod to manually remove unnecessary files.
    1. Log in to the Data Virtualization head pod.
      oc rsh c-db2u-dv-db2u-0 bash
    2. Switch to the db2inst1 user.
      su - db2inst1
    3. Remove unnecessary JAR files.
      rm -rf \
      /mnt/blumeta0/home/db2inst1/sqllib/datavirtualization/dvm_driver/log4j-core-2.8.2.jar \
      /mnt/blumeta0/home/db2inst1/sqllib/datavirtualization/dvm_driver/log4j-api-2.8.2.jar \
      /mnt/bludata0/dv/versioned/pre_migration/sqllib/datavirtualization/dvm_driver/log4j-api-2.8.2.jar \
      /mnt/bludata0/dv/versioned/pre_migration/sqllib/datavirtualization/dvm_driver/log4j-core-2.8.2.jar
    4. Run the following script to remove the log4j-core-2.8.2.jar file.
      ${BIGSQL_CLI_DIR}/BIGSQL/package/scripts/bigsqlPexec.sh -w -c "rm -rf /mnt/blumeta0/home/db2inst1/sqllib/datavirtualization/dvm_driver/log4j-core-2.8.2.jar"
    5. Run the following script to remove the log4j-api-2.8.2.jar file.
      ${BIGSQL_CLI_DIR}/BIGSQL/package/scripts/bigsqlPexec.sh -w -c "rm -rf /mnt/blumeta0/home/db2inst1/sqllib/datavirtualization/dvm_driver/log4j-api-2.8.2.jar"
    6. Remove unnecessary .zip and .tar files.
      rm -rf \
      /mnt/PV/versioned/uc_dsserver_shared/config/DATAVIRTUALIZATION_ENDPOINT_V1.7*.tar.gz \
      /mnt/PV/versioned/uc_dsserver_shared/config/DATAVIRTUALIZATION_ENDPOINT_V1.7*.zip
    7. Copy the .tar file for this version.
      cp /opt/ibm/qp_artifacts/archives/DATAVIRTUALIZATION_ENDPOINT_V1.8.3_*.tar.gz /mnt/PV/versioned/uc_dsserver_shared/config
    8. Copy the .zip file for this version.
      cp /opt/ibm/qp_artifacts/archives/DATAVIRTUALIZATION_ENDPOINT_V1.8.3_*.zip /mnt/PV/versioned/uc_dsserver_shared/config
  2. Complete the following steps to restart head and worker pods.
    1. Wait for the Data Virtualization hurricane pod to start successfully.
    2. Run the following commands to restart the Data Virtualization head and worker pods (a readiness check follows these steps):
      current_replicas=$(oc -n ${PROJECT_CPD_INSTANCE} get sts c-db2u-dv-db2u -o jsonpath="{.spec.replicas}")
      oc -n ${PROJECT_CPD_INSTANCE} scale sts c-db2u-dv-db2u --replicas=0; sleep 3m; oc -n ${PROJECT_CPD_INSTANCE} scale sts c-db2u-dv-db2u --replicas=$current_replicas
  3. 4.5.0 This step is required only if you are upgrading to IBM Cloud Pak for Data 4.5.0. Complete one of the following options to ensure that tables and schemas appear correctly.
    • Run the following commands on the head pod.
      oc exec -ti c-db2u-dv-db2u-0 -- bash
      su db2inst1
      db2 connect to bigsql
      db2 "alter server QPLEX options ( add  varchar_no_trailing_blanks 'Y');"
    • Run the following command from the SQL editor by navigating to Run SQL on the service menu.
      alter server QPLEX options (add varchar_no_trailing_blanks 'Y');
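
After you scale the head and worker pods back up in step 2, you can optionally watch the stateful set until all replicas report ready before you use the service:

    oc -n ${PROJECT_CPD_INSTANCE} get sts c-db2u-dv-db2u -w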

Data Virtualization is ready to use. For more information, see Virtualizing data with Data Virtualization.