Upgrading Data Virtualization from Version 5.2 to Version 5.4

An instance administrator can upgrade Data Virtualization from Version 5.2 to Version 5.4.

Attention: If your existing Data Virtualization instance uses custom sizing, then upgrading your Data Virtualization instance automatically adds five agents, each requiring two CPUs. The increased resource usage is typically balanced if your custom cluster was deployed with sufficient resources to accommodate the extra load without dropping below a stable minimum. However, if you have custom sizing and limited resources, then you might experience a net increase in resource usage. To adjust the number of resources your Data Virtualization instance uses, see: Customizing the pod size and resource usage of Data Virtualization agents.
Who needs to complete this task?

Instance administrator
To upgrade Data Virtualization, you must be an instance administrator. An instance administrator has permission to manage software in the following projects:

The operators project for the instance

The operators for this instance of Data Virtualization are installed in the operators project. In the upgrade commands, the ${PROJECT_CPD_INST_OPERATORS} environment variable refers to the operators project.

The operands project for the instance

The custom resources for the control plane and Data Virtualization are installed in the operands project. In the upgrade commands, the ${PROJECT_CPD_INST_OPERANDS} environment variable refers to the operands project.

The tethered projects for the instance
If any projects are tethered to the operands project, you have permission to manage the software in the tethered projects.
When do you need to complete this task?

Review the following options to determine whether you need to complete this task:

  • If you want to upgrade the IBM Software Hub control plane and one or more services at the same time, follow the process in Upgrading an instance of IBM Software Hub instead.
  • If you didn't upgrade Data Virtualization when you upgraded the IBM Software Hub control plane, complete this task to upgrade Data Virtualization.

    Repeat as needed
    If you are responsible for multiple instances of IBM Software Hub, you can repeat this task to upgrade more instances of Data Virtualization on the cluster.

Information you need to complete this task

Review the following information before you upgrade Data Virtualization:

Version requirements

All the components that are associated with an instance of IBM Software Hub must be installed at the same release. For example, if the IBM Software Hub control plane is at Version 5.4.0, you must upgrade Data Virtualization to Version 5.4.0.

Environment variables
The commands in this task use environment variables so that you can run the commands exactly as written.
  • If you don't have the script that defines the environment variables, see Setting up installation environment variables.
  • To use the environment variables from the script, you must source the environment variables before you run the commands in this task. For example, run:
    source ./cpd_vars.sh
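
Because the commands in this task expand several environment variables, it can help to confirm that each one is set after you source the script. The following is a minimal sketch, not part of the cpd-cli tooling: require_vars is a hypothetical helper name, and the variable names in the usage comment are the ones that appear in the upgrade commands in this task.

```shell
# Sketch: fail early if a variable that the upgrade commands expand is unset or empty.
# require_vars is a hypothetical helper; it is not provided by cpd-cli.
require_vars() {
  missing=""
  for name in "$@"; do
    # Indirect variable lookup that also works in POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Missing environment variables:$missing" >&2
    return 1
  fi
  return 0
}

# Usage, after sourcing the script:
#   source ./cpd_vars.sh
#   require_vars VERSION PATCH_ID PROJECT_CPD_INST_OPERATORS \
#     PROJECT_CPD_INST_OPERANDS IMAGE_PULL_PREFIX IMAGE_PULL_SECRET || exit 1
```

The helper reports every missing name in one message, so you can correct the script in a single pass before you start the upgrade.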
Common core services
Data Virtualization requires the IBM Software Hub common core services.

If the common core services are not at the correct version in the operands project for the instance, the common core services are automatically upgraded when you upgrade Data Virtualization. The common core services upgrade increases the amount of time the upgrade takes to complete.

Before you begin

This task assumes that the following prerequisites are met:

System requirements
This task assumes that the cluster meets the minimum requirements for Data Virtualization.
Where to find more information
If this task is not complete, see System requirements.
Workstation
This task assumes that the workstation from which you will run the upgrade is set up as a client workstation and has the following command-line interfaces:
  • IBM Software Hub CLI: cpd-cli
  • OpenShift® CLI: oc
  • Helm CLI: helm
Where to find more information
If this task is not complete, see Updating client workstations.
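
As a quick check of the workstation prerequisite, you can test whether the three command-line interfaces are on the PATH. This is a sketch under the assumption that the CLIs are installed as cpd-cli, oc, and helm; missing_clis is a hypothetical helper name, and the check does not verify CLI versions.

```shell
# Sketch: print the name of each command that cannot be found on the PATH.
# missing_clis is a hypothetical helper; it only checks presence, not versions.
missing_clis() {
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
    fi
  done
}

# Usage: prints nothing when all three CLIs are available.
#   missing_clis cpd-cli oc helm
```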
Control plane
This task assumes that the IBM Software Hub control plane is upgraded.
Where to find more information
If this task is not complete, see Upgrading an instance of IBM Software Hub.
Private container registry
If your environment uses a private container registry (for example, your cluster is air-gapped), this task assumes that the following tasks are complete:
  1. The Data Virtualization software images are mirrored to the private container registry.
    Where to find more information
    If this task is not complete, see Mirroring images to a private container registry.
  2. The cpd-cli is configured to pull the olm-utils-v4 image from the private container registry.
    Where to find more information
    If this task is not complete, see Pulling the olm-utils-v4 image from the private container registry.
Cluster-scoped resources
This task assumes that the cluster-scoped resources, such as custom resource definitions, cluster roles, and cluster role bindings, were updated.
Where to find more information
If this task is not complete, see Updating the cluster-scoped resources for the platform and services.
Image pull secrets
This task assumes that the secrets that contain the image pull credentials for the instance exist.
Where to find more information
If this task is not complete, see Creating image pull secrets for an instance of IBM Software Hub.

Prerequisite services

Before you upgrade Data Virtualization, ensure that the following services are upgraded and running:

  • Data Virtualization has a dependency on the Db2 Data Management Console:
    • If you installed IBM Software Hub by using the cpd-cli, and you did not manually install the Db2 Data Management Console, then Data Virtualization installs the service for you.
    • If you installed IBM Software Hub by using Argo CD, you must ensure that the Db2 Data Management Console is also installed by applying the Db2 Data Management Console YAML file. Data Virtualization will not install the Db2 Data Management Console for you.
    • If you have already installed the Db2 Data Management Console, ensure that a Db2 Data Management Console instance has been provisioned. For more information, see Installing Db2 Data Management Console.

Procedure

Complete the following tasks to upgrade Data Virtualization:

  1. Upgrading the service
  2. Validating the upgrade
  3. Upgrading existing service instances
  4. Upgrading remote connectors
  5. What to do next

Upgrading the service

To upgrade Data Virtualization:

  1. Log the cpd-cli in to the Red Hat® OpenShift Container Platform cluster:
    ${CPDM_OC_LOGIN}
    Remember: CPDM_OC_LOGIN is an alias for the cpd-cli manage login-to-ocp command.
  2. Update the operator and custom resource for Data Virtualization.
    cpd-cli manage install-components \
    --license_acceptance=true \
    --components=dv \
    --release=${VERSION} \
    --patch_id=${PATCH_ID} \
    --operator_ns=${PROJECT_CPD_INST_OPERATORS} \
    --instance_ns=${PROJECT_CPD_INST_OPERANDS} \
    --image_pull_prefix=${IMAGE_PULL_PREFIX} \
    --image_pull_secret=${IMAGE_PULL_SECRET} \
    --upgrade=true

Validating the upgrade

Data Virtualization is upgraded when the install-components command returns:
[SUCCESS]... The install-components command ran successfully

If you want to confirm that the custom resource status is Completed, you can run the cpd-cli manage get-cr-status command:

cpd-cli manage get-cr-status \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--components=dv
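
If you capture the output of the install-components command in a file, you can check for the success message in a script. The following sketch assumes that the message is written to the captured output in the form that is quoted in this section; install_succeeded and the install-dv.log file name are hypothetical.

```shell
# Sketch: read command output on stdin and succeed only if it contains
# the [SUCCESS] message that is quoted in this section.
# install_succeeded is a hypothetical helper; it is not provided by cpd-cli.
install_succeeded() {
  grep -q '\[SUCCESS\].*The install-components command ran successfully'
}

# Usage, assuming the output was saved to a hypothetical file named install-dv.log:
#   install_succeeded < install-dv.log && echo "Data Virtualization is upgraded"
```

The check covers only the command output. To confirm the custom resource status, use the get-cr-status command.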

Upgrading existing service instances

After you upgrade Data Virtualization, you must upgrade any service instances that are associated with Data Virtualization.

Before you begin

Create a profile on the workstation from which you will upgrade the service instances.

The profile must be associated with an IBM Software Hub user who has either of the following permissions:

  • Create service instances (can_provision)
  • Manage service instances (manage_service_instances)

For more information, see Creating a profile to use the cpd-cli management commands.

Note: During the Data Virtualization instance upgrade, the caching pod initially enters a CrashLoop state. This is expected behavior because Big SQL stops at the beginning of the upgrade process. The caching pod remains in this state until the Data Virtualization pods restart and load the new docker images. If you suspect that the Data Virtualization upgrade has stalled, then check the Data Virtualization head pod logs.
Do not do the following without consulting IBM Support:
  • Shut down the Data Virtualization pods.
  • Manually start or stop Big SQL or Db2.

After the Data Virtualization pods restart with the updated docker images, the caching pod switches to a 0/1 Init state. It stays in this state until the Data Virtualization head pod completes the upgrade successfully.

Procedure

To upgrade the service instances:

  1. Get the list of Data Virtualization service instances:
    cpd-cli service-instance list \
    --service-type=dv \
    --profile=${CPD_PROFILE_NAME}
  2. Set the INSTANCE_NAME environment variable to the name of the service instance that you want to upgrade:
    export INSTANCE_NAME=<instance-name>
  3. Set the INSTANCE_VERSION environment variable to the version that corresponds to the version of IBM Software Hub on your cluster:
    export INSTANCE_VERSION=<version>
    Use the following table to determine the appropriate value:
    IBM Software Hub version    Service instance version
    5.4.0                       3.4.0
  4. Upgrade the service instance:
    cpd-cli service-instance upgrade \
    --service-type=dv \
    --instance-name=${INSTANCE_NAME} \
    --profile=${CPD_PROFILE_NAME} \
    --version=${INSTANCE_VERSION}
  5. Run these commands to prevent upgrade errors caused by aborted FMP processes:
    Important: Ensure you repeat this step for each Data Virtualization instance.
    1. Log in to the Data Virtualization head pod as the db2inst1 user:
      oc -n <DV_INSTANCE_NAMESPACE> rsh c-db2u-dv-db2u-0 -- su - db2inst1
    2. Reinitialize the SYSHADOOP applications:
      db2 "DROP VIEW SYSHADOOP.APPLICATIONS"
      db2 "CALL SYSINSTALLOBJECTS('BIGSQL', 'C', null, null)"
      bigsql stop
      bigsql start

      The CALL SYSINSTALLOBJECTS statement might return an SQL5115N error, which you can ignore because it is related to a table that does not need to be updated.

      Example output:
      [db2inst1@c-db2u-dv-db2u-0 - Db2U ~]$ db2 "DROP VIEW SYSHADOOP.APPLICATIONS"
      DB20000I  The SQL command completed successfully.
      
      [db2inst1@c-db2u-dv-db2u-0 - Db2U ~]$ db2 "CALL SYSINSTALLOBJECTS('BIGSQL', 'C', null, null)"
      SQL5115N  The command or statement was not executed because the following 
      functionality is not supported in the current environment: "Table cannot be 
      created in this tablespace".  SQLSTATE=56038
  6. Repeat the preceding steps to upgrade each service instance associated with this instance of IBM Software Hub.
    Important: Each of your custom-sized Data Virtualization instances will now have five Data Virtualization agent pods, each requiring two CPUs. To adjust agent pod resource usage, see Customizing the pod size and resource usage of Data Virtualization agents.
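
When you have several service instances, steps 3 and 4 can be scripted. The following is a minimal sketch, not an official tool: instance_version_for and upgrade_command are hypothetical helper names, the mapping contains only the row from the table in step 3, and the instance names in the usage comment are placeholders. The sketch prints the upgrade command so that you can review it before you run it, and it does not cover step 5, which you must still complete for each instance.

```shell
# Sketch: encode the version table from step 3.
# instance_version_for is a hypothetical helper; only the documented row is included.
instance_version_for() {
  case "$1" in
    5.4.0) echo "3.4.0" ;;
    *)
      echo "No service instance version known for IBM Software Hub $1" >&2
      return 1
      ;;
  esac
}

# Sketch: print (do not run) the step 4 command for one service instance.
# Arguments: instance name, service instance version, cpd-cli profile name.
upgrade_command() {
  printf 'cpd-cli service-instance upgrade --service-type=dv --instance-name=%s --profile=%s --version=%s\n' \
    "$1" "$3" "$2"
}

# Usage, assuming ${VERSION} holds the IBM Software Hub version and
# dv-instance-a and dv-instance-b are placeholder instance names:
#   INSTANCE_VERSION=$(instance_version_for "${VERSION}") || exit 1
#   for name in dv-instance-a dv-instance-b; do
#     upgrade_command "$name" "$INSTANCE_VERSION" "${CPD_PROFILE_NAME}"
#   done
```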

Upgrading remote connectors

After you upgrade to IBM Cloud Pak® for Data version 5.4.0, none of the remote connectors that are associated with the Data Virtualization instance appear in listnodes when you run select node_name from dvsys.listnodes. For more information, see listnodes does not display remote connectors after upgrade from version 5.2.0 to 5.3.0.

To work around this issue, upgrade your remote connectors manually by completing these steps:

  1. On the remote connector machine, run the following command from the sysroot subdirectory of your remote connector directory:
    killGaianServers.sh
  2. Create a new remote connector install script from the IBM Cloud Pak for Data console. Use a temporary directory on the remote connector machine as your target install directory for this step.
  3. Run the newly created remote connector install script. This script downloads the latest remote connector release and installs it in your chosen temporary directory.
  4. Promptly stop the new remote connector:
    Linux
    On Linux, run the killGaianServers.sh script.
    Windows
    On Windows, run the uninstall_service.bat file.
  5. Copy the contents of sysroot/lib in your new temporary directory to sysroot/lib in your original remote connector directory to update the old remote connector jars.
  6. Start your original remote connector:
    Linux
    On Linux, run this command:
    nohup datavirtualization_start.sh
    Windows
    On Windows, start the DataVirtualizationService service. For example: DataVirtualizationService6414.
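
On Linux, the copy in step 5 can be sketched as a small shell function. This is an illustration under stated assumptions: update_connector_libs is a hypothetical helper name, and the two directory arguments stand for the temporary install directory and the original remote connector directory on your machine.

```shell
# Sketch of step 5: copy the new connector jars over the original ones.
# update_connector_libs is a hypothetical helper.
#   $1: temporary directory that holds the new remote connector installation
#   $2: original remote connector directory
update_connector_libs() {
  new_dir="$1"
  old_dir="$2"
  # Both directories are expected to contain sysroot/lib.
  if [ ! -d "$new_dir/sysroot/lib" ] || [ ! -d "$old_dir/sysroot/lib" ]; then
    echo "Expected a sysroot/lib directory under both $new_dir and $old_dir" >&2
    return 1
  fi
  # Copy the contents of the new lib directory into the original lib directory.
  cp -R "$new_dir/sysroot/lib/." "$old_dir/sysroot/lib/"
}

# Usage, with placeholder paths:
#   update_connector_libs /tmp/new-connector /opt/remote-connector
```

Run the function only after both the old and the new remote connector are stopped, as described in steps 1 and 4.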

What to do next

  1. After you upgrade to Data Virtualization on IBM Software Hub version 5.4.0, you must manually edit data sources that use SSL connections. Otherwise, the data sources are invalid and your queries on the data will fail. Complete the following steps to edit the SSL-enabled data sources so that they are valid for use:
    1. On the Data Virtualization Data sources page, find the SSL-enabled data sources that show a status of Invalid.
    2. Complete the following steps to edit each invalid data source:
      1. Select Edit connection at the end of the data source row.
      2. Make a minor change to the name or description, but don’t change any other values.
      3. Save the change to trigger an update to the data source.
      The data sources that you edited are now valid and you can proceed to use them in your queries.
  2. After you upgrade, all active or inactive caches with refresh schedules are reset. You must edit the active caches and set the refresh rate again. For more information, see Adding data caches in Data Virtualization.

  3. Confirm that the audit functionality updated correctly, or apply the workaround if it is not updated. See Audit functionality might not be updated after you upgrade to Data Virtualization version 5.4.0.