Upgrading Data Virtualization from Version 4.8 to Version 5.0
An instance administrator can upgrade Data Virtualization from Cloud Pak for Data Version 4.8 to Version 5.0.
Who needs to complete this task?

To upgrade Data Virtualization, you must be an instance administrator. An instance administrator has permission to manage software in the following projects:

- The operators project for the instance. The operators for this instance of Data Virtualization are installed in the operators project. In the upgrade commands, the ${PROJECT_CPD_INST_OPERATORS} environment variable refers to the operators project.
- The operands project for the instance. The custom resources for the control plane and Data Virtualization are installed in the operands project. In the upgrade commands, the ${PROJECT_CPD_INST_OPERANDS} environment variable refers to the operands project.
- The tethered projects for the instance. If any projects are tethered to the operands project, you have permission to manage the software in the tethered projects.
When do you need to complete this task?

Review the following options to determine whether you need to complete this task:

- If you want to upgrade the Cloud Pak for Data control plane and one or more services at the same time, follow the process in Upgrading an instance of Cloud Pak for Data instead.
- If you didn't upgrade Data Virtualization when you upgraded the Cloud Pak for Data control plane, complete this task to upgrade Data Virtualization.

Repeat as needed: If you are responsible for multiple instances of Cloud Pak for Data, you can repeat this task to upgrade more instances of Data Virtualization on the cluster.
Information you need to complete this task
Review the following information before you upgrade Data Virtualization:
- Version requirements: All the components that are associated with an instance of Cloud Pak for Data must be installed at the same release. For example, if the Cloud Pak for Data control plane is at Version 5.0.3, you must upgrade Data Virtualization to Version 5.0.3.
- Environment variables: The commands in this task use environment variables so that you can run the commands exactly as written.
  - If you do not have the script that defines the environment variables, see Setting up installation environment variables.
  - To use the environment variables from the script, you must source the environment variables before you run the commands in this task. For example, run:

    source ./cpd_vars.sh

- Common core services: Data Virtualization requires the Cloud Pak for Data common core services. If the common core services are not at the correct version in the operands project for the instance, the common core services are automatically upgraded when you upgrade Data Virtualization. The common core services upgrade increases the amount of time that the upgrade takes to complete.
- Storage requirements: You don't need to specify storage when you upgrade Data Virtualization.
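To illustrate the environment-variable setup, the following sketch shows a minimal excerpt of a cpd_vars.sh script together with a sanity check. The project names and version are placeholder values, not the values for your cluster; your actual cpd_vars.sh defines many more variables.

```shell
# Hypothetical excerpt of cpd_vars.sh; the project names and version below are
# placeholder values, not the values for your cluster.
export PROJECT_CPD_INST_OPERATORS=cpd-operators
export PROJECT_CPD_INST_OPERANDS=cpd-instance
export VERSION=5.0.3

# Fail fast if any required variable is unset or empty after sourcing.
for var in PROJECT_CPD_INST_OPERATORS PROJECT_CPD_INST_OPERANDS VERSION; do
  if [ -z "${!var}" ]; then
    echo "ERROR: $var is not set; source cpd_vars.sh before continuing" >&2
    exit 1
  fi
  echo "$var=${!var}"
done
```

Running a check like this immediately after `source ./cpd_vars.sh` catches a missing or partially edited script before any upgrade command runs.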
Before you begin
- If you have Databricks connections, you must complete the Databricks pre-upgrade steps before you upgrade Data Virtualization. See Databricks pre-upgrade steps.
- If you are upgrading Data Virtualization from a version older than Cloud Pak for Data 4.8.2 to a version newer than 4.8.2, you must temporarily update the Duplicate asset handling setting of your catalogs to Allow duplicates. After the upgrade, you can revert the Duplicate asset handling setting. For the steps to update this setting, see Changing catalog settings.
- Verify the Data Virtualization version numbers that you are upgrading from and to. See Supported upgrade paths in Data Virtualization.
This task assumes that the following prerequisites are met:
| Prerequisite | Where to find more information |
|---|---|
| The cluster meets the minimum requirements for Data Virtualization. | If this task is not complete, see System requirements. |
| The workstation from which you will run the upgrade is set up as a client workstation and has the required command-line interfaces (the cpd-cli and the OpenShift CLI oc are used in this task). | If this task is not complete, see Updating client workstations. |
| The Cloud Pak for Data control plane is upgraded. | If this task is not complete, see Upgrading an instance of Cloud Pak for Data. |
| For environments that use a private container registry, such as air-gapped environments, the Data Virtualization software images are mirrored to the private container registry. | If this task is not complete, see Mirroring images to a private container registry. |
| For environments that use a private container registry, such as air-gapped environments, the cpd-cli is configured to pull the olm-utils-v3 image from the private container registry. | If this task is not complete, see Pulling the olm-utils-v3 image from the private container registry. |
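As a quick convenience check of the client-workstation prerequisite, the following sketch verifies that the command-line interfaces used in this task are on the PATH. This is not an official verification step; it assumes cpd-cli and oc are the interfaces your setup requires.

```shell
# Check (as a convenience, not an official verification step) that the CLIs
# used in this task are available on the client workstation.
MISSING=""
for cli in cpd-cli oc; do
  command -v "$cli" >/dev/null 2>&1 || MISSING="$MISSING $cli"
done
if [ -n "$MISSING" ]; then
  echo "Missing CLIs:${MISSING} - see Updating client workstations"
else
  echo "All required CLIs found"
fi
```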
Prerequisite services
Before you upgrade Data Virtualization, ensure that the following services are upgraded and running:
- Db2 Data Management Console: If you do not manually upgrade Db2 Data Management Console, Data Virtualization upgrades it for you. If you have already upgraded Db2 Data Management Console, make sure that a Db2 Data Management Console instance has been provisioned. For more information, see Upgrading Db2 Data Management Console.
Procedure
Complete the following tasks to upgrade Data Virtualization:
Upgrading the service

The cpd-cli manage apply-olm command updates all of the OLM objects in the operators project at the same time.

To upgrade Data Virtualization:

1. Log the cpd-cli in to the Red Hat OpenShift Container Platform cluster:

   ${CPDM_OC_LOGIN}

   Remember: CPDM_OC_LOGIN is an alias for the cpd-cli manage login-to-ocp command.

2. Update the custom resource for Data Virtualization:

   cpd-cli manage apply-cr \
   --components=dv \
   --release=${VERSION} \
   --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
   --license_acceptance=true \
   --upgrade=true
Validating the upgrade

When the upgrade completes successfully, the apply-cr command returns:

[SUCCESS]... The apply-cr command ran successfully

If you want to confirm that the custom resource status is Completed, you can run the cpd-cli manage get-cr-status command:

cpd-cli manage get-cr-status \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--components=dv
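If you prefer to poll rather than check once, a small wait loop can wrap the status command. The following is a sketch that assumes the string Completed appears in the get-cr-status output when the upgrade finishes; adjust the match to your cpd-cli version's actual output.

```shell
# Poll a status command until its output contains "Completed" or a timeout
# elapses. In this task, the command to poll would be:
#   cpd-cli manage get-cr-status --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} --components=dv
wait_for_completed() {
  local cmd=$1 max_tries=${2:-60} delay=${3:-30}
  local i=1
  while [ "$i" -le "$max_tries" ]; do
    if $cmd 2>/dev/null | grep -q "Completed"; then
      echo "Status is Completed"
      return 0
    fi
    echo "Attempt $i/$max_tries: status is not Completed yet; retrying in ${delay}s" >&2
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Timed out waiting for Completed status" >&2
  return 1
}
```

For example, `wait_for_completed "cpd-cli manage get-cr-status --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} --components=dv"` retries every 30 seconds for up to 30 minutes with the default arguments.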
Upgrading existing service instances

Note: During the upgrade, the caching pod might be in a CrashLoop state. This is expected behavior because Big SQL stops at the beginning of the upgrade process. The caching pod remains in this state until the Data Virtualization pods restart and load the new Docker images. If you suspect that the Data Virtualization upgrade has stalled, check the Data Virtualization head pod logs.

- Shut down the Data Virtualization pods
- Manually start or stop Big SQL or Db2
After you upgrade Data Virtualization, you must upgrade any service instances that are associated with Data Virtualization.
Before you begin

Create a profile on the workstation from which you will upgrade the service instances. The profile must be associated with a Cloud Pak for Data user who has either of the following permissions:

- Create service instances (can_provision)
- Manage service instances (manage_service_instances)

For more information, see Creating a profile to use the cpd-cli management commands.
1. Log the cpd-cli in to the Red Hat OpenShift Container Platform cluster:

   ${CPDM_OC_LOGIN}

   Remember: CPDM_OC_LOGIN is an alias for the cpd-cli manage login-to-ocp command.

2. Change to the project where the Data Virtualization pods are installed:

   oc project ${PROJECT_CPD_INST_OPERANDS}

3. Get the list of Data Virtualization service instances:

   cpd-cli service-instance list \
   --service-type=dv \
   --profile=${CPD_PROFILE_NAME}

4. Upgrade one instance at a time. For each instance that you want to upgrade, complete the following steps:

   a. Set the WQ_INSTANCE_NAME environment variable to the name of the service instance that you want to upgrade:

      export WQ_INSTANCE_NAME=<instance-name>

   b. Run the following command to upgrade the instance:

      cpd-cli service-instance upgrade \
      --instance-name=${WQ_INSTANCE_NAME} \
      --service-type=dv \
      --profile=${CPD_PROFILE_NAME}

   c. Run one of the following commands to verify that the version now reads 3.0.3 for the upgraded instances:

      cpd-cli service-instance list --service-type=dv --profile=${CPD_PROFILE_NAME}

      oc get bigsql db2u-dv -o jsonpath='{.status.version}{"\n"}'

5. Wait until the instance upgrades are complete before you proceed to the next step.
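The per-instance steps above can be collected into a small helper that upgrades several instances sequentially, stopping at the first failure. This is a sketch; the instance names in the example call are placeholders for the names returned by cpd-cli service-instance list.

```shell
# Upgrade the named Data Virtualization service instances one at a time,
# stopping at the first failure (instances must not be upgraded in parallel).
upgrade_dv_instances() {
  for instance in "$@"; do
    export WQ_INSTANCE_NAME="${instance}"
    echo "Upgrading ${WQ_INSTANCE_NAME}"
    cpd-cli service-instance upgrade \
      --instance-name="${WQ_INSTANCE_NAME}" \
      --service-type=dv \
      --profile="${CPD_PROFILE_NAME}" || {
        echo "Upgrade failed for ${WQ_INSTANCE_NAME}" >&2
        return 1
      }
  done
}

# Example call (placeholder instance names):
# upgrade_dv_instances dv-instance-1 dv-instance-2
```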
Upgrading remote connectors

If you installed remote connectors, you can upgrade them by using the UPDATEREMOTECONNECTOR stored procedure. You run this procedure by using the SQL editor or the Db2 console on the cluster.

- To update all remote connectors, run the following stored procedure:

  call dvsys.updateremoteconnector('',?,?)

- To upgrade a subset of remote connectors, pass in a comma-separated list of node names:

  call dvsys.updateremoteconnector('<REMOTE_CONNECTOR_NODES>',?,?)

  You can obtain the <REMOTE_CONNECTOR_NODES> by running the following query:

  select node_name from dvsys.listnodes where AGENT_CLASS='R'

On each remote connector node, update the Java runtime and restart the agent:

1. In the datavirtualization.env file, change the export JAVA_HOME file path to the Java 21 JRE file path:

   export JAVA_HOME=<Java 21 JRE file path>

2. Start the agent by running this Linux® command:

   nohup ./datavirtualization_start.sh &
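The JAVA_HOME edit in the datavirtualization.env file can be scripted. The following sketch demonstrates the rewrite against a sample file; the file location and the Java 21 JRE path are placeholders, and on a real remote connector node the env file already exists in the agent's installation directory.

```shell
# Create a sample env file for illustration; on a remote connector node,
# datavirtualization.env already exists in the agent's installation directory.
cat > /tmp/datavirtualization.env <<'EOF'
export JAVA_HOME=/opt/java/jre-1.8
EOF

# Placeholder path; substitute the location of your Java 21 JRE.
NEW_JAVA_HOME=/opt/java/jre-21

# Rewrite the export JAVA_HOME line in place, then confirm the change.
sed -i "s|^export JAVA_HOME=.*|export JAVA_HOME=${NEW_JAVA_HOME}|" /tmp/datavirtualization.env
grep '^export JAVA_HOME=' /tmp/datavirtualization.env

# Restart the agent afterward:
# nohup ./datavirtualization_start.sh &
```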
What to do next

- After you upgrade to Data Virtualization on Cloud Pak for Data version 5.0.3, you must manually edit data sources that use SSL connections. Otherwise, the data sources are invalid and your queries on the data will fail. Complete the following steps to edit the SSL-enabled data sources so that they are valid for use:
  1. On the Data Virtualization Data sources page, find the SSL-enabled data sources that show a status of Invalid.
  2. Edit each invalid data source:
     a. Select Edit connection at the end of the data source row.
     b. Make a minor change to the name or description, but don't change any other values.
     c. Save the change to trigger an update to the data source.
- After you upgrade, all active or inactive caches with refresh schedules are reset. You must edit the active caches and set the refresh rate again. For more information, see Adding data caches in Data Virtualization.

Data Virtualization is ready to use. For more information, see Virtualizing data.