Upgrading Data Virtualization from Version 3.5 to 4.0
A project administrator can upgrade Data Virtualization after upgrading IBM® Cloud Pak for Data from Version 3.5 to Version 4.0.x.
- Permissions you need for this task
- You must be an administrator of the OpenShift® project (Kubernetes namespace) where Data Virtualization is installed.
- Information you need to confirm before you start this task
- Before you upgrade Data Virtualization, confirm the following information:
- The name of the project where Data Virtualization is installed.
In Version 3.5, Data Virtualization is installed in the same project as Cloud Pak for Data.
- The storage class or classes that you are using for your existing Data Virtualization installation. The storage must be the same as or equivalent to the storage classes listed in Information you need to complete this task.
Remember: In most cases, the name of the storage class is the same as the one used by the current volume (run oc -n project-name get pvc to identify it). If you are using Portworx storage, the storage class that is supported by Data Virtualization is changing in Cloud Pak for Data 4.0.x to portworx-db2-rwx-sc. If you are using Red Hat® OpenShift Container Storage, starting with Data Virtualization 1.7.6, the recommended storage class for OpenShift Container Storage is ocs-storagecluster-ceph-rbd.
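For example, you can list the PVCs of the existing installation to see the storage class and capacity in use (a sketch; replace project-name with your own project):
# The STORAGECLASS and CAPACITY columns show the values that the upgraded instance must match.
oc -n project-name get pvc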
- Information you need to complete this task
- Data Virtualization requires a custom security context constraint (SCC). For details, see Creating required SCCs.
- Data Virtualization requires the Cloud Pak for Data common core services. If the common core services are not installed in the project or are not at the correct version, they are installed automatically when you upgrade Data Virtualization, which increases the time that the upgrade takes to complete.
- Data Virtualization uses the following storage classes. If you don't use these storage classes on your cluster, ensure that you have a storage class with an equivalent definition:
- OpenShift Container Storage: ocs-storagecluster-ceph-rbd
- IBM Spectrum® Scale: ibm-spectrum-scale-sc
- NFS: managed-nfs-storage
- Portworx: portworx-db2-rwx-sc
- IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
- Pre-upgrade tasks
- You must export users from Data Virtualization on Cloud Pak for Data 3.5.3 before you upgrade to Cloud Pak for Data 4.0.2. If you do not want to export and import existing users, you must create an empty /mnt/PV/versioned/dv_data/dv_instance_users.txt file.
- You must also copy any JAR files from JDBC drivers that you downloaded when you configured data source connections in Cloud Pak for Data 3.5.3. Copy the custom JAR files to /mnt/PV/versioned/private. For more information, see Exporting users and custom JARs before you upgrade Data Virtualization. A sketch of both steps follows this list.
- If you installed Db2® Data Management Console Version 3.5, you must upgrade it to the same version as Data Virtualization. For more information, see Upgrading Db2 Data Management Console from the Version 3.5 release.
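If you skip the user export, the following sketch shows one way to create the empty marker file and stage driver JARs from inside the head pod. The pod name and the JAR source path are assumptions, and /mnt/PV is the placeholder used above for the instance's persistent volume mount point.
# Open a shell in the Data Virtualization head pod (the pod name can differ in your cluster).
oc rsh c-db2u-dv-db2u-0 bash
# Inside the pod: create the empty users file (replace /mnt/PV with the actual mount point).
touch /mnt/PV/versioned/dv_data/dv_instance_users.txt
# Copy your previously downloaded JDBC driver JAR files into the private directory.
cp /tmp/jdbc-drivers/*.jar /mnt/PV/versioned/private/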
Before you begin
Ensure that the cluster meets the minimum requirements for Data Virtualization. For details, see System requirements.
Additionally, ensure that a cluster administrator completed the required Upgrade preparation tasks for your environment. Specifically, verify that a cluster administrator completed the following tasks:
- Cloud Pak for Data was upgraded. For details, see Upgrading Cloud Pak for Data.
- For environments that use a private container registry, such as air-gapped environments, the Data Virtualization software images are mirrored to the private container registry. For details, see Mirroring images to your container registry.
- The cluster is configured to pull the Data Virtualization software images. For details, see Configuring your cluster to pull images.
- The Data Virtualization catalog source exists. For details, see Creating catalog sources.
- The Data Virtualization operator subscription exists. For details, see Creating operator subscriptions.
- The security context constraints (SCCs) required to run Data Virtualization exist. For details, see Creating required SCCs.
- The node settings are adjusted for Data Virtualization. For details, see Changing required node settings.
If these tasks are not complete, the Data Virtualization upgrade will fail.
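You can spot-check several of these prerequisites from the command line before you start (a sketch; the namespaces shown are typical for an express installation and might differ in your environment):
# Confirm that the operator catalog sources exist.
oc get catalogsource -n openshift-marketplace
# Confirm that the Data Virtualization operator subscription exists.
oc get subscription -n ibm-common-services
# Confirm that the Data Virtualization SCCs exist (names vary; filter for dv).
oc get scc | grep -i dv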
About this task
Upgrading Data Virtualization involves a fresh installation of Data Virtualization and then a custom migration of Data Virtualization data sources.
Prerequisite services
Before you upgrade Data Virtualization, ensure that the following services are upgraded and running:
- Db2U: If you have not already upgraded the ibm-db2uoperator-catalog, create the Db2U catalog source. For more information, see Configuring your cluster to pull software images before upgrading from Version 3.5. Then, create the Db2U operator subscription. For more information, see Creating operator subscriptions before upgrading from Version 3.5.
- Db2 Data Management Console: If you do not manually install Db2 Data Management Console, Data Virtualization installs it for you. If you already installed Db2 Data Management Console, make sure that a Db2 Data Management Console instance has been upgraded and provisioned. For more information, see Upgrading Db2 Data Management Console.
- Cloud Pak for Data common core services: Data Virtualization installs the common core services on your Cloud Pak for Data cluster if they are not installed, and upgrades them if they are.
Procedure
Complete the following tasks to upgrade Data Virtualization:
Preparing for the upgrade
Prepare for the upgrade by backing up the Data Virtualization service and then deleting it.
- Log in to your OpenShift cluster as a project administrator:
oc login OpenShift_URL:port
- Change to the project where the Cloud Pak for Data control plane is installed:
oc project project-name
- Back up the addon and the service provider:
oc get deployment dv-addon -o yaml > dv-addon-bak.yaml
oc get deployment dv-service-provider -o yaml > dv-service-provider-bak.yaml
oc get service dv-addon -o yaml > dv-addon-svc.yaml
oc get service dv-service-provider -o yaml > dv-service-provider-svc.yaml
- Delete the addon and service provider:
oc delete deployment dv-addon
oc delete deployment dv-service-provider
oc delete service dv-addon
oc delete service dv-service-provider
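Before you continue, you can confirm that the old resources are gone (a sketch; each query should report that the resource is not found):
# Both commands should return NotFound errors for the deleted objects.
oc get deployment dv-addon dv-service-provider
oc get service dv-addon dv-service-provider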
Upgrading the Data Virtualization service
- Follow the procedure for installing Data Virtualization in the upgraded IBM Cloud Pak for Data control plane.
Important: Do not provision the Data Virtualization instance from the IBM Cloud Pak for Data user interface. Stop after the Data Virtualization service pods, dv-addon and dv-service-provider, are up and running.
- Get the status of Data Virtualization (dv-service):
oc get DvService dv-service -o jsonpath='{.status.conditions[?(@.type == "Successful")].status} {"\n"}'
Data Virtualization has been upgraded successfully when the command returns True.
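If the status is not yet True, you can poll it until the operator finishes (a sketch using a simple shell loop):
# Poll the DvService status every 30 seconds until it reports Successful=True.
while true; do
  status=$(oc get DvService dv-service -o jsonpath='{.status.conditions[?(@.type == "Successful")].status}')
  echo "DvService Successful=$status"
  [ "$status" = "True" ] && break
  sleep 30
done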
Upgrading the Data Virtualization instance
To upgrade a Data Virtualization instance, do the following steps:
- Download and extract the Data Virtualization migration .tar file.
- In your browser, log in to the Cloud Pak for Data web client. Logging in first ensures that the download step does not prompt you for credentials again.
- Download the .tar file by copying the following URL into a browser (a scripted alternative follows this list):
https://Cloud Pak for Data web client URL/icp4data-addon/dv/add-ons/upgrade.tgz
- Copy the .tar file to your OpenShift infrastructure node.
- Extract the .tar file:
tar -xzf upgrade.tgz
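If you prefer to script the download, one approach is to request a platform access token and pass it as a bearer token. This is a sketch under assumptions: the /icp4d-api/v1/authorize endpoint is used for authentication, and the host and credentials shown are placeholders.
# Request a platform access token (replace the host and credentials with your own).
TOKEN=$(curl -ks https://<cpd-host>/icp4d-api/v1/authorize \
  -H 'Content-Type: application/json' \
  -d '{"username":"<user>","password":"<password>"}' | jq -r .token)
# Download the migration archive with the bearer token.
curl -ks -H "Authorization: Bearer $TOKEN" \
  -o upgrade.tgz "https://<cpd-host>/icp4data-addon/dv/add-ons/upgrade.tgz"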
- Update the values.template file in the templates subdirectory of the extracted .tar file and change the following settings (a sketch for checking both values follows this list):
- headStorageSize
- This value must be a PersistentVolume capacity specification that matches the current setting. You can check the current setting by running the following command:
oc -n project-name get pvc dv-engine-data-dv-engine-0
Where project-name is the OpenShift project where the Data Virtualization pods were created. Make sure to include the capacity unit (for example, Gi) after the numeric value.
- storageClassName
- This value must be the name of the storage class that the persistent volumes of the upgraded instance use. In most cases, the name of the storage class is the same as the one used by the current volume (run oc -n project-name get pvc to identify it). If you are using Portworx storage, the storage class that is supported by Data Virtualization is changing in Cloud Pak for Data 4.0.x to portworx-db2-rwx-sc. If you are using Red Hat OpenShift Container Storage, starting with Data Virtualization 1.7.6, the recommended storage class for OpenShift Container Storage is ocs-storagecluster-ceph-rbd.
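For example, this sketch reads both values directly from the engine PVC so that you can copy them into values.template:
# Print the capacity and storage class of the existing engine PVC.
oc -n project-name get pvc dv-engine-data-dv-engine-0 \
  -o jsonpath='headStorageSize: {.status.capacity.storage}{"\n"}storageClassName: {.spec.storageClassName}{"\n"}'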
- Ensure that you have completed the pre-upgrade steps:
- You exported users from Data Virtualization on Cloud Pak for Data 3.5.3, or created an empty /mnt/PV/versioned/dv_data/dv_instance_users.txt file if you do not want to export and import existing users.
- You copied any JAR files from JDBC drivers that you downloaded when you configured data source connections in Cloud Pak for Data 3.5.3 to /mnt/PV/versioned/private. For more information, see Exporting users and custom JARs before you upgrade Data Virtualization.
- If you installed Db2 Data Management Console Version 3.5, you upgraded it to the same version as Data Virtualization. For more information, see Upgrading Db2 Data Management Console from the Version 3.5 release.
- Run the upgrade script. The script makes a backup of Data Virtualization that is used during the db2u provisioning stage. For more information, see Provisioning the Data Virtualization service.
./dv-migration.sh
- Provision Data Virtualization from the user interface. Data Virtualization automatically runs the upgrade as soon as you provision a Data Virtualization instance.
Verifying the upgrade
When you create the custom resource, the Data Virtualization operator processes the contents of the custom resource and updates the microservices that make up Data Virtualization, including DvService. (The DvService microservice is defined by the dv-service custom resource.) Data Virtualization is upgraded when the DvService status is True.
To check the status of the upgrade:
- Change to the project where the Cloud Pak for Data control plane is installed:
oc project project-name
- Log in to the Data Virtualization head pod:
oc rsh c-db2u-dv-db2u-0 bash
- Verify that the following files exist:
- /mnt/bludata0/dv/versioned/marker_files/.upgraded
- The output of db2uctl markers list, which contains the following content:
(Db2u) QP_START_PERFORMED
(Db2u) DV_CACHE_INITIALIZED
- Verify that the following files do not exist (a scripted check follows this list):
- /mnt/bludata0/dv/versioned/marker_files/.fgac_state
- /mnt/bludata0/dv/versioned/marker_files/.is_dv_upgrade
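From inside the head pod, you can script these checks (a sketch; it assumes the marker file paths and Db2u markers listed above):
# Run inside the head pod (oc rsh c-db2u-dv-db2u-0 bash).
MARKERS=/mnt/bludata0/dv/versioned/marker_files
# The upgrade marker must exist; the state markers must be gone.
[ -f "$MARKERS/.upgraded" ] && echo "OK: .upgraded exists"
[ ! -f "$MARKERS/.fgac_state" ] && echo "OK: .fgac_state removed"
[ ! -f "$MARKERS/.is_dv_upgrade" ] && echo "OK: .is_dv_upgrade removed"
# Confirm that the Db2u markers are set.
db2uctl markers list | grep -E 'QP_START_PERFORMED|DV_CACHE_INITIALIZED'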
Upgrading remote connectors
You can upgrade remote connectors by using the UPDATEREMOTECONNECTOR stored
procedure. You can run this procedure by using the SQL editor or the Db2 console on the cluster.
- To update all remote connectors, run the following stored procedure:
call dvsys.updateremoteconnector('',?,?)
- If you need to upgrade only a subset of remote connectors, pass in a comma-separated list:
call dvsys.updateremoteconnector('<REMOTE_CONNECTOR_NODES>',?,?)
You can obtain the <REMOTE_CONNECTOR_NODES> values by running the following command:
select node_name from dvsys.listnodes where AGENT_CLASS='R'
- If you notice that remote connectors do not appear in the user interface after the upgrade, run the following stored procedure on the head pod:
CALL DVSYS.DEFINEGATEWAYS('<hostname>:<port>')
Where <hostname> is the hostname of the remote connector and <port> is the port number that the remote connector uses to connect to Data Virtualization. After you run this stored procedure, the remote connector appears in the user interface and in the output of dvsys.listnodes. See also Defining gateway configuration to access isolated remote connectors.
- To troubleshoot issues, see Updating remote connectors might fail with a Java™ exception after you upgrade Data Virtualization.
Rolling back the upgrade
Rolling back the upgrade is not supported when upgrading from the Version 3.5 release to a Version 4.0.x release.
What to do next
- Set the OwnerReference for any manually created PVCs. This step ensures that the PVC is deleted when the instance is deleted in the future. This step is not required for the upgrade to work and can be done at any time before you delete the instance (a verification sketch follows this list).
oc patch pvc/bigsql-c-db2u-dv-db2u-0 -p '{"metadata":{"ownerReferences": [{"apiVersion":"db2u.databases.ibm.com/v1","kind":"Formation","name":"db2u-dv","uid":"'$(oc get db2u db2u-dv -o jsonpath='{.metadata.uid}')'"}]}}' --type=merge
- Update the host and port of all connections to Data Virtualization to the new connection values, which you can retrieve from the Connection Information page. You must apply these values to all external connections, connections in notebooks, and connections in Cloud Pak for Data that weren't created with the Data Virtualization connection type.
- After Data Virtualization is upgraded, a new certificate is used for making SSL connections to Data Virtualization. You can download the new SSL certificate from the Data Virtualization instance page in the IBM Cloud Pak for Data web interface to update the configuration of applications connecting to Data Virtualization.
- Configure network requirements, including HAPROXY settings, if required. For more information, see Network requirements for Data Virtualization.
- Optionally, set up automatic pruning of the archive log.
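To confirm that the ownerReferences patch from the first step took effect, you can read the field back (a sketch):
# The output should show the Formation db2u-dv as the owner of the PVC.
oc get pvc bigsql-c-db2u-dv-db2u-0 \
  -o jsonpath='{.metadata.ownerReferences[*].kind}/{.metadata.ownerReferences[*].name}{"\n"}'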
Data Virtualization is ready to use. For more information, see Virtualizing data with Data Virtualization.