Upgrading Db2 Big SQL from Version 3.5 to Version 4.6

A project administrator can upgrade Db2 Big SQL from Cloud Pak for Data Version 3.5 to Version 4.6.0, 4.6.1, or 4.6.2.

Important: To complete this task, you must be running Db2 Big SQL Version 7.1.1 or later.
Supported upgrade paths
If you are running Db2 Big SQL 7.1.1 or later, you can upgrade to Version 4.6.0, 4.6.1, or 4.6.2.
Unsupported upgrade paths
You cannot upgrade directly from Version 3.5 to Version 4.6.3 or later. You must first upgrade to Version 4.6.2, and then upgrade to Version 4.6.3 or later.
What permissions do you need to complete this task?
The permissions that you need depend on which tasks you must complete:
  • To create the Db2 Big SQL operators, you must have the appropriate permissions to create operators and you must be an administrator of the project where the Cloud Pak for Data operators are installed. This project is identified by the ${PROJECT_CPD_OPS} environment variable.
  • To upgrade Db2 Big SQL, you must be an administrator of the project where Db2 Big SQL is installed. This project is identified by the ${PROJECT_CPD_INSTANCE} environment variable.
When do you need to complete this task?
If you didn't upgrade Db2 Big SQL when you upgraded the platform, you can complete this task to upgrade your existing Db2 Big SQL installation.

If you want to upgrade all of the Cloud Pak for Data components at the same time, follow the process in Upgrading the platform and services instead.

Important: All of the Cloud Pak for Data components in a deployment must be installed at the same release.

Information you need to complete this task

Review the following information before you upgrade Db2 Big SQL:

Environment variables
The commands in this task use environment variables so that you can run the commands exactly as written.
  • If you don't have the script that defines the environment variables, see Setting up installation environment variables.
  • To use the environment variables from the script, you must source the environment variables before you run the commands in this task, for example:
    source ./cpd_vars.sh
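    As a reference, the following is a hypothetical excerpt of the variables from cpd_vars.sh that this task uses; the values are placeholders, and your script might define many more variables:

    # Hypothetical excerpt from cpd_vars.sh; replace the placeholder values with your own.
    export PROJECT_CPD_OPS=cpd-operators      # Project where the Cloud Pak for Data operators are installed
    export PROJECT_CPD_INSTANCE=cpd-instance  # Project where the control plane and Db2 Big SQL are installed
    export VERSION=4.6.2                      # Cloud Pak for Data release that you are upgrading to
    export OCP_USERNAME=<openshift_username>
    export OCP_PASSWORD=<openshift_password>
    export OCP_URL=<openshift_api_server_url>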
Security context constraint requirements
Db2 Big SQL uses a custom security context constraint (SCC). For details, see Creating the custom SCC for Db2 Big SQL.
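To confirm that the SCC already exists on your cluster, you can run a quick check such as the following (the SCC name bigsql matches the name that is used later in this task):
    oc get scc bigsql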
Installation location
Db2 Big SQL is installed in the same project (namespace) as the Cloud Pak for Data control plane. This project is identified by the ${PROJECT_CPD_INSTANCE} environment variable.
Storage requirements
You don't need to specify storage when you upgrade Db2 Big SQL.

Before you begin

This task assumes that the following prerequisites are met:

  • The cluster meets the minimum requirements for Db2 Big SQL. If this task is not complete, see System requirements.
  • The workstation from which you will run the upgrade is set up as a client workstation and includes the following command-line interfaces:
    • Cloud Pak for Data CLI: cpd-cli
    • OpenShift® CLI: oc
    If this task is not complete, see Setting up a client workstation.
  • The Cloud Pak for Data control plane is upgraded. If this task is not complete, see Upgrading the platform and services.
  • For environments that use a private container registry, such as air-gapped environments, the Db2 Big SQL software images are mirrored to the private container registry. If this task is not complete, see Mirroring images to a private container registry.
  • The custom security context constraint (SCC) that is required to run Db2 Big SQL exists. If this task is not complete, see Creating the custom SCC for Db2 Big SQL.

Preparing for the upgrade

Do the following steps before you upgrade Db2 Big SQL.

  1. Follow the procedure to install the latest version of Db2 Big SQL in the upgraded Cloud Pak for Data control plane.
    Important: Do not provision the Db2 Big SQL instance from the Cloud Pak for Data UI. Stop after the Db2 Big SQL service pods (bigsql-addon and bigsql-service-provider) are up and running.
  2. Download and extract the Db2 Big SQL migration tarball.
    1. In your browser, log in to the Cloud Pak for Data web client.
    2. Download the tarball by entering the following URL in a browser (an alternative command-line download is sketched after these steps):
      https://<Cloud_Pak_for_Data_web_client_URL>/icp4data-addon/bigsql/add-ons/upgrade.tgz
    3. Copy the tarball to your OpenShift infrastructure node.
    4. Extract the tarball:
      tar -xzf upgrade.tgz
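    As an alternative to the browser-based download in substeps 2 and 3, if your workstation can reach the web client route directly, a command-line download along the following lines might work. This is a sketch; depending on how authentication is configured for the route in your environment, the browser-based download might be required.
      # Hypothetical command-line download and extraction; authentication requirements vary by environment.
      curl -kL "https://<Cloud_Pak_for_Data_web_client_URL>/icp4data-addon/bigsql/add-ons/upgrade.tgz" -o upgrade.tgz
      tar -xzf upgrade.tgz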

Procedure

Complete the following tasks to upgrade Db2 Big SQL:

  1. Logging in to the cluster
  2. Installing the operator
  3. Upgrading the service
  4. Validating the upgrade
  5. Upgrading existing service instances
  6. Verifying the upgraded Db2 Big SQL instance
  7. What to do next

Logging in to the cluster

To run cpd-cli manage commands, you must log in to the cluster.

To log in to the cluster:

  1. Run the cpd-cli manage login-to-ocp command to log in to the cluster as a user with sufficient permissions to complete this task. For example:
    cpd-cli manage login-to-ocp \
    --username=${OCP_USERNAME} \
    --password=${OCP_PASSWORD} \
    --server=${OCP_URL}
    Tip: The login-to-ocp command takes the same input as the oc login command. Run oc login --help for details.
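    Because login-to-ocp accepts the same input as oc login, a token-based login should also work, for example (a sketch; ${OCP_TOKEN} is a hypothetical environment variable that holds an OpenShift API token):
    cpd-cli manage login-to-ocp \
    --token=${OCP_TOKEN} \
    --server=${OCP_URL}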

Installing the operator

The Db2 Big SQL operator simplifies the process of managing the Db2 Big SQL service on Red Hat® OpenShift Container Platform.

To upgrade Db2 Big SQL, you must install the Db2 Big SQL operator and create the Operator Lifecycle Manager (OLM) objects, such as the catalog source and subscription, for the operator.

Who needs to complete this task?
You must be a cluster administrator (or a user with the appropriate permissions to install operators) to create the OLM objects.
When do you need to complete this task?
Complete this task if the Db2 Big SQL operator and other OLM artifacts have not been created for the current release.

You do not need to run this command separately for each service that you plan to upgrade. If the OLM artifacts already exist on the cluster when you run this command, the cpd-cli recreates the OLM objects for all of the existing components in the ${PROJECT_CPD_OPS} project.

To install the operator:

  1. Create the OLM objects for Db2 Big SQL:
    cpd-cli manage apply-olm \
    --release=${VERSION} \
    --cpd_operator_ns=${PROJECT_CPD_OPS} \
    --components=bigsql
    • If the command succeeds, it returns [SUCCESS]... The apply-olm command ran successfully.
    • If the command fails, it returns [ERROR] and includes information about the cause of the failure.
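    If you want to inspect the resulting OLM objects directly, a check along the following lines lists the Db2 Big SQL catalog source, subscription, and cluster service version (a sketch; exact object names vary by release):
    oc get catalogsource,subscription,csv -n ${PROJECT_CPD_OPS} | grep -i bigsql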

What to do next: Upgrade the Db2 Big SQL service.

Upgrading the service

After the Db2 Big SQL operator is installed, you can upgrade Db2 Big SQL.

Who needs to complete this task?
You must be an administrator of the project where Db2 Big SQL is installed.
When do you need to complete this task?
Complete this task for each instance of Db2 Big SQL that is associated with an instance of Cloud Pak for Data Version 4.6.

To upgrade the service:

  1. Create the custom resource for Db2 Big SQL.
    cpd-cli manage apply-cr \
    --components=bigsql \
    --release=${VERSION} \
    --cpd_instance_ns=${PROJECT_CPD_INSTANCE} \
    --license_acceptance=true

Validating the upgrade

Db2 Big SQL is upgraded when the apply-cr command returns [SUCCESS]... The apply-cr command ran successfully.

However, you can optionally run the cpd-cli manage get-cr-status command if you want to confirm that the custom resource status is Completed:

cpd-cli manage get-cr-status \
--cpd_instance_ns=${PROJECT_CPD_INSTANCE} \
--components=bigsql
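
You can also list the Db2 Big SQL custom resources directly with oc; the bigsql resource kind is the same one that is used later in this task to verify individual instances:

oc get bigsql -n ${PROJECT_CPD_INSTANCE}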

Upgrading existing service instances

After you upgrade Db2 Big SQL, you must upgrade any service instances that are associated with Db2 Big SQL.

To upgrade the service instances:

  1. Scale down the Db2 Big SQL instance to one worker.

    To simplify the upgrade process, run the following command to scale the instance down to a single worker:

    oc scale StatefulSet bigsql-<old_instance_id>-worker --replicas=1
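    You can confirm that the scale-down completed by checking that only one worker pod remains, for example:
    oc get pods | grep bigsql-<old_instance_id>-worker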
  2. Export users:
    python bigsql_user_migration.py export <Cloud_Pak_for_Data_cluster_hostname> <old_instance_id> <exported_data_filepath> <admin_username> <admin_password>
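    For example, with hypothetical values (a web client hostname of cpd.example.com, an old instance ID of 1601929389240181, and an export file in /tmp):
    python bigsql_user_migration.py export cpd.example.com 1601929389240181 /tmp/bigsql_users.json admin <admin_password>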
    
  3. Edit the values.template file in the templates subdirectory from the extracted tarball and change the following settings:
    serviceInstanceDisplayName
    This value must be a string in the form Db2-Big-SQL-N, where N is an integer. Use 1 for the first instance that you are migrating, 2 for the second instance, and so on.
    headStorageSize
    This value must be a PersistentVolume capacity specification. Use the larger of the capacity that is currently allocated to the Db2 Big SQL head PersistentVolume and the capacity that is allocated to the Db2 Big SQL worker PersistentVolumes. You can check these capacities by running the oc get pvc | grep bigsql-<old_instance_id> command. Make sure to include the capacity unit (for example, Gi) after the numeric value.
    storageClassName
    This value must be the name of the storage class that the persistent volumes of the upgraded instance will use. In most cases, this is the same storage class as the current volumes (run oc get pvc | grep bigsql-<old_instance_id> to identify it). However, if you are using Portworx storage, the storage class that is supported by Db2 Big SQL changed in Cloud Pak for Data 4.5.0 to portworx-db2-rwx-sc.

    For example:

    serviceInstanceDisplayName: Db2-Big-SQL-1
    headStorageSize: 500Gi
    storageClassName: managed-nfs-storage
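
    To gather the current capacities and storage class in one pass, you can list the instance's PVCs with custom columns (a sketch; the field paths are standard PersistentVolumeClaim fields):
    oc get pvc -o custom-columns="NAME:.metadata.name,CAPACITY:.spec.resources.requests.storage,STORAGECLASS:.spec.storageClassName" | grep bigsql-<old_instance_id>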
  4. Run the migration script.
    ./bigsql-migration.sh -instance <instance_id>
    Note: If you are upgrading Db2 Big SQL for the first time, or if no previous upgrade attempt failed, you might see the following error messages:
    error: deployments/bigsql-<old_instance_id>-hcat-db volume "bigsql-backup" not found
    error: deployments/bigsql-<old_instance_id>-head volume "bigsql-backup" not found

    You can ignore these messages.

  5. Provision Db2 Big SQL:
    Note: The following YAML files are generated by bigsql-migration.sh in the same directory.
    oc apply -f bigsql-cr-<old_instance_id>-<new_instance_id>.yaml
    oc apply -f bigsql-cm-<old_instance_id>-<new_instance_id>.yaml

Verifying the upgraded Db2 Big SQL instance

To verify that a Db2 Big SQL instance was successfully upgraded, do the following steps.

  1. Identify the Db2 Big SQL instance ID by running the following command:
    oc get cm -l component=db2bigsql -o custom-columns="Instance Id:{.data.instance_id},Instance Name:{.data.instance_name},Created:{.metadata.creationTimestamp}"
  2. To check that the upgrade is complete, check that the custom resource is in a Ready state by running the following command:
    oc get bigsql bigsql-<instance_id>
  3. To confirm that the upgrade was successful and the cluster is operational, run a smoke test as the db2inst1 user:
    head_pod=$(oc get pod -l app=bigsql-<instance_id>,name=dashmpp-head-0 --no-headers=true -o=custom-columns=NAME:.metadata.name)
    
    # If connected to a Hadoop cluster 
    oc exec -it $head_pod -- bash -lc '/usr/ibmpacks/current/bigsql/bigsql/install/bigsql-smoke.sh'
    
    # If connected exclusively to an Object Store service, you must provide the name of a bucket that exists on the storage service to execute the smoke test
    oc exec -it $head_pod -- bash -lc '/usr/ibmpacks/current/bigsql/bigsql/install/bigsql-smoke.sh -o<bucket_name>'
  4. To further validate the operational state of the cluster, connect an application or JDBC client to the instance and run workloads.
    1. You must first disable access to the previous Db2 Big SQL instance and enable access to the newly installed one. Disabling access affects only connectivity; the previous instance and the data that it holds remain in the cluster.

      To disable access over JDBC to the previous Db2 Big SQL instance, run the following command:

      oc delete service bigsql-<old_instance_id>-jdbc
    2. Enable access to the new instance (a quick service check is sketched after this step).

    You can now connect to the new instance to run workloads.
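    Assuming that the new instance follows the same service naming convention as the old one, you can confirm that its JDBC service is exposed before you connect, for example:

      oc get service bigsql-<new_instance_id>-jdbc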

  5. If an upgrade failed, restore the Version 3.5 instance before you retry the upgrade.
    1. Delete the new instance that failed to upgrade:
      newInstanceId=<new_instance_id>
      
      oc delete bigsql bigsql-${newInstanceId}
      
      oc delete cm -l app=bigsql-${newInstanceId}
    2. Scale up the head, Hive metastore database, and worker nodes for the old instance:
      oldInstanceId=<old_instance_id>
      
      oc scale deploy bigsql-${oldInstanceId}-head --replicas=1
      oc scale deploy bigsql-${oldInstanceId}-hcat-db --replicas=1
      oc scale sts bigsql-${oldInstanceId}-worker --replicas=1

What to do next

If you made any custom configuration changes in the version 3.5 release, review them. If they are still needed, reapply them after the upgrade is completed.

Because of differences between Db2 Big SQL 7.1.1 and Db2 Big SQL 7.3.x, you must take some additional actions after the instance upgrade is completed.
  • The instance user is now db2inst1 instead of bigsql. The bigsql user no longer exists after the upgrade and cannot be used to connect to Db2 Big SQL. If external applications connect with the bigsql user ID, update them to use db2inst1 instead (see the example connection URL after this list).
  • After Db2 Big SQL is upgraded to 7.4.4, the URL that is used for external connections to Db2 Big SQL is different from the URL in 7.1.1. See Setting up a connection to Db2 Big SQL to determine the new URL after the upgrade, and update the configuration of existing applications that connect to Db2 Big SQL to use the new URL.
  • After Db2 Big SQL is upgraded, a new certificate is used for making SSL connections to Db2 Big SQL. You can download the new SSL certificate from the Db2 Big SQL instance page in the IBM Cloud Pak for Data web interface to update the configuration of applications connecting to Db2 Big SQL.
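As an illustration of the first and third points, a JDBC connection URL for the upgraded instance might look like the following. The host, port, database name, and certificate path are placeholders that depend on your deployment; sslConnection and sslCertLocation are standard IBM Db2 JDBC driver properties:

  jdbc:db2://<host>:<port>/BIGSQL:user=db2inst1;password=<password>;sslConnection=true;sslCertLocation=<path_to_downloaded_certificate>;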

Import users by running the following command:

python bigsql_user_migration.py import <Cloud_Pak_for_Data_cluster_hostname> <new_instance_id> <exported_data_filepath> <admin_username> <admin_password>

Optional: After you have upgraded your Db2 Big SQL instances, delete the version 3.5 instances.

Important: These steps are not reversible. These steps remove instances and their associated data.
  1. For each Db2 Big SQL instance that was successfully upgraded, run the following steps to delete it and its associated resources.
    1. In the project where the instance is deployed, delete the services that are associated with the instance:
      oc delete service bigsql-<old_instance_id> bigsql-<old_instance_id>-jdbc
    2. In the project where the instance is deployed, delete the deployment objects that are associated with the instance:
      oc delete deployment bigsql-<old_instance_id>-head 
      oc delete deployment bigsql-<old_instance_id>-hcat-db --ignore-not-found=true
      oc delete StatefulSet bigsql-<old_instance_id>-worker
    3. Delete any remaining PersistentVolumeClaims (PVCs) that are associated with the instance:
      # Delete any Db2 Big SQL 7.1.1 backup pod remnant:
      oc get pods -l app.kubernetes.io/instance=<old_instance_id>,app.kubernetes.io/component=backup
      oc delete pod <pod-returned-by-previous-command>
      
      oc delete pvc bigsql-<old_instance_id>-head 
      oc delete pvc bigsql-<old_instance_id>-hcat-db --ignore-not-found=true
      oc delete pvc -l app=db2-bigsql,app.kubernetes.io/component=worker,app.kubernetes.io/instance=<old_instance_id>
  2. When all Db2 Big SQL instances are successfully removed, remove the administrative objects that are associated with the Db2 Big SQL service:
    oc delete scc bigsql 
    oc delete sa bigsql
  3. If you want to retain the same port allocations in the new instance, run the following command:
    oc apply -f node-port-<old_instance_id>-<new_instance_id>.yaml