Upgrading DataStage from Version 4.7 to Version 5.0
An instance administrator can upgrade DataStage from Cloud Pak for Data Version 4.7 to Version 5.0.
- Who needs to complete this task?
  Instance administrator: To upgrade DataStage, you must be an instance administrator. An instance administrator has permission to manage software in the following projects:
  - The operators project for the instance
    The operators for this instance of DataStage are installed in the operators project. In the upgrade commands, the ${PROJECT_CPD_INST_OPERATORS} environment variable refers to the operators project.
  - The operands project for the instance
    The custom resources for the control plane and DataStage are installed in the operands project. In the upgrade commands, the ${PROJECT_CPD_INST_OPERANDS} environment variable refers to the operands project.
- When do you need to complete this task?
  Review the following options to determine whether you need to complete this task:
  - If you want to upgrade the control plane and one or more services at the same time, follow the process in Upgrading an instance of Cloud Pak for Data instead.
  - If you didn't upgrade DataStage when you upgraded the control plane, complete this task to upgrade DataStage.
  Repeat as needed: If you are responsible for multiple instances of Cloud Pak for Data, you can repeat this task to upgrade more instances of DataStage on the cluster.
Information you need to complete this task
Review the following information before you upgrade DataStage:
- Version requirements
  All the components that are associated with an instance of Cloud Pak for Data must be installed at the same release. For example, if the Cloud Pak for Data control plane is at Version 5.0.3, you must upgrade DataStage to Version 5.0.3.
- Environment variables
  The commands in this task use environment variables so that you can run the commands exactly as written.
  - If you don't have the script that defines the environment variables, see Setting up installation environment variables.
  - To use the environment variables from the script, you must source the environment variables before you run the commands in this task. For example, run:
    source ./cpd_vars.sh
- Common core services
  DataStage requires the Cloud Pak for Data common core services.
  If the common core services are not at the correct version in the operands project for the instance, the common core services are automatically upgraded when you upgrade DataStage. The common core services upgrade increases the amount of time the upgrade takes to complete.
- Storage requirements
  Specify the storage that you use in your existing installation. You cannot change the storage that is associated with DataStage during an upgrade. Ensure that the environment variables point to the correct storage classes for your environment.
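As a pre-flight check, the following sketch confirms that the variables referenced by the commands in this task are set after you source the script. The function name check_required_vars is hypothetical and is not part of cpd-cli; the variable names in the usage comment are the ones that appear in this task.

```shell
# Report every named variable that is unset or empty.
# Returns 0 when all of them are set, 1 otherwise.
# check_required_vars is a hypothetical helper, not a cpd-cli feature.
check_required_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup that works in POSIX sh as well as bash.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Possible usage after "source ./cpd_vars.sh":
#   check_required_vars VERSION PROJECT_CPD_INST_OPERATORS \
#       PROJECT_CPD_INST_OPERANDS STG_CLASS_BLOCK STG_CLASS_FILE
```

A check like this only proves that the variables are non-empty. It does not prove that the storage classes they name exist on the cluster.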
Before you begin
This task assumes that the following prerequisites are met:
Prerequisite | Where to find more information |
---|---|
The cluster meets the minimum requirements for DataStage. | If this task is not complete, see System requirements. |
The workstation from which you will run the upgrade is set up as a client workstation and has the command-line interfaces that the upgrade commands require. | If this task is not complete, see Updating client workstations. |
The Cloud Pak for Data control plane is upgraded. | If this task is not complete, see Upgrading an instance of Cloud Pak for Data. |
For environments that use a private container registry, such as air-gapped environments, the DataStage software images are mirrored to the private container registry. | If this task is not complete, see Mirroring images to a private container registry. |
For environments that use a private container registry, such as air-gapped environments, the cpd-cli is configured to pull the olm-utils-v3 image from the private container registry. | If this task is not complete, see Pulling the olm-utils-v3 image from the private container registry. |
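The commands in this task call cpd-cli and oc, so one quick way to check the workstation prerequisite is to confirm that both are on the PATH. The following is a minimal sketch; require_cli is a hypothetical helper name, and the check says nothing about whether the installed versions are supported.

```shell
# Succeed only if every named command can be found on the PATH.
# require_cli is a hypothetical helper for illustration.
require_cli() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "Not found on PATH: $tool" >&2
            status=1
        fi
    done
    return "$status"
}

# Possible usage before you start the upgrade:
#   require_cli cpd-cli oc || echo "Set up the client workstation first."
```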
Procedure
Complete the following tasks to upgrade DataStage:
Specifying your DataStage edition
DataStage is available in two editions: DataStage Enterprise and DataStage Enterprise Plus. You must specify which edition to upgrade.
- For DataStage Enterprise, run
export DATASTAGE_TYPE=datastage_ent
- For DataStage Enterprise Plus, run
export DATASTAGE_TYPE=datastage_ent_plus
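Because later commands pass ${DATASTAGE_TYPE} to the --components option, a typo in the value is easy to miss. The following sketch rejects any value other than the two shown above. The function name validate_datastage_type is an assumption made for illustration.

```shell
# Accept only the two edition values that this task documents.
# validate_datastage_type is a hypothetical helper.
validate_datastage_type() {
    case "${1:-}" in
        datastage_ent|datastage_ent_plus)
            return 0
            ;;
        *)
            echo "Unexpected DATASTAGE_TYPE: '${1:-}'" >&2
            return 1
            ;;
    esac
}

# Possible usage:
#   validate_datastage_type "${DATASTAGE_TYPE}" || echo "Fix DATASTAGE_TYPE."
```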
Upgrading the service
Note: The cpd-cli manage apply-olm command updates all of the OLM objects in the operators project at the same time, so the OLM objects for DataStage were updated when you upgraded the control plane.
To upgrade DataStage:
- Log the cpd-cli in to the Red Hat® OpenShift Container Platform cluster:
  ${CPDM_OC_LOGIN}
  Remember: CPDM_OC_LOGIN is an alias for the cpd-cli manage login-to-ocp command.
- Update the custom resource for DataStage.
The command that you run depends on the storage on your cluster:
Red Hat OpenShift Data Foundation storage
Run the following command to update the custom resource.
cpd-cli manage apply-cr \
--components=${DATASTAGE_TYPE} \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true \
--upgrade=true
IBM Storage Fusion Data Foundation storage
Run the following command to update the custom resource.
cpd-cli manage apply-cr \
--components=${DATASTAGE_TYPE} \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true \
--upgrade=true
IBM Storage Scale Container Native storage
Remember: When you use IBM Storage Scale Container Native storage, both ${STG_CLASS_BLOCK} and ${STG_CLASS_FILE} point to the same storage class, typically ibm-spectrum-scale-sc.
Run the following command to update the custom resource.
cpd-cli manage apply-cr \
--components=${DATASTAGE_TYPE} \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true \
--upgrade=true
Portworx storage
Run the following command to update the custom resource.
cpd-cli manage apply-cr \
--components=${DATASTAGE_TYPE} \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--storage_vendor=portworx \
--license_acceptance=true \
--upgrade=true
NFS storage
Remember: When you use NFS storage, both ${STG_CLASS_BLOCK} and ${STG_CLASS_FILE} point to the same storage class, typically managed-nfs-storage.
Run the following command to update the custom resource.
cpd-cli manage apply-cr \
--components=${DATASTAGE_TYPE} \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true \
--upgrade=true
AWS with EFS storage only
Remember: When you use EFS storage, both ${STG_CLASS_BLOCK} and ${STG_CLASS_FILE} point to the same storage class, typically efs-nfs-client.
Run the following command to update the custom resource.
cpd-cli manage apply-cr \
--components=${DATASTAGE_TYPE} \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true \
--upgrade=true
AWS with EFS and EBS storage
Run the following command to update the custom resource.
cpd-cli manage apply-cr \
--components=${DATASTAGE_TYPE} \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true \
--upgrade=true
NetApp Trident
Remember: When you use NetApp Trident storage, both ${STG_CLASS_BLOCK} and ${STG_CLASS_FILE} point to the same storage class.
Run the following command to update the custom resource.
cpd-cli manage apply-cr \
--components=${DATASTAGE_TYPE} \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true \
--upgrade=true
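The storage-specific commands in this step differ only in their storage options: Portworx passes --storage_vendor=portworx, and every other storage type passes the block and file storage classes. The following dry-run sketch makes that difference explicit by printing the command instead of running it. The function name build_apply_cr and its single argument are assumptions for illustration. The options themselves are copied from the commands in this step.

```shell
# Print the apply-cr command for a storage type without running it.
# $1 is "portworx" or any other label (treated as storage-class based).
# build_apply_cr is a hypothetical helper, not part of cpd-cli.
build_apply_cr() {
    storage="$1"
    cmd="cpd-cli manage apply-cr"
    cmd="$cmd --components=${DATASTAGE_TYPE}"
    cmd="$cmd --release=${VERSION}"
    cmd="$cmd --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS}"
    if [ "$storage" = "portworx" ]; then
        cmd="$cmd --storage_vendor=portworx"
    else
        cmd="$cmd --block_storage_class=${STG_CLASS_BLOCK}"
        cmd="$cmd --file_storage_class=${STG_CLASS_FILE}"
    fi
    cmd="$cmd --license_acceptance=true --upgrade=true"
    echo "$cmd"
}

# Possible usage: review the command before you run it yourself.
#   build_apply_cr portworx
#   build_apply_cr nfs
```

Printing the command first lets you confirm that every variable expanded to the value you expect before anything changes on the cluster.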
Validating the upgrade
DataStage is upgraded when the apply-cr command returns:
[SUCCESS]... The apply-cr command ran successfully
If you want to confirm that the custom resource status is Completed, you can run the cpd-cli manage get-cr-status command.
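Both checks can be scripted. The following sketch assumes that you saved the apply-cr output to a file, and it assumes that get-cr-status accepts the same --cpd_instance_ns and --components options as apply-cr. Verify those options against the cpd-cli command reference before you rely on them. The function names are hypothetical.

```shell
# Succeed if the apply-cr output on stdin contains the success message.
# The pattern allows any text between "[SUCCESS]" and the message.
apply_cr_succeeded() {
    grep -Eq '\[SUCCESS\].*The apply-cr command ran successfully'
}

# Assumed invocation of get-cr-status. The two options are assumptions
# carried over from apply-cr. Defined here but not run automatically.
show_cr_status() {
    cpd-cli manage get-cr-status \
        --cpd_instance_ns="${PROJECT_CPD_INST_OPERANDS}" \
        --components="${DATASTAGE_TYPE}"
}

# Possible usage, with apply-cr.log as a hypothetical file name:
#   apply_cr_succeeded < apply-cr.log && echo "DataStage is upgraded"
#   show_cr_status
```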
Upgrading existing service instances
If you are upgrading from IBM Cloud Pak for Data Version 4.7.2 or later, or if you are upgrading to Version 5.0.1 or later, the service instances are automatically upgraded when you upgrade DataStage. Otherwise, complete the following steps:
- Update the PXRuntime custom resources to add the upgrade_force flag.
  oc -n ${PROJECT_CPD_INST_OPERANDS} get pxruntime \
  | awk 'NR>1 { print $1 }' \
  | xargs -I % oc -n ${PROJECT_CPD_INST_OPERANDS} patch pxruntime % --type=merge -p '{"spec":{"upgrade_force": true}}'
- Check the status of the custom resources.
  oc -n ${PROJECT_CPD_INST_OPERANDS} get pxruntime
  Confirm that the status is Completed and that the values of VERSION and RECONCILED are the same. For example:
  NAME            VERSION   RECONCILED   STATUS      AGE
  ds-px-default   5.0.0     5.0.0        Completed   2d14h
- After the upgrade is complete, remove the added field.
  oc -n ${PROJECT_CPD_INST_OPERANDS} get pxruntime \
  | awk 'NR>1 { print $1 }' \
  | xargs -I % oc -n ${PROJECT_CPD_INST_OPERANDS} patch pxruntime % --type='json' -p='[{"op": "remove", "path": "/spec/upgrade_force"}]'
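The status check on the PXRuntime custom resources can be automated by parsing the table that oc get pxruntime prints. The following sketch assumes the column order shown in the example output (NAME, VERSION, RECONCILED, STATUS, AGE). The function name pxruntime_ready is hypothetical.

```shell
# Read "oc get pxruntime" output on stdin. Succeed only if there is at
# least one instance and every instance has STATUS "Completed" with
# matching VERSION and RECONCILED values.
# Assumes the columns NAME VERSION RECONCILED STATUS AGE, in that order.
pxruntime_ready() {
    awk 'NR > 1 {
             rows++
             if ($2 != $3 || $4 != "Completed") bad++
         }
         END {
             if (rows > 0 && bad == 0) exit 0
             exit 1
         }'
}

# Possible usage:
#   oc -n ${PROJECT_CPD_INST_OPERANDS} get pxruntime | pxruntime_ready \
#       && echo "All PXRuntime instances are reconciled"
```

Treating an empty table as a failure is a deliberate choice in this sketch, so that a wrong project name does not look like a successful upgrade.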
What to do next
The service is ready to use. See Transforming data.