Upgrading Execution Engine for Apache Hadoop from Version 5.0.x to a later 5.0 refresh
An instance administrator can upgrade Execution Engine for Apache Hadoop from Cloud Pak for Data Version 5.0.x to a later 5.0 refresh.
- Who needs to complete this task?

  Instance administrator. To upgrade Execution Engine for Apache Hadoop, you must be an instance administrator. An instance administrator has permission to manage software in the following projects:

  - The operators project for the instance

    The operators for this instance of Execution Engine for Apache Hadoop are installed in the operators project. In the upgrade commands, the ${PROJECT_CPD_INST_OPERATORS} environment variable refers to the operators project.

  - The operands project for the instance

    The custom resources for the control plane and Execution Engine for Apache Hadoop are installed in the operands project. In the upgrade commands, the ${PROJECT_CPD_INST_OPERANDS} environment variable refers to the operands project.
- When do you need to complete this task?

  Review the following options to determine whether you need to complete this task:

  - If you want to upgrade the Cloud Pak for Data control plane and one or more services at the same time, follow the process in Upgrading an instance of Cloud Pak for Data instead.
  - If you didn't upgrade Execution Engine for Apache Hadoop when you upgraded the Cloud Pak for Data control plane, complete this task to upgrade Execution Engine for Apache Hadoop.

  Repeat as needed: If you are responsible for multiple instances of Cloud Pak for Data, you can repeat this task to upgrade more instances of Execution Engine for Apache Hadoop on the cluster. One way to script the repetition is sketched after this list.
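If you manage several instances, one possible approach is to keep a separate environment-variable script per instance and loop over them. The following is only a sketch under that assumption; the file names are hypothetical and the per-instance scripts are not part of the documented procedure. It reuses only the commands shown later in this task.

#!/usr/bin/env bash
# Hypothetical sketch: upgrade Execution Engine for Apache Hadoop in several
# Cloud Pak for Data instances by sourcing one environment-variable script per
# instance. File names below are examples only; adjust them to your environment.
set -euo pipefail

for vars_file in ./cpd_vars_instance1.sh ./cpd_vars_instance2.sh; do
    # Load the environment variables for this instance
    # (PROJECT_CPD_INST_OPERANDS, VERSION, STG_CLASS_*, CPDM_OC_LOGIN, ...).
    source "${vars_file}"

    # Log the cpd-cli in to the cluster for this instance.
    ${CPDM_OC_LOGIN}

    # Upgrade the service in this instance (same flags as shown in the Procedure).
    cpd-cli manage apply-cr \
    --components=hee \
    --release=${VERSION} \
    --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
    --block_storage_class=${STG_CLASS_BLOCK} \
    --file_storage_class=${STG_CLASS_FILE} \
    --license_acceptance=true \
    --upgrade=true
done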
Information you need to complete this task
Review the following information before you upgrade Execution Engine for Apache Hadoop:
- Version requirements

  All the components that are associated with an instance of Cloud Pak for Data must be installed at the same release. For example, if the Cloud Pak for Data control plane is at Version 5.0.3, you must upgrade Execution Engine for Apache Hadoop to Version 5.0.3.

- Environment variables

  The commands in this task use environment variables so that you can run the commands exactly as written.

  - If you don't have the script that defines the environment variables, see Setting up installation environment variables.
  - To use the environment variables from the script, you must source the environment variables before you run the commands in this task. For example, run:

    source ./cpd_vars.sh

  An example excerpt of such a script is sketched after this list.
- Common core services

  Execution Engine for Apache Hadoop requires the Cloud Pak for Data common core services. If the common core services are not at the correct version in the operands project for the instance, they are automatically upgraded when you upgrade Execution Engine for Apache Hadoop. The common core services upgrade increases the amount of time the upgrade takes to complete.

- Storage requirements

  Specify the storage that you use in your existing installation. You cannot change the storage that is associated with Execution Engine for Apache Hadoop during an upgrade. Ensure that the environment variables point to the correct storage classes for your environment. A quick way to verify the storage classes is sketched after this list.
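For orientation, the following is a hypothetical excerpt of an environment-variable script showing only the variables that this task uses. The variable names PROJECT_CPD_INST_OPERATORS, PROJECT_CPD_INST_OPERANDS, VERSION, STG_CLASS_BLOCK, STG_CLASS_FILE, and CPDM_OC_LOGIN come from this task; the OCP_* variables and all values are placeholders, not recommendations. Your script, generated as described in Setting up installation environment variables, is the source of truth.

# Hypothetical excerpt of cpd_vars.sh -- placeholder values only.

# Cluster connection details (placeholders).
export OCP_URL=https://api.example.com:6443
export OCP_USERNAME=cluster-admin-user
export OCP_PASSWORD=cluster-admin-password

# Alias that logs the cpd-cli in to the cluster (referenced as ${CPDM_OC_LOGIN} in this task).
export CPDM_OC_LOGIN="cpd-cli manage login-to-ocp --username=${OCP_USERNAME} --password=${OCP_PASSWORD} --server=${OCP_URL}"

# Projects (namespaces) for this instance (names are examples).
export PROJECT_CPD_INST_OPERATORS=cpd-operators
export PROJECT_CPD_INST_OPERANDS=cpd-instance

# Release that you are upgrading to.
export VERSION=5.0.3

# Storage classes for your environment (example values for OpenShift Data Foundation).
export STG_CLASS_BLOCK=ocs-storagecluster-ceph-rbd
export STG_CLASS_FILE=ocs-storagecluster-cephfs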
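Before you run the upgrade, you can optionally confirm that the environment variables are loaded, that the installed components report the expected versions, and that the storage classes the variables point to exist on the cluster. This is a hedged convenience check, not a documented step; it reuses the cpd-cli manage get-cr-status command shown later in this task plus standard oc commands, and it assumes that you are also logged in to the cluster with the oc CLI.

# Source your environment variable script if you have not already done so.
source ./cpd_vars.sh

# Log the cpd-cli in to the cluster if needed (alias defined in the script).
${CPDM_OC_LOGIN}

# Confirm that the release and storage class variables are set.
echo "Target release:      ${VERSION}"
echo "Block storage class: ${STG_CLASS_BLOCK}"
echo "File storage class:  ${STG_CLASS_FILE}"

# Verify that the storage classes exist on the cluster (requires oc login).
oc get storageclass "${STG_CLASS_BLOCK}" "${STG_CLASS_FILE}"

# List the status and versions of the components that are installed in the
# operands project, including the control plane and the common core services.
cpd-cli manage get-cr-status \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS}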
Before you begin
This task assumes that the following prerequisites are met:
| Prerequisite | Where to find more information |
|---|---|
| The cluster meets the minimum requirements for Execution Engine for Apache Hadoop. | If this task is not complete, see System requirements. |
| The workstation from which you will run the upgrade is set up as a client workstation with the required command-line interfaces, including the cpd-cli (a quick check is sketched after this table). | If this task is not complete, see Updating client workstations. |
| The Cloud Pak for Data control plane is upgraded. | If this task is not complete, see Upgrading an instance of Cloud Pak for Data. |
| For environments that use a private container registry, such as air-gapped environments, the Execution Engine for Apache Hadoop software images are mirrored to the private container registry. | If this task is not complete, see Mirroring images to a private container registry. |
| For environments that use a private container registry, such as air-gapped environments, the cpd-cli is configured to pull the olm-utils-v3 image from the private container registry. | If this task is not complete, see Pulling the olm-utils-v3 image from the private container registry. |
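As a quick, optional sanity check of the client workstation, you can confirm that the command-line interfaces respond. This is only a sketch and is not part of the documented prerequisites; it assumes that the OpenShift CLI (oc) is also installed as part of the client workstation setup.

# Confirm that the Cloud Pak for Data command-line interface responds.
cpd-cli version

# Confirm that the OpenShift CLI is available (client version only; no cluster login needed).
oc version --client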
Prerequisite services
Before you upgrade Execution Engine for Apache Hadoop, ensure that the following services are upgraded and running:
Procedure
Complete the following tasks to upgrade Execution Engine for Apache Hadoop:
Upgrading the service
The cpd-cli manage apply-olm command updates all of the OLM objects in the operators project at the same time.
To upgrade Execution Engine for Apache Hadoop:
- Log the cpd-cli in to the Red Hat® OpenShift Container Platform cluster:

  ${CPDM_OC_LOGIN}

  Remember: CPDM_OC_LOGIN is an alias for the cpd-cli manage login-to-ocp command.

- Update the custom resource for Execution Engine for Apache Hadoop.

  The command that you run depends on the storage on your cluster:
Red Hat OpenShift Data Foundation storage

Run the following command to update the custom resource.

cpd-cli manage apply-cr \
--components=hee \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true \
--upgrade=true

IBM Storage Fusion Data Foundation storage

Run the following command to update the custom resource.

cpd-cli manage apply-cr \
--components=hee \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true \
--upgrade=true

IBM Storage Fusion Global Data Platform storage

Remember: When you use IBM Storage Fusion storage, both ${STG_CLASS_BLOCK} and ${STG_CLASS_FILE} point to the same storage class, typically ibm-spectrum-scale-sc or ibm-storage-fusion-cp-sc.

Run the following command to update the custom resource.

cpd-cli manage apply-cr \
--components=hee \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true \
--upgrade=true

IBM Storage Scale Container Native storage

Remember: When you use IBM Storage Scale Container Native storage, both ${STG_CLASS_BLOCK} and ${STG_CLASS_FILE} point to the same storage class, typically ibm-spectrum-scale-sc.

Run the following command to update the custom resource.

cpd-cli manage apply-cr \
--components=hee \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true \
--upgrade=true

Portworx storage

Run the following command to update the custom resource.

cpd-cli manage apply-cr \
--components=hee \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--storage_vendor=portworx \
--license_acceptance=true \
--upgrade=true

NFS storage

Remember: When you use NFS storage, both ${STG_CLASS_BLOCK} and ${STG_CLASS_FILE} point to the same storage class, typically managed-nfs-storage.

Run the following command to update the custom resource.

cpd-cli manage apply-cr \
--components=hee \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true \
--upgrade=true

AWS with EFS storage only

Remember: When you use EFS storage, both ${STG_CLASS_BLOCK} and ${STG_CLASS_FILE} point to the same storage class, typically efs-nfs-client.

Run the following command to update the custom resource.

cpd-cli manage apply-cr \
--components=hee \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true \
--upgrade=true

AWS with EFS and EBS storage

Run the following command to update the custom resource.

cpd-cli manage apply-cr \
--components=hee \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true \
--upgrade=true

NetApp Trident

Remember: When you use NetApp Trident storage, both ${STG_CLASS_BLOCK} and ${STG_CLASS_FILE} point to the same storage class.

Run the following command to update the custom resource.

cpd-cli manage apply-cr \
--components=hee \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true \
--upgrade=true
Validating the upgrade
The apply-cr command returns:

[SUCCESS]... The apply-cr command ran successfully
If you want to confirm that the custom resource status is Completed, you can run the cpd-cli manage get-cr-status command:
cpd-cli manage get-cr-status \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--components=hee
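If you want an additional sanity check beyond get-cr-status, you can list the pods in the operands project and confirm that they are healthy. This is a hedged sketch using standard oc commands, not a documented validation step, and it assumes that you are logged in to the cluster with the oc CLI.

# List the pods in the operands project; after a successful upgrade they are
# typically in the Running or Completed state.
oc get pods -n ${PROJECT_CPD_INST_OPERANDS}

# Show only pods that are not Running and not Succeeded (this should return
# nothing once the upgrade has settled).
oc get pods -n ${PROJECT_CPD_INST_OPERANDS} \
--field-selector=status.phase!=Running,status.phase!=Succeeded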
What to do next
- Complete the post-upgrade tasks for the service.
- To get started with Execution Engine for Apache Hadoop, see Analyzing Apache Hadoop data.