IBM Support

Available patches for Analytics Engine Powered by Apache Spark for IBM Cloud Pak for Data

Preventive Service Planning


Abstract

This document lists the available patches for the Analytics Engine Powered by Apache Spark service on IBM Cloud Pak for Data.

Content

Use the following links to locate the patches for each Cloud Pak for Data version:
Cloud Pak for Data 3.0.1 patches
cpd-3.0.1-spark-patch-5
Patch name cpd-3.0.1-spark-patch-5
Released on 14 January 2021
Service assembly spark
Applies to service version Analytics Engine for Apache Spark 3.0.1
Applies to platform version Cloud Pak for Data 3.0.1
Patch type
Cumulative
Description
The patch includes the following fixes:
  • Fix for Data Refinery jobs failing with the error "Error getting service instance id using HB instance id."
Instructions
Important: After you apply the patch, it might take two hours for the newer Spark 2.4.7 images to be pulled on the worker nodes.
Once the images are pulled, stop and start the history server so that it can use the newer Spark 2.4.7 images.
cpd-3.0.1-spark-patch-4
Patch name cpd-3.0.1-spark-patch-4
Released on 14 December 2020
Service assembly spark
Applies to service version Analytics Engine for Apache Spark 3.0.1
Applies to platform version Cloud Pak for Data 3.0.1
Patch type
Cumulative
Description
The patch includes the following fixes:
  • Updated project-lib for Python and R to 2.0.0.
  • Security fixes.
  • Support for Python 3.7.
  • Upgraded Tensorflow and its dependencies.
Instructions
Important: After you apply the patch, it might take two hours for the newer Spark 2.4.6 images to be pulled on the worker nodes.
Once the images are pulled, stop and start the history server so that it can use the newer Spark 2.4.6 images.
cpd-3.0.1-spark-patch-3
Patch name cpd-3.0.1-spark-patch-3
Released on 7 September 2020
Service assembly spark
Applies to service version Analytics Engine for Apache Spark 3.0.1
Applies to platform version Cloud Pak for Data 3.0.1
Patch type
Cumulative
Description
This patch supports OpenShift versions 3.11, 4.3, and 4.5.
The patch includes the following fixes:
  • Platform tokens are now supported for the job API.
  • Fixed the History server redirecting to the wrong URL.
Instructions
Important: After you apply the patch, it might take two hours for the newer Spark 2.4.6 images to be pulled on the worker nodes.
Once the images are pulled, stop and start the history server so that it can use the newer Spark 2.4.6 images.
cpd-3.0.1-spark-patch-2
Patch name cpd-3.0.1-spark-patch-2
Released on 14 August 2020
Service assembly spark
Applies to service version Analytics Engine for Apache Spark 3.0.1
Applies to platform version Cloud Pak for Data 3.0.1
Patch type
Additive
Description
This patch supports both the x86 and Power platforms.
The patch includes the following enhancements:
  • Access to a Spark instance using the platform token is now available.
  • New flag CHECK_WORKER_EVENTS_ENABLED disables the check for Kubernetes events while a job is submitted. By default, the flag is set to true. If you're in an environment where PVCs are not mounted within the first few tries, set this flag to false.

    Note: If the CHECK_WORKER_EVENTS_ENABLED flag is set to false and there are not enough resources in the cluster for spark-master/jkg pods, then the following error will occur:

    curl: (52) Empty reply from server.
     
The patch includes the following fixes to the Spark history server:
  • Clicking the Spark icon in the top left of the History server now navigates to the Spark history home page.
  • Clicking a Spark application no longer intermittently times out because of faulty URL rewrites.
Instructions
Before applying the patch, perform the following tasks:
  1. Delete the existing spark-hb-nginx-configmap and hummingbird-route configmaps using the following commands:
    oc delete cm spark-hb-nginx-configmap
    oc delete cm hummingbird-route
  2. Download the following yaml files: spark-hb-nginx-configmap.yaml and hummingbird-route.yaml.
  3. After you download the files, replace the OC_PROJECT_NAMESPACE and NGINX_RESOLVER placeholders in the yaml files by using the following commands. Run the commands separately on the cluster where you'll be applying the patch.
    sed -i -e 's/OC_PROJECT_NAMESPACE/<replace with the oc namespace where Spark is installed, for example zen>/g' -e 's/NGINX_RESOLVER/<replace with the Nginx resolver value listed below>/g' spark-hb-nginx-configmap.yaml
    sed -i -e 's/OC_PROJECT_NAMESPACE/<replace with the oc namespace where Spark is installed, for example zen>/g' hummingbird-route.yaml

    Important: For the OC_PROJECT_NAMESPACE value, replace it with the oc namespace where Spark is installed.

    For the NGINX_RESOLVER value, use the appropriate value based on the OpenShift Container Platform version:

    • OCP 4.3 Nginx resolver value: dns-default.openshift-dns
    • OCP 3.11 Nginx resolver value: kubernetes.default
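    As an alternative to the sed commands in step 3, the substitution can be scripted. A minimal Python sketch (the fill_placeholders helper and the sample values are illustrative, not part of the patch):

    ```python
    def fill_placeholders(yaml_text, namespace, resolver=None):
        """Replace the OC_PROJECT_NAMESPACE and, optionally, the
        NGINX_RESOLVER placeholder in a downloaded yaml file's text."""
        yaml_text = yaml_text.replace("OC_PROJECT_NAMESPACE", namespace)
        if resolver is not None:
            # dns-default.openshift-dns on OCP 4.3, kubernetes.default on OCP 3.11
            yaml_text = yaml_text.replace("NGINX_RESOLVER", resolver)
        return yaml_text

    # Example: Spark installed in the "zen" namespace on OCP 4.3
    text = fill_placeholders("resolver NGINX_RESOLVER; namespace: OC_PROJECT_NAMESPACE",
                             "zen", "dns-default.openshift-dns")
    print(text)  # resolver dns-default.openshift-dns; namespace: zen
    ```

    Read each downloaded file, pass its contents through the helper, and write the result back before running step 4.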
  4. Create the configmaps using the following commands:
    oc create -f spark-hb-nginx-configmap.yaml
    oc create -f hummingbird-route.yaml
  5. Use the following commands to confirm that the configmaps were created successfully:
    oc get cm | grep spark-hb-nginx-configmap
    oc get cm | grep hummingbird-route
  6. See Applying patches for additional instructions.
cpd-3.0.1-spark-patch-1
Patch name cpd-3.0.1-spark-patch-1
Released on 26 June 2020
Service assembly spark
Applies to service version Analytics Engine for Apache Spark 3.0.1
Applies to platform version Cloud Pak for Data 3.0.1
Patch type
Cumulative
Description
The patch includes the following fixes:
  • Removed Spark 2.3 and upgraded Spark from 2.4.4 to 2.4.6 to resolve the jackson-databind vulnerability.
  • The install.packages() function in an R Spark environment now uses the correct repository as the default.
  • Fixed OpenJDK security vulnerabilities.
  • Data Refinery now sends the correct account ID when multiple users use the same project.

Instructions
Important: After you apply the patch, it might take two hours for the newer Spark 2.4.6 images to be pulled on the worker nodes.
Once the images are pulled, stop and start the history server so that it can use the newer Spark 2.4.6 images.
Cloud Pak for Data 2.5.0 patches
cpd-2.5.0.0-spark-patch-2
Patch name cpd-2.5.0.0-spark-patch-2
Released on 23 June 2020
Service assembly spark
Applies to service version Analytics Engine for Apache Spark 2.5.0
Applies to platform version Cloud Pak for Data 2.5
Patch type
Cumulative
Description
The patch includes the following fixes:
  • Using connected data from some connectors no longer causes errors on a job run.
Instructions See Applying patches.
cpd-2.5.0.0-spark-patch-1
Patch name cpd-2.5.0.0-spark-patch-1
Released on 10 June 2020
Service assembly spark
Applies to service version Analytics Engine for Apache Spark 2.5.0
Applies to platform version Cloud Pak for Data 2.5
Patch type
Cumulative
Description
The patch includes the following fixes:
  • Data Refinery jobs now apply the Refinery operations to the result file correctly, and Japanese data is displayed correctly.
  • Data Refinery jobs with an output to Oracle DB now execute on Default Spark 2.4 and R 3.6.
Instructions See Applying patches.


Document Information

Modified date:
15 January 2021

UID

ibm15693756