
Available patches for Analytics Engine Powered by Apache Spark for IBM Cloud Pak for Data

Preventive Service Planning


Abstract

This document lists the available patches for the Analytics Engine Powered by Apache Spark service on IBM Cloud Pak for Data. Not all versions have patches; only versions with available patches are listed.

Content

Use the following sections to locate the patches for each Cloud Pak for Data version:
Cloud Pak for Data 4.6.0 patches
Ensure that you apply the patches for the version of Cloud Pak for Data that is running in your environment:
 
4.6.3 patches
cpd-spark-4-6-3-patch-1
Patch name: cpd-spark-4.6.3-patch-1
Released on: 21 February 2023
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 4.6.3
Applies to platform version: Cloud Pak for Data 4.6.3
Patch type: Image patch (Defect fix)
Description
The patch includes a fix for:
  • Connectivity issues for platform connections, which affect, for example, creating business terms or running Auto Discovery when using the Azure Data Lake Storage connection.
Validating the patch
You can check whether the patch was applied correctly by running the following commands (a scripted polling sketch follows this list):
  1. Check that the AnalyticsEngine CR status is in "Completed" state by running the following command:
    oc get AnalyticsEngine -o yaml -n ${PROJECT_CPD_INSTANCE}
  2. Verify that the correct digests were deployed in the cluster using the following commands:
    oc get deploy/spark-hb-control-plane -n ${PROJECT_CPD_INSTANCE}  -o=jsonpath={.spec.template.spec.containers..image}

    The output of the above command should contain: spark-hb-control-plane@sha256:69aec9952d04556094e1254a67f3846236e4f9038916f03fe68a4a0eb5de6d6e

    oc get cronjob/spark-hb-preload-jkg-image -n ${PROJECT_CPD_INSTANCE} -o=jsonpath={.spec.jobTemplate.spec.template.spec.containers..image}

    The output of the above command should contain: spark-hb-jkg@sha256:aea0ac214532225248f734e871667332c8d567cbe040050765519d0063d0b18a
     
  
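If you prefer to wait on the CR from a script instead of inspecting the YAML manually, the following minimal sketch polls until the CR reports "Completed". It assumes the CR is named analyticsengine-sample and that the state is surfaced in the .status.analyticsengineStatus field; confirm the field name in the output of the oc get AnalyticsEngine command above before relying on it.

    # Poll the AnalyticsEngine CR until it reports Completed.
    # The status field path is an assumption; verify it on your cluster.
    while true; do
      state=$(oc get AnalyticsEngine analyticsengine-sample \
        -n "${PROJECT_CPD_INSTANCE}" \
        -o jsonpath='{.status.analyticsengineStatus}')
      echo "AnalyticsEngine status: ${state:-unknown}"
      [ "${state}" = "Completed" ] && break
      sleep 30
    done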
Applying the patch

To apply the patch in an airgapped environment:
1. Download the images by running the following commands. You need to provide the values for "<folder path>/auth.json" and "<local private registry>".
skopeo copy --all --authfile "<folder path>/auth.json" --dest-tls-verify=false --src-tls-verify=false docker://cp.icr.io/cp/cpd/spark-hb-control-plane@sha256:69aec9952d04556094e1254a67f3846236e4f9038916f03fe68a4a0eb5de6d6e docker://<local private registry>/cp/cpd/spark-hb-control-plane@sha256:69aec9952d04556094e1254a67f3846236e4f9038916f03fe68a4a0eb5de6d6e
skopeo copy --all --authfile "<folder path>/auth.json" --dest-tls-verify=false --src-tls-verify=false docker://cp.icr.io/cp/cpd/spark-hb-jkg@sha256:aea0ac214532225248f734e871667332c8d567cbe040050765519d0063d0b18a docker://<local private registry>/cp/cpd/spark-hb-jkg@sha256:aea0ac214532225248f734e871667332c8d567cbe040050765519d0063d0b18a
Prepare the authentication credentials to access the IBM production repository. Use the same auth.json file used for CASE download and image mirroring. An example directory path:
${PROJECT_CPD_INSTANCE}/.airgap/auth.json
Or create an auth.json file that contains credentials to access cp.icr.io and your local private registry. For example:
{
  "auths": {
    "cp.icr.io":{"email":"unused","auth":"<base64 encoded id:apikey>"},
    "<private registry hostname>":{"email":"unused","auth":"<base64 encoded id:password>"}
   }
 }
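Each auth value in this file is the base64 encoding of an id:credential pair. For cp.icr.io, the ID is cp and the credential is your IBM entitlement key. For example, to generate the values on a machine with the base64 utility:

    # Encode "<id>:<credential>" for the auth fields (replace the placeholders).
    printf '%s' 'cp:<your entitlement key>' | base64
    printf '%s' '<registry user>:<registry password>' | base64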
2. Run the following command to apply the hotfix image:
oc patch AnalyticsEngine analyticsengine-sample --namespace $PROJECT_CPD_INSTANCE --type merge --patch '{"spec": {"image_digests": {"spark-hb-control-plane":"sha256:69aec9952d04556094e1254a67f3846236e4f9038916f03fe68a4a0eb5de6d6e","spark-hb-jkg-v33": "sha256:aea0ac214532225248f734e871667332c8d567cbe040050765519d0063d0b18a"}}}'

What to do next

Make sure to revert the image overrides before you install or upgrade to a newer refresh or a major release of IBM® Cloud Pak for Data.

To revert the image overrides, run the following command to edit the Analytics Engine custom resource (ae):

oc patch AnalyticsEngine analyticsengine-sample --namespace ${PROJECT_CPD_INSTANCE} --type=json --patch '[{ "op": "remove", "path": "/spec/image_digests"}]'
 
 
After approximately 5-6 minutes, the spark-hb-control-plane pod should be up and running with the patched image.
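Instead of waiting a fixed amount of time, you can optionally watch the rollout finish; this works both after applying the patch and after reverting the overrides:

    oc rollout status deploy/spark-hb-control-plane -n "${PROJECT_CPD_INSTANCE}" --timeout=10m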
cpd-spark-4-6-3-patch-2
Patch name: cpd-spark-4.6.3-patch-2
Released on: 15 March 2023
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 4.6.3
Applies to platform version: Cloud Pak for Data 4.6.3
Patch type: Image patch (Defect fix)
Description
The patch includes a fix for:
  • Connectivity issues for platform connections when using the Azure Data Lake Storage connection.
Validating the patch
You can check if the patch was applied correctly by running the following commands:
  1. Check that the AnalyticsEngine CR status is in "Completed" state by running the following command:
    oc get AnalyticsEngine -o yaml -n ${PROJECT_CPD_INSTANCE}
  2. Verify that the correct digests were deployed in the cluster by running the following two commands:
    oc get deploy/spark-hb-control-plane -n ${PROJECT_CPD_INSTANCE}  -o=jsonpath={.spec.template.spec.containers..image}
    The output of the above command should contain: spark-hb-control-plane@sha256:551bdb89f8bc2067bc1183d9e9f470aef064855def21851e6678845ded1a8d24

    oc get cronjob/spark-hb-preload-jkg-image -n ${PROJECT_CPD_INSTANCE} -o=jsonpath={.spec.jobTemplate.spec.template.spec.containers..image}

    The output of the above command should contain: spark-hb-jkg@sha256:1425e65ed867bc09acdec14f351e604dc2cadef363607a60c5220103f341cd43
  
Applying the patch

To apply the patch in an airgapped environment:
1. Download the images by running the following commands. You need to provide the values for "<folder path>/auth.json" and "<local private registry>".
skopeo copy --all --authfile "<folder path>/auth.json" --dest-tls-verify=false --src-tls-verify=false docker://cp.icr.io/cp/cpd/spark-hb-control-plane@sha256:551bdb89f8bc2067bc1183d9e9f470aef064855def21851e6678845ded1a8d24 docker://<local private registry>/cp/cpd/spark-hb-control-plane@sha256:551bdb89f8bc2067bc1183d9e9f470aef064855def21851e6678845ded1a8d24
skopeo copy --all --authfile "<folder path>/auth.json" --dest-tls-verify=false --src-tls-verify=false docker://cp.icr.io/cp/cpd/spark-hb-jkg@sha256:1425e65ed867bc09acdec14f351e604dc2cadef363607a60c5220103f341cd43 docker://<local private registry>/cp/cpd/spark-hb-jkg@sha256:1425e65ed867bc09acdec14f351e604dc2cadef363607a60c5220103f341cd43

 
Prepare the authentication credentials to access the IBM production repository. Use the same auth.json file used for CASE download and image mirroring. An example directory path:
${PROJECT_CPD_INSTANCE}/.airgap/auth.json
Or create an auth.json file that contains credentials to access cp.icr.io and your local private registry. For example:
{
  "auths": {
    "cp.icr.io":{"email":"unused","auth":"<base64 encoded id:apikey>"},
    "<private registry hostname>":{"email":"unused","auth":"<base64 encoded id:password>"}
   }
 }
2. Run the following command to apply the hotfix image:
oc patch AnalyticsEngine analyticsengine-sample --namespace $PROJECT_CPD_INSTANCE --type merge --patch '{"spec": {"image_digests": {"spark-hb-control-plane":"sha256:551bdb89f8bc2067bc1183d9e9f470aef064855def21851e6678845ded1a8d24","spark-hb-jkg-v33": "sha256:1425e65ed867bc09acdec14f351e604dc2cadef363607a60c5220103f341cd43"}}}'

What to do next

Make sure to revert the image overrides before you install or upgrade to a newer refresh or a major release of IBM® Cloud Pak for Data.

To revert the image overrides, run the following command to edit the Analytics Engine custom resource (ae):

oc patch AnalyticsEngine analyticsengine-sample --namespace ${PROJECT_CPD_INSTANCE} --type=json --patch '[{ "op": "remove", "path": "/spec/image_digests"}]'
 
 
After approximately 5-6 minutes, the spark-hb-control-plane pod should be up and running with the patched image.
cpd-spark-4-6-3-patch-3
Patch name: cpd-spark-4.6.3-patch-3
Released on: 16 March 2023
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 4.6.3
Applies to platform version: Cloud Pak for Data 4.6.3
Patch type: Image patch (Defect fix)
Description
The patch includes a fix for:
  • An incorrect job status returned when a Spark application V3 or V4 API call fails.

     
Validating the patch
You can check if the patch was applied correctly by running the following commands:
  1. Check that the AnalyticsEngine CR status is in "Completed" state by running the following command:
    oc get AnalyticsEngine -o yaml -n ${PROJECT_CPD_INSTANCE}
  2. Verify that the correct digest was deployed in the cluster by running the following command:
    oc get deploy/spark-hb-deployer-agent -n ${PROJECT_CPD_INSTANCE}  -o=jsonpath={.spec.template.spec.containers..image}
    The output of the above command should contain: spark-hb-helm-repo@sha256:8a7989ff037834f7d985a2f95806d5f64a734499d3756a13312a89b7da5903f6
Applying the patch

To apply the patch in an airgapped environment:
1. Download the image by running one of the following commands. You need to provide the values for "<folder path>/auth.json" and "<local private registry>".
skopeo copy --all --authfile "<folder path>/auth.json" --dest-tls-verify=false --src-tls-verify=false docker://cp.icr.io/cp/cpd/spark-hb-helm-repo@sha256:8a7989ff037834f7d985a2f95806d5f64a734499d3756a13312a89b7da5903f6 docker://<local private registry>/cp/cpd/spark-hb-helm-repo@sha256:8a7989ff037834f7d985a2f95806d5f64a734499d3756a13312a89b7da5903f6
or:
skopeo copy --all --authfile "<folder path>/auth.json" --dest-tls-verify=false --src-tls-verify=false docker://cp.icr.io/cp/cpd/spark-hb-helm-repo:mss-patch-2 docker://<local private registry>/cp/cpd/spark-hb-helm-repo:mss-patch-2

Prepare the authentication credentials to access the IBM production repository. Use the same auth.json file used for CASE download and image mirroring. An example directory path:
${PROJECT_CPD_INSTANCE}/.airgap/auth.json
Or create an auth.json file that contains credentials to access cp.icr.io and your local private registry. For example:
{
  "auths": {
    "cp.icr.io":{"email":"unused","auth":"<base64 encoded id:apikey>"},
    "<private registry hostname>":{"email":"unused","auth":"<base64 encoded id:password>"}
   }
 }
2. Run the following command to apply the hotfix image:
oc patch AnalyticsEngine analyticsengine-sample --namespace $PROJECT_CPD_INSTANCE --type merge --patch '{"spec": {"image_digests": {"spark-hb-helm-repo":"sha256:8a7989ff037834f7d985a2f95806d5f64a734499d3756a13312a89b7da5903f6"}}}'

What to do next

Make sure to revert the image overrides before you install or upgrade to a newer refresh or a major release of IBM® Cloud Pak for Data.

To revert the image overrides, run the following command to edit the Analytics Engine custom resource (ae):

oc patch AnalyticsEngine analyticsengine-sample --namespace ${PROJECT_CPD_INSTANCE} --type=json --patch '[{ "op": "remove", "path": "/spec/image_digests"}]'
 
 
After approximately 5-6 minutes, the spark-hb-control-plane pod should be up and running with the patched image.
Cloud Pak for Data 4.5.0 patches
Ensure that you apply the patches for the version of Cloud Pak for Data that is running in your environment:
 
4.5.1 patches
cpd-spark-4-5-1-patch-1
Patch name: cpd-spark-4.5.1-patch-1
Released on: 21 November 2022
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 4.5.1
Applies to platform version: Cloud Pak for Data 4.5.1
Patch type: Image patch (Defect fix)
Description
The patch includes a fix for:
  • Issues when submitting jobs that return errors about an invalid state transition.
Important
  • The AnalyticsEngine CR takes a few minutes to move to the "Completed" state. Only submit jobs or kernels after the CR has moved to the "Completed" state.
Validating the patch
You can verify that the job state transition is correct by running the following commands:
1. Get the AnalyticsEngine CR:
oc get ae -n $PROJECT_CPD_INSTANCE
 
2. After the patch is applied, the AnalyticsEngine CR will move to the "InProgress" state. Then wait for the CR to move to the "Completed" state.
3. When the CR moves to "Completed" state, verify the control-plane pod digest by running the following command:

oc get deploy spark-hb-control-plane -n ${PROJECT_CPD_INSTANCE} -o jsonpath='{.spec.template.spec.containers[0].image}'
Applying the patch

To apply the patch:
1. For airgapped environments only: Download the images by running the following commands:
skopeo copy --all --authfile "<folder path>/auth.json" --dest-tls-verify=false --src-tls-verify=false docker://cp.icr.io/cp/cpd/spark-hb-control-plane@sha256:47357e8c226bd4971cdc60e34babb645214d9d2a8dca45f9df61a9b1d385c1ac docker://<local private registry>/cp/cpd/spark-hb-control-plane@sha256:47357e8c226bd4971cdc60e34babb645214d9d2a8dca45f9df61a9b1d385c1ac
skopeo copy --all --authfile "<folder path>/auth.json" --dest-tls-verify=false --src-tls-verify=false docker://cp.icr.io/cp/cpd/spark-hb-helm-repo@sha256:68ed95bff20e328b6a7faf749ca8cebd4cad3a2072a81ddc039d6bc0e14abff1 docker://<local private registry>/cp/cpd/spark-hb-helm-repo@sha256:68ed95bff20e328b6a7faf749ca8cebd4cad3a2072a81ddc039d6bc0e14abff1
2. For airgapped environments only: Prepare the authentication credentials to access the IBM production repository. Use the same auth.json file used for CASE download and image mirroring. An example directory path:
${PROJECT_CPD_INSTANCE}/.airgap/auth.json
Or create an auth.json file that contains credentials to access cp.icr.io and your local private registry. For example:
{
  "auths": {
    "cp.icr.io":{"email":"unused","auth":"<base64 encoded id:apikey>"},
    "<private registry hostname>":{"email":"unused","auth":"<base64 encoded id:password>"}
   }
 }
3. Run the following command to apply the hotfix image:
 oc patch AnalyticsEngine analyticsengine-sample --namespace $PROJECT_CPD_INSTANCE --type merge --patch '{"spec": {"image_digests":{"spark-hb-control-plane":"sha256:47357e8c226bd4971cdc60e34babb645214d9d2a8dca45f9df61a9b1d385c1ac","spark-hb-helm-repo":"sha256:68ed95bff20e328b6a7faf749ca8cebd4cad3a2072a81ddc039d6bc0e14abff1"}}}'

What to do next

Make sure to revert the image overrides before you install or upgrade to a newer refresh or a major release of IBM® Cloud Pak for Data.

To revert the image overrides, run the following command to edit the Analytics Engine custom resource (ae):

oc patch AnalyticsEngine analyticsengine-sample --namespace ${PROJECT_CPD_INSTANCE} --type=json --patch '[{ "op": "remove", "path": "/spec/image_digests"}]'
 
 
After approximately 5-6 minutes, the spark-hb-control-plane pod should be up and running with the patched image.
4.5.2 patches
cpd-spark-4-5-2-patch-1
Patch name: cpd-spark-4.5.2-patch-1
Released on: 14 October 2022
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 4.5.2
Applies to platform version: Cloud Pak for Data 4.5.2
Patch type: Image patch (Defect fix)
Description
The patch includes a fix for:
  • Issues when submitting jobs that return errors about an invalid state transition.
Important
  • The AnalyticsEngine CR will take a few minutes to move to "Completed" state. Only submit jobs or kernels after the CR has moved to "Completed" state.
Validating the patch
You can verify that the job state transition is correct as follows:
1. After the patch is applied, the AnalyticsEngine CR will move to the "InProgress" state. Then wait for the CR to move to the "Completed" state.
2. When the CR moves to "Completed" state, verify the control-plane pod digest by running the following command:

oc get deploy spark-hb-control-plane -n ${PROJECT_CPD_INSTANCE} -o jsonpath='{.spec.template.spec.containers[0].image}'
Applying the patch

To apply the patch in an airgapped environment:
1. Download the image by running the following command:
skopeo copy --all --authfile "<folder path>/auth.json" --dest-tls-verify=false --src-tls-verify=false docker://cp.icr.io/cp/cpd/spark-hb-control-plane@sha256:2d0d569aaa192e1345503d51560fbaaa00793a5a671cecddc77f79276e5ca209 docker://<local private registry>/cp/cpd/spark-hb-control-plane@sha256:2d0d569aaa192e1345503d51560fbaaa00793a5a671cecddc77f79276e5ca209
Prepare the authentication credentials to access the IBM production repository. Use the same auth.json file used for CASE download and image mirroring. An example directory path:
${PROJECT_CPD_INSTANCE}/.airgap/auth.json
Or create an auth.json file that contains credentials to access cp.icr.io and your local private registry. For example:
{
  "auths": {
    "cp.icr.io":{"email":"unused","auth":"<base64 encoded id:apikey>"},
    "<private registry hostname>":{"email":"unused","auth":"<base64 encoded id:password>"}
   }
 }
2. Run the following command to apply the hotfix image:
 oc patch AnalyticsEngine analyticsengine-sample --namespace  $PROJECT_CPD_INSTANCE --type merge --patch '{"spec": {"image_digests":{"spark-hb-control-plane":"sha256:2d0d569aaa192e1345503d51560fbaaa00793a5a671cecddc77f79276e5ca209"}}}'

What to do next

Make sure to revert the image overrides before you install or upgrade to a newer refresh or a major release of IBM® Cloud Pak for Data.

To revert the image overrides, run the following command to edit the Analytics Engine custom resource (ae):

oc patch AnalyticsEngine analyticsengine-sample --namespace ${PROJECT_CPD_INSTANCE} --type=json --patch '[{ "op": "remove", "path": "/spec/image_digests"}]'
 
 
After approximately 5-6 minutes, the spark-hb-control-plane pod should be up and running with the patched image.
4.5.3 patches
cpd-spark-4-5-3-patch-1
Patch name: cpd-spark-4.5.3-patch-1
Released on: 1 February 2023
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 4.5.3
Applies to platform version: Cloud Pak for Data 4.5.3
Patch type: Image patch (Defect fix)
Description
The patch includes a fix for:
  • Support for adding customizations to the Spark environment, such as custom volume mounts and an idle kernel timeout in Watson Studio notebooks.
Important
If you are using the Spark kernel offering, you must define a storage volume named "spark-shuffle-storage". This storage volume must be available in the cluster to avoid issues when running kernels.
Validating the patch
You can check if the patch was applied correctly by running the following commands:
  1. Check that the AnalyticsEngine CR status is in "Completed" state by running the following command:
    oc get AnalyticsEngine -o yaml -n ${PROJECT_CPD_INSTANCE}
  2. Verify that the correct digest was deployed in the cluster using the following command:
    oc get deploy/spark-hb-deployer-agent -n ${PROJECT_CPD_INSTANCE} -o=jsonpath={.spec.template.spec.containers..image}

    The output of the above command should contain: spark-hb-helm-repo@sha256:42daa13e64d106e6b3ace5750fb330c35ffcb4aa6799a4bb7be2bdd2160a5939
Applying the patch

To apply the patch in an airgapped environment:
1. Download the image by running the following command. You need to provide the values for "<folder path>/auth.json" and "<local private registry>".
skopeo copy --all --authfile "<folder path>/auth.json" --dest-tls-verify=false --src-tls-verify=false docker://cp.icr.io/cp/cpd/spark-hb-helm-repo@sha256:42daa13e64d106e6b3ace5750fb330c35ffcb4aa6799a4bb7be2bdd2160a5939 docker://<local private registry>/cp/cpd/spark-hb-helm-repo@sha256:42daa13e64d106e6b3ace5750fb330c35ffcb4aa6799a4bb7be2bdd2160a5939
Prepare the authentication credentials to access the IBM production repository. Use the same auth.json file used for CASE download and image mirroring. An example directory path:
${PROJECT_CPD_INSTANCE}/.airgap/auth.json
Or create an auth.json file that contains credentials to access cp.icr.io and your local private registry. For example:
{
  "auths": {
    "cp.icr.io":{"email":"unused","auth":"<base64 encoded id:apikey>"},
    "<private registry hostname>":{"email":"unused","auth":"<base64 encoded id:password>"}
   }
 }
2. Run the following command to apply the hotfix image:
oc patch AnalyticsEngine analyticsengine-sample --namespace $PROJECT_CPD_INSTANCE --type merge --patch '{"spec": {"image_digests":{"spark-hb-helm-repo":"sha256:42daa13e64d106e6b3ace5750fb330c35ffcb4aa6799a4bb7be2bdd2160a5939"}}}'

What to do next

Make sure to revert the image overrides before you install or upgrade to a newer refresh or a major release of IBM® Cloud Pak for Data.

To revert the image overrides, run the following command to edit the Analytics Engine custom resource (ae):

oc patch AnalyticsEngine analyticsengine-sample --namespace ${PROJECT_CPD_INSTANCE} --type=json --patch '[{ "op": "remove", "path": "/spec/image_digests"}]'
 
 
After approximately 5-6 minutes, the spark-hb-control-plane pod should be up and running with the patched image.
cpd-spark-4-5-3-patch-2
Patch name: cpd-spark-4.5.3-patch-2
Released on: 21 February 2023
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 4.5.3
Applies to platform version: Cloud Pak for Data 4.5.3
Patch type: Image patch (Defect fix)
Description
The patch includes a fix for:
  • Metadata import using the JDBC connector failing when there were stale views on the database side, even when the Ignore Access Errors option was selected during the import.
Validating the patch
You can check if the patch was applied correctly by running the following commands:
  1. Check that the AnalyticsEngine CR status is in "Completed" state by running the following command:
    oc get AnalyticsEngine -o yaml -n ${PROJECT_CPD_INSTANCE}
  2. Verify that the correct digests were deployed in the cluster using the following commands:
    oc get deploy/spark-hb-control-plane -n ${PROJECT_CPD_INSTANCE}  -o=jsonpath={.spec.template.spec.containers..image}

    The output of the above command should contain: spark-hb-control-plane@sha256:2a4715b3dd74be70ed02eb06de208d42b187d62cb1aaa03d69d99616cfe62985

    oc get cronjob/spark-hb-preload-jkg-image -n ${PROJECT_CPD_INSTANCE} -o=jsonpath={.spec.jobTemplate.spec.template.spec.containers..image}

    The output of the above command should contain: spark-hb-jkg@sha256:8ab9e3c4c6a19c9158ed030970a8ce051e0f3a972dca69c1fb9ef5d3d93a55e7
  
Applying the patch

To apply the patch in an airgapped environment:
1. Download the images by running the following commands. You need to provide the values for "<folder path>/auth.json" and "<local private registry>".
skopeo copy --all --authfile "<folder path>/auth.json" --dest-tls-verify=false --src-tls-verify=false docker://cp.icr.io/cp/cpd/spark-hb-control-plane@sha256:2a4715b3dd74be70ed02eb06de208d42b187d62cb1aaa03d69d99616cfe62985 docker://<local private registry>/cp/cpd/spark-hb-control-plane@sha256:2a4715b3dd74be70ed02eb06de208d42b187d62cb1aaa03d69d99616cfe62985
skopeo copy --all --authfile "<folder path>/auth.json" --dest-tls-verify=false --src-tls-verify=false docker://cp.icr.io/cp/cpd/spark-hb-jkg@sha256:8ab9e3c4c6a19c9158ed030970a8ce051e0f3a972dca69c1fb9ef5d3d93a55e7 docker://<local private registry>/cp/cpd/spark-hb-jkg@sha256:8ab9e3c4c6a19c9158ed030970a8ce051e0f3a972dca69c1fb9ef5d3d93a55e7
Prepare the authentication credentials to access the IBM production repository. Use the same auth.json file used for CASE download and image mirroring. An example directory path:
${PROJECT_CPD_INSTANCE}/.airgap/auth.json
Or create an auth.json file that contains credentials to access cp.icr.io and your local private registry. For example:
{
  "auths": {
    "cp.icr.io":{"email":"unused","auth":"<base64 encoded id:apikey>"},
    "<private registry hostname>":{"email":"unused","auth":"<base64 encoded id:password>"}
   }
 }
2. Run the following command to apply the hotfix image:
oc patch AnalyticsEngine analyticsengine-sample --namespace $PROJECT_CPD_INSTANCE --type merge --patch '{"spec": {"image_digests": {"spark-hb-control-plane":"sha256:2a4715b3dd74be70ed02eb06de208d42b187d62cb1aaa03d69d99616cfe62985","spark-hb-jkg-v32": "sha256:8ab9e3c4c6a19c9158ed030970a8ce051e0f3a972dca69c1fb9ef5d3d93a55e7"}}}'

What to do next

Make sure to revert the image overrides before you install or upgrade to a newer refresh or a major release of IBM® Cloud Pak for Data.

To revert the image overrides, run the following command to edit the Analytics Engine custom resource (ae):

oc patch AnalyticsEngine analyticsengine-sample --namespace ${PROJECT_CPD_INSTANCE} --type=json --patch '[{ "op": "remove", "path": "/spec/image_digests"}]'
 
 
After approximately 5-6 minutes, the spark-hb-control-plane pod should be up and running with the patched image.
cpd-spark-4-5-3-patch-3
Patch name: cpd-spark-4.5.3-patch-3
Released on: 02 March 2023
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 4.5.3
Applies to platform version: Cloud Pak for Data 4.5.3
Patch type: Image patch (Defect fix)
Description
The patch includes a fix for:
  • The "context deadline" error that occurs when creating the Spark container due to a large number of files in file-api-pv. The error is corrected by disabling SELinux relabeling on all Spark jobs and kernels.
Validating the patch
You can check if the patch was applied correctly by running the following commands:
  1. Get the AnalyticsEngine CR by running the following command:
    oc get ae -n $PROJECT_CPD_INSTANCE
  2. After the patch is applied, the AnalyticsEngine CR will move to the "InProgress" state. Then wait for the CR to move to the "Completed" state.
  3. When the CR moves to "Completed" state, check if the spark-hb-helm-repo pod is running with the updated image by running the following command:
    oc get pod | grep spark-hb-helm
  4. Then describe one of the pods from the output and check the image tag:
    oc get pod <podname> -o jsonpath="{..image}" | tr -s '[[:space:]]' '\n' | uniq

    The image tag should match the patched image tag (56ce9ebf7abd03744db2ea74ef1c80119e51ee5a0b167a3648544baba95558e8) that is included in the patch logs in the list of images required section.
  
Applying the patch

To apply the patch:
1. Set up a MachineConfig for the SELinux handler by saving the following contents to a file, for example machineconfig.yaml (a reference decoding of the embedded file contents follows the file):
  
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-selinux-configuration
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,W2NyaW8ucnVudGltZS5ydW50aW1lcy5zZWxpbnV4XQpydW50aW1lX3BhdGggPSAiL3Vzci9iaW4vcnVuYyIKcnVudGltZV9yb290ID0gIi9ydW4vcnVuYyIKcnVudGltZV90eXBlID0gIm9jaSIKYWxsb3dlZF9hbm5vdGF0aW9ucyA9IFsiaW8ua3ViZXJuZXRlcy5jcmktby5UcnlTa2lwVm9sdW1lU0VMaW51eExhYmVsIl0K
        mode: 416
        overwrite: true
        path: /etc/crio/crio.conf.d/01-selinux.conf
  osImageURL: ""
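For reference, the base64-encoded file contents embedded in this MachineConfig decode to the following CRI-O drop-in configuration, and the file mode 416 is the decimal form of octal 0640. You can verify the decoding yourself:

    # Decode the embedded file contents; paste the base64 string from the
    # MachineConfig above in place of the placeholder.
    echo '<base64 string>' | base64 -d

    # Expected decoded contents of /etc/crio/crio.conf.d/01-selinux.conf:
    [crio.runtime.runtimes.selinux]
    runtime_path = "/usr/bin/runc"
    runtime_root = "/run/runc"
    runtime_type = "oci"
    allowed_annotations = ["io.kubernetes.cri-o.TrySkipVolumeSELinuxLabel"]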
2. Run the following command to deploy this MachineConfig:
oc apply -f machineconfig.yaml
3.  To deploy the SELinux runtime class, create a file called "runtimeclass.yaml" with the following contents:
 
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: selinux
handler: selinux
4. Run the following command to deploy this runtime class:
 
oc apply -f runtimeclass.yaml

Note: After these steps, the operators and pods in the cluster restart, which takes some time. Wait for the cluster to become idle again before continuing with the next steps.
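One way to wait for the worker nodes to finish applying the MachineConfig is to watch the worker MachineConfigPool; the Updated condition is reported by the machine-config operator:

    # Block until the worker pool reports that all nodes are updated.
    oc wait mcp/worker --for=condition=Updated --timeout=30m
    # Or watch progress interactively:
    oc get mcp worker -w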
5. If the cluster is air-gapped: Download the image by running the following command. You need to provide the values for "<folder path>/auth.json" and "<local private registry>".
skopeo copy --all --authfile "<folder path>/auth.json" --dest-tls-verify=false --src-tls-verify=false docker://cp.icr.io/cp/cpd/spark-hb-helm-repo@sha256:56ce9ebf7abd03744db2ea74ef1c80119e51ee5a0b167a3648544baba95558e8 docker://<local private registry>/cp/cpd/spark-hb-helm-repo@sha256:56ce9ebf7abd03744db2ea74ef1c80119e51ee5a0b167a3648544baba95558e8
Prepare the authentication credentials to access the IBM production repository. Use the same auth.json file used for CASE download and image mirroring. An example directory path:
${PROJECT_CPD_INSTANCE}/.airgap/auth.json
Or create an auth.json file that contains credentials to access cp.icr.io and your local private registry. For example:
{
  "auths": {
    "cp.icr.io":{"email":"unused","auth":"<base64 encoded id:apikey>"},
    "<private registry hostname>":{"email":"unused","auth":"<base64 encoded id:password>"}
   }
 }
6. Run the following command to apply the hotfix image:
oc patch AnalyticsEngine analyticsengine-sample --namespace $PROJECT_CPD_INSTANCE --type merge --patch '{"spec": {"image_digests":{"spark-hb-helm-repo":"sha256:56ce9ebf7abd03744db2ea74ef1c80119e51ee5a0b167a3648544baba95558e8"}}}'

What to do next

Make sure to revert the image overrides before you install or upgrade to a newer refresh or a major release of IBM® Cloud Pak for Data.

To revert the image overrides, run the following command to edit the Analytics Engine custom resource (ae):

oc patch AnalyticsEngine analyticsengine-sample --namespace ${PROJECT_CPD_INSTANCE} --type=json --patch '[{ "op": "remove", "path": "/spec/image_digests"}]'
 
 
After approximately 5-6 minutes, the spark-hb-control-plane pod should be up and running with the patched image.
cpd-spark-4-5-3-patch-4
Patch name: cpd-spark-4.5.3-patch-4
Released on: 22 March 2023
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 4.5.3
Applies to platform version: Cloud Pak for Data 4.5.3
Patch type: Image patch (Defect fix)
Description
The patch includes a fix for:
  • Teradata character set errors in Data Refinery.
Validating the patch
You can check if the patch was applied correctly by running the following commands:
  1. Check that the AnalyticsEngine CR status is in "Completed" state by running the following command:

    oc get AnalyticsEngine -o yaml -n ${PROJECT_CPD_INSTANCE}
  2. Verify that both digests were deployed correctly in the cluster using the following commands:

    oc get deploy/spark-hb-control-plane -n ${PROJECT_CPD_INSTANCE}  -o=jsonpath={.spec.template.spec.containers..image}

    The output of the first command should contain: spark-hb-control-plane@sha256:a3df6e607e00864e26ad9e80f48e8334018b441059396f7c68273886132f6554

    oc get cronjob/spark-hb-preload-jkg-image -n ${PROJECT_CPD_INSTANCE} -o=jsonpath={.spec.jobTemplate.spec.template.spec.containers..image}

    The output of the second command should contain: spark-hb-jkg@sha256:252baf42af9189e2ffdfe7bf5628b3421d7ddacc588fef8de68e1f7df23ba77a
  
Applying the patch

To apply the patch in an airgapped environment:
1. Download the images by running the following commands. You need to provide the values for "<folder path>/auth.json" and "<local private registry>".
skopeo copy --all --authfile "<folder path>/auth.json" --dest-tls-verify=false --src-tls-verify=false docker://cp.icr.io/cp/cpd/spark-hb-control-plane@sha256:a3df6e607e00864e26ad9e80f48e8334018b441059396f7c68273886132f6554 docker://<local private registry>/cp/cpd/spark-hb-control-plane@sha256:a3df6e607e00864e26ad9e80f48e8334018b441059396f7c68273886132f6554
skopeo copy --all --authfile "<folder path>/auth.json" --dest-tls-verify=false --src-tls-verify=false docker://cp.icr.io/cp/cpd/spark-hb-jkg@sha256:252baf42af9189e2ffdfe7bf5628b3421d7ddacc588fef8de68e1f7df23ba77a docker://<local private registry>/cp/cpd/spark-hb-jkg@sha256:252baf42af9189e2ffdfe7bf5628b3421d7ddacc588fef8de68e1f7df23ba77a
Prepare the authentication credentials to access the IBM production repository. Use the same auth.json file used for CASE download and image mirroring. An example directory path:
${PROJECT_CPD_INSTANCE}/.airgap/auth.json
Or create an auth.json file that contains credentials to access cp.icr.io and your local private registry. For example:
{
  "auths": {
    "cp.icr.io":{"email":"unused","auth":"<base64 encoded id:apikey>"},
    "<private registry hostname>":{"email":"unused","auth":"<base64 encoded id:password>"}
   }
 }
2. Run the following command to apply the hotfix image:
oc patch AnalyticsEngine analyticsengine-sample --namespace $PROJECT_CPD_INSTANCE --type merge --patch '{"spec": {"image_digests": {"spark-hb-control-plane":"sha256:a3df6e607e00864e26ad9e80f48e8334018b441059396f7c68273886132f6554","spark-hb-jkg-v32": "sha256:252baf42af9189e2ffdfe7bf5628b3421d7ddacc588fef8de68e1f7df23ba77a"}}}'

What to do next

Make sure to revert the image overrides before you install or upgrade to a newer refresh or a major release of IBM® Cloud Pak for Data.

To revert the image overrides, run the following command to edit the Analytics Engine custom resource (ae):

oc patch AnalyticsEngine analyticsengine-sample --namespace ${PROJECT_CPD_INSTANCE} --type=json --patch '[{ "op": "remove", "path": "/spec/image_digests"}]'
 
 
After approximately 5-6 minutes, the spark-hb-control-plane pod should be up and running with the patched image.
 
Cloud Pak for Data 3.5.0 patches
Ensure that you apply the patches for the version of Cloud Pak for Data that is running in your environment:
 
3.5.3 patches
cpd-3.5.3-spark-patch-3
Patch name: cpd-3.5.3-spark-patch-3
Released on: 14 June 2021
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 3.5.3
Applies to platform version: Cloud Pak for Data 3.5.3
Patch type: Cumulative
Description
The patch includes the following fix:
  • Updates the sparklyr RStudio package to version 1.7.0.
Validating the patch
After you apply the patch, check whether the spark-hb-preload-jkg-image cronjob was triggered by running the following command and checking the LAST SCHEDULE column:
oc get cronjob | grep spark-hb-preload-jkg-image
If the cronjob was triggered, run the following command and confirm the output:
oc get pods | grep spark-hb-preload-jkg-image
You can describe one of the pods from the output and check the image tag by running the following command:
oc get pod <podname> -o jsonpath="{..image}" | tr -s '[[:space:]]' '\n' | uniq
The image tags should match the patched image tags (2.4.7.v12.9-3.5.3 and 3.0.2.v1.9-3.5.3) that are included in the patch logs in the list of images required section.
After validating the output, submit a Spark job or kernel request, describe the jkg-deployment or spark-master pod, and check the image tag. The image tag must match the cronjob pulled image tag and the patched image tag.
Instructions: See Applying patches.
cpd-3.5.3-spark-patch-2
Patch name: cpd-3.5.3-spark-patch-2
Released on: 4 June 2021
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 3.5.3
Applies to platform version: Cloud Pak for Data 3.5.3
Patch type: Cumulative
Description
The patch includes the following fix:
  • Updates the sparklyr RStudio package to version 1.6.3.
Validating the patch
After you apply the patch, check whether the spark-hb-preload-jkg-image cronjob was triggered by running the following command and checking the LAST SCHEDULE column:
oc get cronjob | grep spark-hb-preload-jkg-image
If the cronjob was triggered, run the following command and confirm the output:
oc get pods | grep spark-hb-preload-jkg-image
You can describe one of the pods from the output and check the image tag by running the following command:
oc get pod <podname> -o jsonpath="{..image}" | tr -s '[[:space:]]' '\n' | uniq
The image tags should match the patched image tags (2.4.7.v12.8-3.5.3 and 3.0.2.v1.8-3.5.3) that are included in the patch logs in the list of images required section.
After validating the output, submit a Spark job or kernel request, describe the jkg-deployment or spark-master pod, and check the image tag. The image tag must match the cronjob pulled image tag and the patched image tag.
Instructions: See Applying patches.
cpd-3.5.3-spark-patch-1
Patch name: cpd-3.5.3-spark-patch-1
Released on: 21 May 2021
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 3.5.3
Applies to platform version: Cloud Pak for Data 3.5.3
Patch type: Cumulative
Description
The patch includes the following fix:
  • Updated RStudio sparklyr library.
Important:
  • After you apply the patch, it might take two hours for the newer Spark 2.4.7 and Spark 3.0.2 images to be pulled on the worker nodes.
  • cpd-3.5.3-rstudio-patch-1 and cpd-3.5.3-spark-patch-1 are corequisite patches and must both be installed.
Validating the patch
After you apply the patch, check whether the spark-hb-preload-jkg-image cronjob was triggered by running the following command and checking the LAST SCHEDULE column:
oc get cronjob | grep spark-hb-preload-jkg-image
If the cronjob was triggered, run the following command and confirm the output:
oc get pods | grep spark-hb-preload-jkg-image
You can describe one of the pods from the output and check the image tag by running the following command:
oc get pod <podname> -o jsonpath="{..image}" | tr -s '[[:space:]]' '\n' | uniq
The image tags should match the patched image tags (2.4.7.v12.7-3.5.3 and 3.0.2.v1.7-3.5.3) that are included in the patch logs in the list of images required section.
After validating the output, submit a Spark job or kernel request, describe the jkg-deployment or spark-master pod, and check the image tag. The image tag must match the cronjob pulled image tag and the patched image tag.
Instructions: See Applying patches.
3.5.8 patches
cpd-3.5.8-spark-patch-1
Patch name: cpd-3.5.8-spark-patch-1
Released on: 16 February 2023
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 3.5.8
Applies to platform version: Cloud Pak for Data 3.5.8
Patch type: Cumulative
Description
The patch includes the following fix:
  • Restarts the Spark pod after upgrading OpenShift to version 4.8.
 
Validating the patch
After you have applied the patch, check if the spark-hb-helm-repo pod was restarted by running the following command:
oc get pod | grep spark-hb-helm
Then describe one of the pods from the output and check the image tag:
oc get pod <podname> -o jsonpath="{..image}" | tr -s '[[:space:]]' '\n' | uniq
The image tag should match the patched image tag (3.5.8001.3-amd64) that is included in the patch logs in the list of images required section.
Cloud Pak for Data 3.0.1 patches
cpd-3.0.1-spark-patch-5
Patch name: cpd-3.0.1-spark-patch-5
Released on: 14 January 2021
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 3.0.1
Applies to platform version: Cloud Pak for Data 3.0.1
Patch type: Cumulative
Description
The patch includes a fix for:
  • Data Refinery jobs failing with the error "Error getting service instance id using HB instance id."
Instructions
Important: After you apply the patch, it might take two hours for the newer Spark 2.4.7 images to be pulled on the worker nodes.
Once the images are pulled, stop and start the history server so that it can use the newer Spark 2.4.7 images.
cpd-3.0.1-spark-patch-4
Patch name: cpd-3.0.1-spark-patch-4
Released on: 14 December 2020
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 3.0.1
Applies to platform version: Cloud Pak for Data 3.0.1
Patch type: Cumulative
Description
The patch includes the following fixes:
  • Updated project-lib for Python and R to 2.0.0.
  • Security fixes.
  • Support for Python 3.7.
  • Upgraded TensorFlow and its dependencies.
Instructions
Important: After you apply the patch, it might take two hours for the newer Spark 2.4.6 images to be pulled on the worker nodes.
Once the images are pulled, stop and start the history server so that it can use the newer Spark 2.4.6 images.
cpd-3.0.1-spark-patch-3
Patch name: cpd-3.0.1-spark-patch-3
Released on: 7 September 2020
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 3.0.1
Applies to platform version: Cloud Pak for Data 3.0.1
Patch type: Cumulative
Description
This patch supports OpenShift versions 3.11, 4.3, and 4.5.
The patch includes the following fixes:
  • Platform tokens are now supported for the job API.
  • A fix for the History server redirecting to the wrong URL.
Instructions
Important: After you apply the patch, it might take two hours for the newer Spark 2.4.6 images to be pulled on the worker nodes.
Once the images are pulled, stop and start the history server so that it can use the newer Spark 2.4.6 images.
cpd-3.0.1-spark-patch-2
Patch name: cpd-3.0.1-spark-patch-2
Released on: 14 August 2020
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 3.0.1
Applies to platform version: Cloud Pak for Data 3.0.1
Patch type: Additive
Description
This patch supports both the x86 and Power platforms.
The patch includes the following enhancements:
  • Access to a Spark instance using the platform token is now available.
  • A new flag, CHECK_WORKER_EVENTS_ENABLED, disables the check for Kubernetes events while a job is submitted. By default, the flag is set to true. If you are in an environment where PVCs are not mounted in the first few tries, set this flag to false.

    Note: If the CHECK_WORKER_EVENTS_ENABLED flag is set to false and there are not enough resources in the cluster for spark-master/jkg pods, then the following error will occur:

    curl: (52) Empty reply from server.
     
The patch includes the following fixes to the Spark history server:
  • Clicking on the top left Spark icon on the History server now navigates to the Spark history home.
  • Clicking a Spark application no longer intermittently leads to time-outs caused by faulty URL rewrites.
Instructions
Before applying the patch, perform the following tasks:
  1. Delete the existing spark-hb-nginx-configmap and hummingbird-route configmaps using the following commands:
    oc delete cm spark-hb-nginx-configmap
    oc delete cm hummingbird-route
  2. Download the following yaml files: spark-hb-nginx-configmap.yaml and hummingbird-route.yaml.
  3. After you download the files, replace the OC_PROJECT_NAMESPACE and NGINX_RESOLVER values in the yaml files by running the following commands separately on the cluster where you are applying the patch (a filled-in example follows this list):
    sed -i -e 's/OC_PROJECT_NAMESPACE/<namespace where Spark is installed, for example zen>/g' -e 's/NGINX_RESOLVER/<nginx resolver value mentioned below>/g' spark-hb-nginx-configmap.yaml
    sed -i -e 's/OC_PROJECT_NAMESPACE/<namespace where Spark is installed, for example zen>/g' hummingbird-route.yaml

    Important: For the OC_PROJECT_NAMESPACE value, replace it with the oc namespace where Spark is installed.

    For the NGINX_RESOLVER value, use the appropriate value based on the OpenShift Container Platform version:

    • OCP 4.3 Nginx resolver value: dns-default.openshift-dns
    • OCP 3.11 Nginx resolver value: kubernetes.default
  4. Create the configmaps using the following commands:
    oc create -f spark-hb-nginx-configmap.yaml
    oc create -f hummingbird-route.yaml
  5. Use the following commands to confirm that the configmaps were created successfully:
    oc get cm | grep spark-hb-nginx-configmap
    oc get cm | grep hummingbird-route
  6. See Applying patches for additional instructions.
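As a filled-in example of the sed commands in step 3, if Spark is installed in the zen namespace on OCP 4.3 (resolver dns-default.openshift-dns), the commands become:

    sed -i -e 's/OC_PROJECT_NAMESPACE/zen/g' -e 's/NGINX_RESOLVER/dns-default.openshift-dns/g' spark-hb-nginx-configmap.yaml
    sed -i -e 's/OC_PROJECT_NAMESPACE/zen/g' hummingbird-route.yaml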
cpd-3.0.1-spark-patch-1
Patch name: cpd-3.0.1-spark-patch-1
Released on: 26 June 2020
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 3.0.1
Applies to platform version: Cloud Pak for Data 3.0.1
Patch type: Cumulative
Description
The patch includes the following fixes:
  • Removes Spark 2.3 and upgrades the Spark version from 2.4.4 to 2.4.6 to resolve the jackson-databind vulnerability.
  • The install.packages() function in an R Spark environment now uses the correct repository as the default.
  • Fixes OpenJDK security vulnerabilities.
  • Data Refinery now sends the correct account ID when multiple users use the same project.

Instructions
Important: After you apply the patch, it might take two hours for the newer Spark 2.4.6 images to be pulled on the worker nodes.
Once the images are pulled, stop and start the history server so that it can use the newer Spark 2.4.6 images.
Cloud Pak for Data 2.5.0 patches
cpd-2.5.0.0-spark-patch-3
Patch name: cpd-2.5.0.0-spark-patch-3
Released on: nn February 2021
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 2.5.0
Applies to platform version: Cloud Pak for Data 2.5
Patch type: Cumulative
Description
The patch includes the following fixes:
  • Security vulnerability issues are fixed.
  • Added Python 3.7 support for Spark.

Instructions: See Applying patches.
cpd-2.5.0.0-spark-patch-2
Patch name: cpd-2.5.0.0-spark-patch-2
Released on: 23 June 2020
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 2.5.0
Applies to platform version: Cloud Pak for Data 2.5
Patch type: Cumulative
Description
The patch includes the following fixes:
  • Using connected data from some connectors no longer causes errors on a job run.
Instructions: See Applying patches.
cpd-2.5.0.0-spark-patch-1
Patch name: cpd-2.5.0.0-spark-patch-1
Released on: 10 June 2020
Service assembly: spark
Applies to service version: Analytics Engine for Apache Spark 2.5.0
Applies to platform version: Cloud Pak for Data 2.5
Patch type: Cumulative
Description
The patch includes the following fixes:
  • Data Refinery jobs run correctly with the Refinery operations that were applied to the result file, and Japanese data is displayed correctly.
  • Data Refinery jobs with output to Oracle DB run on default Spark 2.4 and R 3.6.
Instructions: See Applying patches.

[{"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSHGYS","label":"IBM Cloud Pak for Data"},"ARM Category":[{"code":"a8m0z000000GpDZAA0","label":"Services and Integrations"}],"ARM Case Number":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Version(s)","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
22 March 2023

UID

ibm15693756