IBM Cloud Pak® for Data Version 4.6 will reach end of support (EOS) on 31 July 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.
Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.6 reaches end of support. For more information, see Upgrading IBM Software Hub in the IBM Software Hub Version 5.1 documentation.
Adding self-signed certificates in Analytics Engine Powered by Apache Spark
You can add your own self-signed certificates or CA certificates that are owned by your organization to the Spark truststore. You add certificates to establish secure connections between the Spark runtime and your resources, such as a web server, IBM Cloud Object Storage, or databases.
You must be a project administrator to add self-signed certificates to the Spark truststore.
To add self-signed certificates:
1. Fetch the internal certificates. Run the following commands to copy the internal CA and TLS certificates to local files:

   ```
   oc get secret internal-tls -n ${PROJECT_CPD_INSTANCE} -o jsonpath='{.data.ca\.crt}' | base64 -d > ca.crt
   oc get secret internal-tls -n ${PROJECT_CPD_INSTANCE} -o jsonpath='{.data.tls\.crt}' | base64 -d > tls.crt
   ```
2. Append the certificates that you want to use to establish secure connections from your Spark notebooks or Spark applications to the local `ca.crt` file. For example, if your external endpoint certificate is `ext.crt`, append it to `ca.crt` as follows:

   ```
   cat ext.crt >> ca.crt
   ```

   Ensure that the contents of the new `ca.crt` file look as follows:

   ```
   -----BEGIN CERTIFICATE-----
   ... existing cert ...
   -----END CERTIFICATE-----
   -----BEGIN CERTIFICATE-----
   ... external endpoint cert ...
   -----END CERTIFICATE-----
   ```
3. Create a Kubernetes secret. When the `ca.crt` file is ready, create a secret in the OpenShift project where IBM Cloud Pak for Data is installed. This example creates a secret named `new-certificates-chain`:

   ```
   # create secret with new certificate chain
   oc create secret generic new-certificates-chain --from-file=ca.crt --from-file=tls.crt -n ${PROJECT_CPD_INSTANCE}
   ```

   The command confirms that `secret/new-certificates-chain` was created.
4. Determine the image name that you must set in the job configuration to use for pod creation:

   ```
   # Find the image to be used for the pod creation
   oc get deploy spark-hb-create-trust-store -o jsonpath="{..image}" -n ${PROJECT_CPD_INSTANCE}
   ```

   The command returns something like:

   ```
   cp.icr.io/cp/cpd/spark-hb-truststore-util@sha256:bb1ac4bba2a201995f07de7995d1055cd571a865b60bc7fad8cbb7879f41150d
   ```
5. Create a Kubernetes pod to update the certificates. Run the following command to deploy the pod, which updates the truststores that are used by the Analytics Engine Powered by Apache Spark service. Before you run the command, replace `REPLACE_WITH_IMAGE` with the image name that was returned in the previous step.

   ```
   # replace REPLACE_WITH_IMAGE with image name
   oc run spark-hb-update-certificates -n ${PROJECT_CPD_INSTANCE} --image $REPLACE_WITH_IMAGE --restart OnFailure --generator=run-pod/v1 --overrides '{"apiVersion":"v1","kind":"Pod","metadata":{"name":"spark-hb-update-certificates","labels":{"app":"spark-hb-update-certificates","run":"spark-hb-update-certificates"}},"spec":{"affinity":{"nodeAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"beta.kubernetes.io/arch","operator":"In","values":["amd64"]}]}]}},"podAntiAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":[{"labelSelector":{"matchExpressions":[{"key":"run","operator":"In","values":["spark-hb-update-certificates"]}]},"topologyKey":"kubernetes.io/hostname"}]}},"containers":[{"args":["bash /opt/ibm/entrypoint/create-trust-store-and-secret.sh changeit spark-hb-java-trust-store spark-hb-os-trust-store /opt/hb/icp4d-certs"],"command":["/bin/sh","-c"],"image":"$REPLACE_WITH_IMAGE","imagePullPolicy":"Always","name":"spark-hb-update-certificates-container","resources":{"limits":{"cpu":"100m","memory":"128Mi"},"requests":{"cpu":"100m","memory":"128Mi"}},"securityContext":{"allowPrivilegeEscalation":false,"capabilities":{"drop":["ALL"]},"privileged":false,"readOnlyRootFilesystem":false,"runAsNonRoot":true,"runAsUser":1000320999},"volumeMounts":[{"mountPath":"/opt/hb/icp4d-certs","name":"icp4d-certs","readOnly":true},{"mountPath":"/opt/ibm/entrypoint/","name":"spark-hb-create-trust-store-secret-script"}]}],"restartPolicy":"OnFailure","serviceAccount":"zen-editor-sa","serviceAccountName":"zen-editor-sa","terminationGracePeriodSeconds":30,"volumes":[{"name":"icp4d-certs","secret":{"defaultMode":420,"secretName":"new-certificates-chain"}},{"configMap":{"defaultMode":420,"items":[{"key":"create-trust-store-and-secret.sh","path":"create-trust-store-and-secret.sh"}],"name":"spark-hb-create-trust-store-secret-script"},"name":"spark-hb-create-trust-store-secret-script"}],"tolerations":[{"effect":"NoExecute","key":"node.kubernetes.io/not-ready","operator":"Exists","tolerationSeconds":300},{"effect":"NoExecute","key":"node.kubernetes.io/unreachable","operator":"Exists","tolerationSeconds":300}]}}'
   ```
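The long `--overrides` payload is easy to break when you edit it. One optional way to catch JSON mistakes before running `oc run` is to validate the string locally; the shortened payload below is a hypothetical stand-in for the full pod spec:

```shell
# Validate a (shortened, hypothetical) overrides payload before passing it to `oc run`.
overrides='{"apiVersion":"v1","kind":"Pod","metadata":{"name":"spark-hb-update-certificates"}}'
echo "$overrides" | python3 -m json.tool > /dev/null && echo "overrides JSON is valid"
```

If the validation fails, fix the quoting or braces before you submit the command to the cluster.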
6. Monitor the pod. It takes about a minute to complete the task:

   ```
   oc get pod spark-hb-update-certificates
   ```
7. When the status of `spark-hb-update-certificates` is `Running`, check the logs:

   ```
   oc logs -f spark-hb-update-certificates
   ```

   Example of the log output:

   ```
   count 3
   secret "spark-hb-java-trust-store" deleted
   exit_code : 0
   count 3
   secret "spark-hb-os-trust-store" deleted
   exit_code : 0
   count 3
   secret/spark-hb-java-trust-store created
   exit_code : 0
   count 3
   secret/spark-hb-os-trust-store created
   exit_code : 0
   ```
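As a final sanity check, all four delete and create operations in the log should report `exit_code : 0`. The snippet below counts them in the sample output; in practice you would pipe `oc logs spark-hb-update-certificates` through the same `grep` instead of the sample text:

```shell
# Count the successful operations in the sample log output; all four should report exit_code 0.
log='count 3
secret "spark-hb-java-trust-store" deleted
exit_code : 0
count 3
secret "spark-hb-os-trust-store" deleted
exit_code : 0
count 3
secret/spark-hb-java-trust-store created
exit_code : 0
count 3
secret/spark-hb-os-trust-store created
exit_code : 0'
echo "$log" | grep -c "exit_code : 0"
# → 4
```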
Parent topic: Administering Analytics Engine Powered by Apache Spark