Changing cron job configurations in Analytics Engine Powered by Spark
The Analytics Engine Powered by Apache Spark service uses cron jobs to automate repetitive tasks on the Spark cluster.
The Spark assembly cron jobs fall into two categories:
- Cleanup cron jobs

  The Spark assembly runs auto cleanup to delete unwanted or unused Spark runtimes and Spark jobs. Auto cleanup ensures that no unneeded resources occupy the cluster. Three cleanup cron jobs run every 30 minutes by default.
  - Spark jobs are automatically cleaned up by the spark-hb-job-cleanup-cron cron job based on the following criteria:
    - The Spark job completed successfully but was not deleted by the user before the defined idle timeout was exceeded.
    - The Spark job ran but failed.
  - Spark runtimes started in Watson Studio are automatically cleaned up by the spark-hb-kernel-cleanup-cron cron job based on the following criteria:
    - The Spark runtime was created successfully but has been inactive for longer than the defined idle timeout.
    - The Spark runtime creation failed.
  - The spark-hb-terminating-pod-cleanup-cron cron job removes all Spark runtime pods that are stuck in the terminating state.

  To change the default schedule of the cleanup cron jobs, see Changing the cron job run frequency.
- Cron jobs that cache Spark runtime images on worker nodes

  - The spark-hb-preload-jkg-image cron job ensures that all Spark runtime images are preloaded on the worker nodes and that the images are not garbage collected. By default, this cron job creates 40 pods every 2 hours and makes sure that 32 pods reach completion.

    If you have more than 40 nodes in your cluster, you can change the configuration of the cron job to fit your cluster size. See Changing the cron job configuration to preload images on large clusters.
Releasing Spark runtime resources

Cleanup cron jobs determine whether Spark jobs or runtimes need to be deleted based on the value of kernelCullTime in the Analytics Engine custom resource (CR) YAML file. By default, kernelCullTime is set to 30 minutes.
To change the cleanup frequency, change the value of kernelCullTime in the CR YAML file and change the schedule of the cleanup cron job:
- Change the value of kernelCullTime in the Analytics Engine custom resource (CR) YAML file:
  - Update the kernelCullTime property in the Analytics Engine CR YAML file that was used to set up Analytics Engine Powered by Apache Spark. See Additional installation options. Then apply the changes to the existing deployed CR:

    oc apply -f analyticsengine-cr.yaml -n ${PROJECT_CPD_INSTANCE}

  - Wait for the Analytics Engine CR to be in the Completed state:

    oc get analyticsengine -n ${PROJECT_CPD_INSTANCE}

- Change the schedule of the cleanup cron job. See Changing the cron job run frequency.
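As a sketch, the kernelCullTime change in the CR YAML file might look like the following. The nesting under spec.serviceConfig is an assumption for illustration; check Additional installation options for the exact layout of your CR:

```yaml
# Sketch only: the exact nesting of this property in your CR may differ;
# see "Additional installation options" for the authoritative layout.
spec:
  serviceConfig:
    kernelCullTime: 60    # minutes of inactivity before cleanup (default: 30)
```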
Changing the cron job run frequency
You can change the default run frequency of the cleanup cron jobs. For example, you can change the schedule of the spark-hb-job-cleanup-cron job to run every hour instead of every 30 minutes.
- View the default cleanup cron job schedule:

  oc get cronjobs -l release=ibm-analyticsengine-prod -n ${PROJECT_CPD_INSTANCE}

  The output looks like this:

  NAME                                    SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
  spark-hb-job-cleanup-cron               */30 * * * *   False     0        18m             69m
  spark-hb-kernel-cleanup-cron            */30 * * * *   False     0        18m             69m
  spark-hb-preload-jkg-image              0 */2 * * *    False     0        <none>          69m
  spark-hb-terminating-pod-cleanup-cron   */30 * * * *   False     0        18m             69m
- Change the schedule to 1 hour by updating the kernelCleanupSchedule and jobCleanupSchedule properties in the Analytics Engine CR YAML file that was used to set up Analytics Engine Powered by Apache Spark. See Additional installation options. Then apply the changes to the existing deployed CR:

  oc apply -f analyticsengine-cr.yaml -n ${PROJECT_CPD_INSTANCE}
- Wait for the Analytics Engine CR to be in the Completed state:

  oc get analyticsengine -n ${PROJECT_CPD_INSTANCE}

- View the changed cron job:

  oc get cronjobs -l release=ibm-analyticsengine-prod -n ${PROJECT_CPD_INSTANCE}
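As a sketch, an hourly schedule in the CR YAML file might look like the following. The cron expressions use the standard five-field format (minute, hour, day of month, month, day of week), and the nesting under spec.serviceConfig is an assumption for illustration; see Additional installation options for the exact layout:

```yaml
# Sketch only: "0 * * * *" runs at the top of every hour,
# replacing the default "*/30 * * * *" (every 30 minutes).
spec:
  serviceConfig:
    kernelCleanupSchedule: "0 * * * *"
    jobCleanupSchedule: "0 * * * *"
```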
Changing the cron job configuration to preload images on large clusters

You can change the configuration of the spark-hb-preload-jkg-image cron job that preloads the runtime images on cluster nodes. For example, if you have 100 nodes in your cluster, you can change the number of nodes to which the runtime images are preloaded.
- Get the number of nodes and calculate the parallelism:

  nodes=100
  parallelism=$(($nodes + $(($nodes / 3))))
- Change the imagePullCompletions property to the number of nodes you have and the imagePullParallelism property to the calculated parallelism value in the Analytics Engine CR YAML file that was used to set up Analytics Engine Powered by Apache Spark. See Additional installation options. Then apply the changes to the existing deployed CR:

  oc apply -f analyticsengine-cr.yaml -n ${PROJECT_CPD_INSTANCE}

- Wait for the Analytics Engine CR to be in the Completed state:

  oc get analyticsengine -n ${PROJECT_CPD_INSTANCE}
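The sizing arithmetic in the first step can be checked locally. This sketch mirrors the formula from the procedure (the node count plus a third of it, truncated by shell integer division) for a hypothetical 100-node cluster:

```shell
# Reproduce the sizing calculation for a hypothetical 100-node cluster.
nodes=100
parallelism=$(($nodes + $(($nodes / 3))))   # 100 + 33 = 133 (integer division)

echo "imagePullCompletions=$nodes"
echo "imagePullParallelism=$parallelism"
```

In this example you would set imagePullCompletions to 100 and imagePullParallelism to 133 in the CR YAML file.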
Parent topic: Administering Analytics Engine Powered by Apache Spark