Setting up started tasks to start and stop Spark processes

Complete this task to set up started tasks to start and stop Spark processes.

About this task

Using started tasks to start and stop Spark processes provides the following benefits:
  • Allows the Spark master, worker, history server, and shuffle service to run on z/OS consistent with running other MVS batch jobs, job steps, or started tasks.
    • Handles START, STOP, and CANCEL command options and writes messages to the MVS console. For more information about the messages that are issued, see Open Data Analytics for z/OS System Messages.
    • Can be extended to take advantage of all of the capabilities that JZOS provides. These capabilities include the use of SYSOUT for output directories, MVS data sets, and DD statements.
  • Maintains flexible configuration of the Java execution environment and the z/OS UNIX System Services environment that the Spark master and worker require.
  • Automation
    • Allows the Spark master, worker, history server, and shuffle service to be managed through customer automation products and policies.
    • Allows automation products to start and stop the master and worker with no parameters, with the assurance that the worker is started against the port on which the master actually started.
    • Allows the worker to retry starting for a period of time if the master is not yet started.
    • Allows enterprise automation management strategies to be applied to the Spark master and worker. These strategies include the following:
      • Started task dependencies, such as staging the starting and stopping of Spark started tasks based on the availability of other started tasks. These tasks can include, but are not limited to, OMVS, MDS, TCPIP, and database servers (Db2® for z/OS, IMS, and more).
      • Failure recovery by restarting the Spark master and worker on any system under automation management control.
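For example, after the started-task procedures are defined, operators or automation products can manage the cluster with standard MVS console commands. The procedure names shown here, SPKMSTR and SPKWRKR, are placeholders; use the names defined at your installation:

```
S SPKMSTR        Start the Spark master started task
S SPKWRKR        Start the Spark worker started task
P SPKWRKR        Stop the worker
P SPKMSTR        Stop the master
C SPKWRKR        Cancel the worker if it does not respond to STOP
```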
The examples of started tasks are based on the following assumptions:
  • JZOS Batch Launcher and Toolkit in IBM® 64-Bit SDK for z/OS Java™ Technology Edition V8 is installed and operational. For installation and configuration instructions and information about messages and return codes from JZOS, see JZOS Batch Launcher and Toolkit: Installation and User's Guide.
  • If you are using Trusted Partner authentication, ensure that a PROGRAM profile is defined for the load module, JVMLDM86. For instructions, see Step 11 in Configuring additional authorities and permissions for the Spark cluster.
  • For each Spark cluster, the spark-env.sh file contains a unique SPARK_IDENT_STRING value. Do not specify $USER or allow it to default to $USER.
  • The user ID that starts and stops the Spark cluster is the SPARKID user ID that was previously created, and SPKGRP is the group that was previously created.
  • The default shell program for SPARKID is bash.
  • Spark is installed in /usr/lpp/IBM/izoda/spark/sparknnn, where nnn is the Spark version. For example, /usr/lpp/IBM/izoda/spark/spark24x for Spark 2.4.8.
  • Spark is configured as described in this document and the required environment variables are set in the following procedures. For more information, see Set and export common environment variables.
  • OMVS must be initialized before the master can start.
  • The directories that are specified by the following environment variables, or the defaults taken when not specified, must exist and have the appropriate authorities. For more information, see Creating the Apache Spark working directories.
    • SPARK_HOME
    • SPARK_CONF_DIR
    • SPARK_LOG_DIR
    • SPARK_PID_DIR
    • SPARK_LOCAL_DIRS
    • SPARK_WORKER_DIR
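As a minimal sketch, the working directories behind these environment variables can be created and permitted from a z/OS UNIX shell. The base path /tmp/spark and the directory names are placeholders, not the documented defaults; substitute your site's values, and note that this sketch does not set SPARKID:SPKGRP ownership:

```shell
# Sketch only: base path and directory names are placeholders.
SPARK_BASE="${SPARK_BASE:-/tmp/spark}"

# Create one directory per Spark working area and allow group write,
# so that members of the Spark group (for example, SPKGRP) can use them.
for dir in conf log pid local work; do
  mkdir -p "$SPARK_BASE/$dir"
  chmod 775 "$SPARK_BASE/$dir"
done

# Point the Spark environment variables at the directories just created.
export SPARK_CONF_DIR="$SPARK_BASE/conf"
export SPARK_LOG_DIR="$SPARK_BASE/log"
export SPARK_PID_DIR="$SPARK_BASE/pid"
export SPARK_LOCAL_DIRS="$SPARK_BASE/local"
export SPARK_WORKER_DIR="$SPARK_BASE/work"
```

In practice these exports belong in spark-env.sh so that the started tasks and interactive shells see the same values.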