WML for z/OS leverages the high-performance,
general execution engine technology of z/OS Spark, a core component of IBM® Open Data Analytics for z/OS®. Built on Apache Spark,
z/OS Spark is capable of large-scale data
processing and in-memory computing. You must install and configure z/OS Spark for WML for z/OS.
Before you begin
The procedure below requires that you run the create-spark-runtime.sh
script. The script is included in the SMP/E image for WML for z/OS. You can complete the procedure to configure and
start Spark before or during the
installation of WML for z/OS. Follow the instructions in
Installing and configuring WML for z/OS base on z/OS to run the SMP/E
program and extract the create-spark-runtime.sh script.
Procedure
- Locate the create-spark-runtime.sh script in the <install_dir_zos>/imlpython/bin directory.
- Run the create-spark-runtime.sh script as shown below:
./create-spark-runtime.sh
The script gathers the needed information and then performs a sequence of tasks to create and
configure a Spark runtime environment for
WML for z/OS. If a fatal error occurs, the script stops;
fix the error and rerun the script.
- When prompted, respond by either making selections or entering requested information:
- Before configuring z/OS Spark,
the script checks whether the Spark installation
is at the required PTF (build) level, which is 2.2.0.12 or 2.3.3. If the required level is not
met, the script stops and the configuration fails. In that case, apply the
required Spark PTF and rerun the
script.
- Confirm or decline to enable Spark
client authentication.
- Enter the IP address of your Spark
master.
- Enter the port number for your Spark master
(or enter Y to accept the default port 7077).
- Enter the port number for the Spark master
REST API (or enter Y to accept the default port 6066).
- Enter the port number for the Spark web UI
(or enter Y to accept the default port 8080).
The IP addresses and port numbers that you entered or accepted are saved in the $IML_HOME/spark/conf/spark-defaults.conf and $IML_HOME/spark/conf/spark-env.sh files.
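For orientation, these files use the standard Apache Spark standalone configuration syntax. The excerpt below is an illustrative sketch, not the exact entries the script writes; the address and port values are placeholders:

```shell
# $IML_HOME/spark/conf/spark-env.sh -- illustrative values only
SPARK_MASTER_HOST=10.1.1.1        # IP address entered for the Spark master
SPARK_MASTER_PORT=7077            # Spark master port
SPARK_MASTER_WEBUI_PORT=8080      # Spark web UI port

# $IML_HOME/spark/conf/spark-defaults.conf builds the master URL
# from the same address and port, for example:
#   spark.master   spark://10.1.1.1:7077
```

SPARK_MASTER_HOST, SPARK_MASTER_PORT, and SPARK_MASTER_WEBUI_PORT are standard Apache Spark standalone variables; verify the actual entries in your own files.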
After gathering all required information, the script starts the Spark master and worker processes. If the processes
are successfully started, you will see a message similar to the following example:
Starting Spark master...
Spark master started successfully
Starting Spark worker...
Spark worker started successfully
Congratulations! You have successfully configured and started Spark.
Check the parameters used for Spark under $IML_HOME/spark/conf
If any error occurs, you will be directed to the Spark master and worker configuration logs. Review
the logs, fix the error, and start the processes manually.
- Issue the following command to start the Spark master
process:
start_master
- Issue the following command to start the Spark worker
process:
start_slave
If needed, add the start_master and start_slave commands to
the $IML_HOME/.profile
file for the <mlz_setup_userid>.
By default, the Spark master and worker
processes use SPARKM1A and SPARKW1A as their respective job names.
You can change these job names by editing the $IML_HOME/.profile file.
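How the job names are applied depends on how start_master and start_slave are defined in your .profile. A common z/OS UNIX technique is the _BPX_JOBNAME environment variable, which names the address space a command spawns (and requires READ access to the BPX.JOBNAME FACILITY profile). The fragment below is a sketch of that technique, not the product's actual .profile contents:

```shell
# Illustrative $IML_HOME/.profile fragment -- adapt to how
# start_master and start_slave are actually defined on your system.
export _BPX_JOBNAME='SPARKM1A'   # job name for the Spark master
start_master
export _BPX_JOBNAME='SPARKW1A'   # job name for the Spark worker
start_slave
```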
If you've enabled Spark client
authentication, the script does not start the Spark master and worker processes for you. Instead, you will
see a message similar to the following example:
Congratulations! You have successfully configured Spark.
Customize your system to use AT-TLS for client authentication.
Follow instructions at https://www.ibm.com/support/knowledgecenter/
SS3H8V_1.1.0/com.ibm.izoda.v1r1.azka100/topics/azkic_c_configclientauth.htm.
Run start-master.sh to start Spark master and spark-slave.sh to start Spark
worker: spark-slave.sh spark://$MLZ_SPARK_HOST:$MLZ_SPARK_PORT.
You must customize your system to use AT-TLS for client authentication by following the instructions
in Configuring client authentication for z/OS Spark. After you complete your system setup, start the
Spark master and worker manually.
- Verify that z/OS Spark is successfully
configured and started on your system.
Issue the following command to retrieve the name of the Spark example jar file:
ls -ls $SPARK_HOME/examples/jars | grep examples
Issue the following command to verify the availability of Spark master and Spark worker:
$SPARK_HOME/bin/spark-submit --class org.apache.spark.examples.SparkPi
--master spark://<host_IP_address>:<sparkMaster-port>
$SPARK_HOME/examples/jars/<spark-examples_xxx.jar>
where xxx is the version of the Spark examples jar file. Spark is properly configured and functioning normally
if you see a response similar to the following example:
Pi is roughly 3.13742.
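Because the examples jar name embeds the Spark version, a small helper can resolve it instead of typing it by hand. The helper name below is illustrative, not part of the product, and the commented usage assumes the master address and port you entered earlier:

```shell
#!/bin/sh
# Sketch: locate the versioned spark-examples jar under a Spark
# installation so the verification command need not hard-code the
# version string.
resolve_examples_jar() {
  # $1: the Spark installation root (for example, $SPARK_HOME)
  ls "$1"/examples/jars/spark-examples_*.jar 2>/dev/null | head -n 1
}

# Usage (substitute your master's IP address and port):
#   JAR=$(resolve_examples_jar "$SPARK_HOME")
#   $SPARK_HOME/bin/spark-submit --class org.apache.spark.examples.SparkPi \
#     --master spark://<host_IP_address>:<sparkMaster-port> "$JAR"
```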
What to do next
The create-spark-runtime.sh script configures your z/OS Spark with preset (default) attributes as a starting
point. You can review these attributes and settings, along with the IP addresses and port numbers
you entered in Step 3, in the $IML_HOME/spark/conf/spark-defaults.conf and $IML_HOME/spark/conf/spark-env.sh files.
Adjust your Spark configuration and
optimize its performance based on your actual workload over time. See Configuring Workload Manager for z/OS Spark and Memory and
CPU configuration options for instructions.
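As one illustration of the kind of tuning those topics cover, memory and CPU limits can be set through standard Spark properties in $IML_HOME/spark/conf/spark-defaults.conf. The values below are placeholders, not recommendations; size them from your measured workload:

```shell
# $IML_HOME/spark/conf/spark-defaults.conf -- illustrative values only.
# Memory for the driver process:
spark.driver.memory      2g
# Memory per executor:
spark.executor.memory    4g
# Cap on the total cores each application may use:
spark.cores.max          4
```

spark.driver.memory, spark.executor.memory, and spark.cores.max are standard Apache Spark properties; see the linked topics for the z/OS-specific guidance.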