IBM Support

How to run a self-contained Java Spark application in a Conductor Spark cluster

How To


Summary

Conductor can run Spark applications from the Conductor GUI in two major use cases: batch applications and interactive applications. Some use cases, however, require submitting application code from the command line. This tech note describes how to submit a self-contained Java application from the command line and run it in a Conductor Spark cluster.
A self-contained application is application code that does not require the "spark-submit" command to launch. Instead, it is started with the "java" command, with the Spark parameters passed on the command line. An example of self-contained Java code can be found at https://spark.apache.org/docs/latest/quick-start.html.

Follow these steps to submit the application to a Conductor Spark cluster:

1 Export and install the Conductor external client from the Spark instance group (SIG) that provides the Spark cluster where you want your application to run.

2 Set the JAVA_HOME and SPARK_HOME environment variables.
For example:
[root@host]# env | grep JAVA_HOME
JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.272.b10-1.el7_9.x86_64
[root@host]# env | grep SPARK_HOME
SPARK_HOME=/opt/cws-external-client/spark-2.4.3-hadoop-2.7
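
Before moving on, it can help to confirm that both variables are set and point to real directories. The following is a minimal sketch of such a check; the function name check_home_var is hypothetical and is not part of the Conductor external client.

```shell
# Hypothetical helper (not shipped with Conductor): verify that the named
# variable is set and refers to an existing directory.
check_home_var() {
    var_name="$1"
    # Read the value of the variable whose name was passed in.
    eval "var_value=\${$var_name:-}"
    if [ -z "$var_value" ]; then
        echo "$var_name is not set" >&2
        return 1
    fi
    if [ ! -d "$var_value" ]; then
        echo "$var_name points to a missing directory: $var_value" >&2
        return 1
    fi
    echo "$var_name=$var_value"
}

# Example usage (uncomment on a host where the external client is installed):
# check_home_var JAVA_HOME && check_home_var SPARK_HOME
```

If either check fails, the function prints the reason to standard error and returns a non-zero status, so it can be used to guard the later steps.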

3 Source the Spark environment script.
For example:
[root@host]# set -o allexport
[root@host]# source <Conductor external client path>/conf/spark-env.sh
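
The allexport option marks every variable assigned after it is enabled for export, so values assigned while sourcing a file become visible to child processes such as the JVM started in the next step. The sketch below illustrates that behavior with a temporary stand-in file; the file contents and the variable name DEMO_SPARK_SETTING are hypothetical and do not reflect the actual contents of spark-env.sh.

```shell
# Hypothetical stand-in for spark-env.sh: a plain assignment without "export".
demo_env_file=$(mktemp)
cat > "$demo_env_file" <<'EOF'
DEMO_SPARK_SETTING=from-env-file
EOF

# Enable allexport, source the file, then disable allexport again.
# "." is the portable spelling of "source".
set -o allexport
. "$demo_env_file"
set +o allexport

# A child process (a new shell here, the JVM in the real workflow)
# inherits the variable because it was exported automatically.
child_value=$(sh -c 'printf %s "$DEMO_SPARK_SETTING"')
echo "child sees: $child_value"

rm -f "$demo_env_file"
```

Without allexport, a plain assignment in the sourced file would stay local to the current shell and the child process would see an empty value.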

4 Construct the java command line for the application as follows:
[root@host]# $JAVA_HOME/bin/java \
-cp "/opt/cws-external-client/spark-2.4.3-hadoop-2.7/jars/ego/*:/opt/cws-external-client/spark-2.4.3-hadoop-2.7/jars/*:/opt/cws-external-client/spark-2.4.3-hadoop-2.7/examples/jars/*" \
-Dspark.master="spark://<spark master host in the SIG>:7078" \
-Dspark.executor.memory=2g \
...<put all the settings from spark-defaults.conf as parameters in the CLI> \
org.apache.spark.examples.SparkPi 100

The application should then run in the Conductor Spark cluster and return its results.
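
Copying every setting from spark-defaults.conf onto the command line by hand is error-prone. One possible way to generate the -D options is sketched below. It assumes the file uses the common whitespace-separated "key value" layout with "#" comment lines; the function name conf_to_java_opts is hypothetical, and entries written as key=value are not handled by this sketch.

```shell
# Hypothetical helper: print one -Dkey=value option per line for each
# "key value" entry in a spark-defaults.conf style file.
conf_to_java_opts() {
    conf_file="$1"
    awk '
        # Skip blank lines, comment lines, and keys that have no value.
        NF < 2 || $1 ~ /^#/ { next }
        {
            # The value is everything after the key, minus trailing blanks.
            value = $0
            sub(/^[ \t]*[^ \t]+[ \t]+/, "", value)
            sub(/[ \t]+$/, "", value)
            printf "-D%s=%s\n", $1, value
        }
    ' "$conf_file"
}

# Example usage (uncomment and adjust the path for your installation):
# conf_to_java_opts "$SPARK_HOME/conf/spark-defaults.conf"
```

The output is one option per line so that values containing spaces stay intact; review the generated options before placing them between the -cp argument and the main class name.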

Document Location

Worldwide

Document Information

Modified date:
07 December 2020

UID

ibm16379206