Spark application workflow

A Spark application consists of a driver program and executors that run parallel operations across the cluster. There are two types of Spark applications: Spark notebook applications and Spark batch applications.

A Spark notebook application is launched by a notebook.

A Spark batch application is launched with the spark-submit command in any of the following ways:
  • From the cluster management console (immediately or by scheduling the submission).
  • From the CLI (using the open source spark-submit command in the Spark deployment directory), either inside or outside the cluster.
  • Through the ascd Spark application RESTful APIs.
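As a sketch of the CLI path, a batch submission from the Spark deployment directory might look like the following. The master URL, class name, jar path, and Spark version in the jar name are placeholders, not values from this document; the options shown are standard open source spark-submit options.

```shell
# Hypothetical spark-submit invocation; all paths and host names are placeholders.
cd $SPARK_HOME/bin    # Spark deployment directory (location is an assumption)

./spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://master-host:7077 \
  --deploy-mode cluster \
  $SPARK_HOME/examples/jars/spark-examples_2.12-3.0.0.jar \
  100
```

The same submission can be triggered from the cluster management console or the ascd RESTful APIs without invoking the command directly.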

The following image illustrates the high-level tasks that are typically associated with Spark applications in IBM® Spectrum Conductor:

Figure: High-level flow of tasks associated with Spark applications: launching notebooks, submitting Spark batch applications, scheduling Spark batch applications, creating shared Spark batch applications, and monitoring Spark applications.