Submit and monitor a Spark batch application
In this lesson, you submit and monitor a Spark batch application.
About this task
A Spark application is submitted either by the spark-submit command (known as a Spark batch application) or from a notebook. It includes a driver program and executors, and runs various parallel operations in the cluster. Spark batch applications, by default, run as the consumer execution user for the drivers and executors.
spark-submit --class main-class application-jar [application-arguments]
- --class main-class is the fully qualified name of the class that contains the main method for a Java or Scala application. For SparkPi, the main class is org.apache.spark.examples.SparkPi.
- application-jar is the .jar file that contains your application and all its dependencies. For SparkPi, this value might be deployment_dir/spark-2.0.1-hadoop-2.7/examples/jars/spark-examples_2.11-2.0.1.jar.
- (Optional) application-arguments are any arguments that must be passed to the main method of your main class.
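Putting the options above together, a submission of SparkPi from the command line might look like the following sketch. The installation path, master URL, and jar location are assumptions; adjust them to match your deployment. (The sketch composes and prints the command rather than running it, so you can inspect it first.)

```shell
# Assumed install location -- replace with your own deployment directory.
SPARK_HOME=${SPARK_HOME:-/opt/spark-2.0.1-hadoop-2.7}

# Compose the spark-submit invocation for the SparkPi example.
# The trailing "100" is an application argument: the number of
# partitions SparkPi uses when estimating Pi.
CMD="$SPARK_HOME/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://master-host:7077 \
  $SPARK_HOME/examples/jars/spark-examples_2.11-2.0.1.jar 100"

# Print the command; run it with: eval "$CMD"
echo "$CMD"
```

Because the master URL (spark://master-host:7077 here) selects where the application runs, it is the part you are most likely to change between clusters.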
You can use the open source spark-submit command on the CLI from the directory where Spark was deployed. For this tutorial, you use the cluster management console.
You can also submit batch applications by using RESTful APIs. The system converts the command to a REST command that you can use with cURL or other tools and scripting languages to submit the batch application.
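As a rough sketch of REST-based submission, the open source Spark standalone cluster manager exposes a submission gateway (by default on port 6066). The host, port, paths, and property values below are all assumptions for illustration; your cluster manager may expose a different REST API, so check the REST command that the system generates for you.

```shell
# Assumed JSON payload for Spark's standalone REST submission gateway.
# All paths, versions, and the host below are placeholders.
PAYLOAD='{
  "action": "CreateSubmissionRequest",
  "appResource": "file:/path/to/spark-examples_2.11-2.0.1.jar",
  "mainClass": "org.apache.spark.examples.SparkPi",
  "appArgs": ["100"],
  "clientSparkVersion": "2.0.1",
  "sparkProperties": { "spark.app.name": "SparkPi" },
  "environmentVariables": {}
}'
echo "$PAYLOAD"

# To submit, POST the payload to the gateway, for example:
# curl -X POST -H "Content-Type: application/json" \
#      -d "$PAYLOAD" http://master-host:6066/v1/submissions/create
```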
In this lesson, you submit SparkPi, a sample Spark application that is packaged with Spark and computes the value of Pi.
Concept | Description
---|---
finished state | Indicates that application execution is complete.
Spark master | An application option that specifies the master URL of the Spark instance group to which the batch application is submitted. If required, click Change master to specify the Spark master URL where the batch application should be submitted.
running state | Indicates that application execution is in progress.
Spark application | An application started by using either the spark-submit command or a notebook. A Spark application has a driver program and runs various parallel operations in the cluster.
Spark batch application | An application started by using the spark-submit command.
Procedure
To submit a new Spark batch application:
To monitor the application:
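Besides the management console, a running Spark application can also be inspected through Spark's REST monitoring API, served by the application's web UI (port 4040 by default) and, for finished applications, by the history server (port 18080 by default). The host name below is an assumption; the sketch composes and prints the endpoint URL rather than calling a live cluster.

```shell
# Assumed driver host; the application web UI listens on 4040 by default.
URL="http://driver-host:4040/api/v1/applications"
echo "$URL"

# To query a live cluster, fetch the endpoint, for example:
# curl -s "$URL"
# Each returned entry includes the application id, name, and attempt
# details, so you can tell running and finished applications apart.
```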
Results
Summary
In this lesson, you learned how to submit and monitor a Spark batch application.