Monitoring Spark applications

You can monitor Apache Spark clusters and applications to retrieve information about their status. For each application, this information includes an ID that uniquely identifies the application.

Note:
  • Each Spark user has exactly one Spark cluster. A single Db2® Warehouse system can host at most 10 Spark clusters. A cluster that has no applications currently running in it is said to be idle. The system removes an idle cluster when its resources are needed, for example, during a cleanup action or when an application is submitted that would otherwise require an 11th cluster. As a result, an idle cluster might be removed before you have a chance to monitor its (no longer running) applications.
  • As of Spark version 2.0, you can no longer access event information, such as jobs, tasks, and data sizes, for completed jobs from the Spark monitoring pages. However, Spark event data is still collected and stored as JSON data that you can download in the app-application_id file, as described in Checking the results of a Spark application or cluster.
    For a formatted HTML display of the JSON data, you can use a local Spark installation by running the following command:
    sbin/start-history-server.sh <path-to-directory-containing-JSON-file>

Monitoring Spark clusters and applications using the Db2 Warehouse web console

To monitor Spark clusters and applications on Db2 Warehouse:

  1. Open the Db2 Warehouse web console.
  2. Do one of the following:
    • Click Analytics > Spark Analytics > Open the Spark Application Monitoring Page.
    • Click Monitor > Workloads, and then click the Spark tab. This page displays the user names of the clusters that you are authorized to monitor and the number of applications currently running in each cluster. Click a user name to open the Spark monitoring page for the corresponding cluster.

Monitoring Spark applications using the Livy web user interface

Note: You can use this method only if you launch the Spark application through an Apache Livy server as described in Launching a Spark application through an Apache Livy server.
To use the web user interface, enter the following URL:
http://host:8998
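The Livy server also exposes a REST API on the same port, which you can script against instead of using the browser. The following Python sketch builds the monitoring URL and reduces a /batches response to (id, state) pairs; the host name "host" is a placeholder, and the /batches endpoint and response shape are taken from the Apache Livy REST API documentation:

```python
import json

def livy_batches_url(host, port=8998):
    # Default Livy port, matching the URL shown above.
    return f"http://{host}:{port}/batches"

def summarize_batches(payload):
    """Reduce a Livy /batches JSON response to (id, state) pairs."""
    doc = json.loads(payload)
    return [(s["id"], s["state"]) for s in doc.get("sessions", [])]

# Abridged response shape, per the Livy REST API documentation:
sample = '{"from":0,"total":1,"sessions":[{"id":0,"state":"running"}]}'
print(livy_batches_url("host"))   # http://host:8998/batches
print(summarize_batches(sample))  # [(0, 'running')]
```

An HTTP GET against the returned URL (for example, with curl) yields the JSON that summarize_batches expects.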

Monitoring Spark clusters and applications using the Spark command-line tool

Use the spark-submit.sh script to issue commands that return the status of your cluster or of a particular application. The DASHDBJSONOUT environment variable or the --jsonout option determines whether the output is returned in JSON or human-readable format.
spark-submit.sh --cluster-status
This command retrieves the status of your Spark cluster. For example:
status: Running
statusDesc: Cluster is running.
running jobs: 1
spark-submit.sh --app-status submission_ID
This command retrieves the status and application ID of the application with the specified submission ID. (To find the submission ID, check the output of the command that was used to submit the application.) For example:
status: Running
statusDesc: Application is running.
applicationId: app-20160815210613-0000
submissionId: 20160815210608126000
spark-submit.sh --list-apps
This command retrieves information about all applications that are currently running or that ran since the cluster was last started. For example:
Submission ID               Application ID                 Status
-----------------------     --------------------------     ---------
20160830111241465001        app-20160830113805-0004        Ended
20160830111241465002        app-20160830113807-0006        Ended
20160830111241465003        app-20160830113809-0003        Ended
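If you script against the human-readable output shown above, the key: value lines can be parsed generically. A minimal sketch, using the --app-status output from the example above as input:

```python
def parse_status(text):
    """Parse 'key: value' lines, as printed by spark-submit.sh, into a dict."""
    result = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)  # split on the first colon only
            result[key.strip()] = value.strip()
    return result

# --app-status output from the example above
sample = """status: Running
statusDesc: Application is running.
applicationId: app-20160815210613-0000
submissionId: 20160815210608126000"""

info = parse_status(sample)
print(info["applicationId"])  # app-20160815210613-0000
```

For machine consumption, the --jsonout option (or DASHDBJSONOUT) is the more robust choice, because JSON output can be parsed with a standard library instead of ad hoc splitting.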

Monitoring Spark clusters and applications using the IDAX.APP_STATUS stored procedure

From within a database connection, issue a CALL statement that calls the IDAX.APP_STATUS stored procedure to retrieve the status of a particular application. The procedure returns the application status information as an SQL result set.
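As a sketch, the call could be issued from a Python client through the ibm_db driver. The single-parameter signature (submission ID) used here is an assumption, not confirmed by this page; check the IDAX.APP_STATUS stored-procedure reference for the actual parameters:

```python
# Assumption: IDAX.APP_STATUS takes the submission ID as its one argument.
# Consult the stored-procedure reference for the real signature.
APP_STATUS_SQL = "CALL IDAX.APP_STATUS(?)"

def fetch_app_status(conn_str, submission_id):
    """Run the CALL statement and return the first row of the result set."""
    import ibm_db  # IBM Db2 driver, installed separately (pip install ibm_db)
    conn = ibm_db.connect(conn_str, "", "")
    try:
        stmt = ibm_db.prepare(conn, APP_STATUS_SQL)
        ibm_db.execute(stmt, (submission_id,))
        return ibm_db.fetch_assoc(stmt)
    finally:
        ibm_db.close(conn)
```

The parameter marker (?) keeps the submission ID out of the SQL text, which is preferable to string concatenation in any client language.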

Monitoring Spark clusters and applications using the IBM Db2 Warehouse Analytics API

Use the IBM® Db2 Warehouse Analytics API to submit an HTTP GET request that calls one of the following endpoints:
  • /dashdb-api/analytics/public/monitoring/cluster_status
  • /dashdb-api/analytics/public/monitoring/app_status
For example:
  • To retrieve the status of your cluster (or, if you are an administrator, of all clusters), issue the following cURL command (replace the user ID, password, and host name):
    curl --user "userid:password" \
      -X GET "https://hostname:8443/dashdb-api/analytics/public/monitoring/cluster_status"
    The result contains the status of all the clusters that you are authorized to monitor and the number of applications that are currently running in each cluster. For example:
    { "statusDesc":"The cluster is running.","resultCode":200,"clusters":[
      {"running_apps":0,"monitoring_url":"\/sparkui\/user1","username":"user1"}],"username":"user1","status":"running"}
  • To retrieve the status of all the Spark applications in your cluster, issue the following cURL command (replace the user ID, password, and host name):
    curl --user "userid:password" \
      -X GET "https://hostname:8443/dashdb-api/analytics/public/monitoring/app_status"
    The result contains the status of all the Spark applications in your cluster. For example:
    { "statusDesc":"Applications are currently running.", "resultCode":200, "username":"user1", "status":"running",
       "apps":[  
          { "submissionId":"20160701085716886000",
             "exitInfo":{ "code":"", "details":[], "message":""},
             "applicationId":"app-20160701085718-0012",
             "status":"running"
          },
          {  
             "submissionId":"20160701084709860000",
             "exitInfo":{ "code":"0", "details":[], "message":""},
             "applicationId":"app-20160701084750-0010",
             "status":"ended"
          }
       ]
    }
    This example shows two Spark applications:
    • The application with submission ID 20160701085716886000 is running.
    • The application with submission ID 20160701084709860000 has ended.
    Note: The result code (200) refers to the request for status information, not to the applications about which information was retrieved.
  • To retrieve the status of an individual application, specify its submission ID in the request. For example:
    curl --user "userid:password" \
      -X GET "https://hostname:8443/dashdb-api/analytics/public/monitoring/app_status?submissionid=20160701085716886000"
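The JSON returned by these endpoints is straightforward to post-process. The following sketch filters the app_status example response shown earlier down to the applications that are still running (the sample document is abridged from that example):

```python
import json

# Abridged app_status response from the example above.
sample = """{"statusDesc":"Applications are currently running.","resultCode":200,
 "username":"user1","status":"running",
 "apps":[
   {"submissionId":"20160701085716886000",
    "exitInfo":{"code":"","details":[],"message":""},
    "applicationId":"app-20160701085718-0012","status":"running"},
   {"submissionId":"20160701084709860000",
    "exitInfo":{"code":"0","details":[],"message":""},
    "applicationId":"app-20160701084750-0010","status":"ended"}]}"""

doc = json.loads(sample)
# Keep only the submission IDs of applications whose status is "running".
running = [a["submissionId"] for a in doc["apps"] if a["status"] == "running"]
print(running)  # ['20160701085716886000']
```

The same filtering works on the output of the curl commands above when it is piped into a script.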