Accessing the Spark history server

The Spark history server allows you to view the stages of running and completed Spark applications on a watsonx.data instance.

If you want to analyze how different stages of your Spark application performed, you can view the details in the Spark history server UI.

Required permissions
To submit Spark runtime, you must have the User role.

About this task

Applies to:

Spark engine

Apache Gluten accelerated Spark engine

Procedure

  1. Log in to the watsonx.data console.
  2. From the navigation menu, select Infrastructure manager.
  3. Click the name of the Spark engine (from list view or topology view). The engine information window opens.
  4. In the Spark history tab, click Start history server.
    1. Provide the following information:

      • Select the Capacity type to view the history details. You can select Dedicated (if configured) or On-demand. Choose the capacity from the list and specify the CPU cores and memory details.
  5. By default, the Spark history server consumes 1 CPU core and 4 GB of memory while running. To use more resources, enter values for Cores and Memory.
  6. Click Start.
    The Spark history server starts.
  7. Click View Spark history. The History Server page opens.
    From the History Server page you can:
    • View the list of completed Spark applications and details such as the application ID, duration, and event log for each application.
    • Download the event log of a Spark application. Click the Download link inside the Event Log column.
    • View the details of an application. Click the application ID link. The Spark runtimes page opens. This page displays details such as the different stages of execution, the storage used, the Spark environment and executor (memory and driver) details.
    Note: The History Server displays detailed storage events only if the Spark application was submitted with the configuration parameter spark.eventLog.logBlockUpdates.enabled set to true. Log links under the Stages and Executors tabs of the Spark history server UI do not work because logs are not preserved with the Spark events. To review the task and executor logs, enable platform logging.
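
The note above depends on the application being submitted with spark.eventLog.logBlockUpdates.enabled set to true. The following Python sketch shows one way to make sure that setting is always included in the Spark configuration for a submission. The function name build_submission_payload, the payload layout (application_details, application, conf), and the application path are illustrative assumptions, not the documented watsonx.data request format; only the configuration key comes from this topic.

```python
import json


def build_submission_payload(application_path, extra_conf=None):
    """Return a submission payload that enables detailed storage events.

    The payload layout is hypothetical; adapt it to the request format
    that your Spark engine expects.
    """
    # Start from any caller-supplied Spark settings.
    conf = dict(extra_conf or {})
    # Required for the History Server to show detailed storage events.
    # Applied last so that a caller-supplied value cannot disable it.
    conf["spark.eventLog.logBlockUpdates.enabled"] = "true"
    return {
        "application_details": {
            "application": application_path,  # hypothetical location
            "conf": conf,
        }
    }


# Example: add the setting alongside another Spark option.
payload = build_submission_payload(
    "s3a://my-bucket/app.py",  # hypothetical application path
    extra_conf={"spark.executor.memory": "2g"},
)
print(json.dumps(payload, indent=2))
```

Because the event log setting is applied after the caller's settings, the resulting payload always carries it with the value true.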

Automatically terminating idle Spark history servers

About this task

To optimize resource utilization, the Spark History Server is automatically terminated when it remains idle beyond a configurable duration. This prevents unnecessary consumption of compute resources when the History Server is not actively serving requests.

The idle monitoring mechanism periodically checks for user activity on the Spark History Server. If no activity is detected within the configured timeout window, the service is gracefully stopped.

serviceConfig.historyServerIdleMonitor:
    enabled: true
    idleTimeoutMinutes: 30
    checkIntervalSeconds: 60
  1. When enabled, the idle monitor tracks incoming requests to the Spark History Server UI.
  2. If no activity is detected for the duration specified by idleTimeoutMinutes, the History Server is automatically shut down.
  3. The inactivity check runs at intervals defined by checkIntervalSeconds.
  4. When new activity is required (for example, accessing the History Server UI again), the service can be restarted based on the platform’s standard lifecycle behavior.
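
The idle monitoring behavior described above can be illustrated with a small simulation. This is a minimal Python sketch of the general technique (track the last activity time, check it periodically, stop after the timeout), not the platform's implementation. The class IdleMonitor and the function first_idle_check are hypothetical names, and treating the server as idle when the elapsed time is greater than or equal to the timeout is an assumption. The field names mirror enabled, idleTimeoutMinutes, and checkIntervalSeconds from the configuration shown earlier.

```python
from dataclasses import dataclass


@dataclass
class IdleMonitor:
    """Hypothetical model of the history server idle monitor."""

    enabled: bool = True
    idle_timeout_minutes: int = 30
    check_interval_seconds: int = 60
    last_activity: float = 0.0  # time of the last UI request, in seconds

    def record_activity(self, now):
        """Record an incoming request to the History Server UI."""
        self.last_activity = now

    def is_idle(self, now):
        """Return True if the idle timeout has elapsed since the last request."""
        if not self.enabled:
            return False
        # Assumption: idle means elapsed time >= the configured timeout.
        return now - self.last_activity >= self.idle_timeout_minutes * 60


def first_idle_check(monitor, start, end):
    """Simulate periodic checks from start to end (seconds).

    Returns the time of the first check that finds the server idle,
    or None if no check in the window does.
    """
    t = start
    while t <= end:
        if monitor.is_idle(t):
            return t
        t += monitor.check_interval_seconds
    return None


monitor = IdleMonitor()                 # defaults: 30 minutes, 60 seconds
monitor.record_activity(1000)           # last UI request at t = 1000 s
stop_time = first_idle_check(monitor, start=1020, end=4000)
print(stop_time)                        # 2820
```

In this simulation the timeout elapses at 2800 seconds, but the shutdown is detected at 2820 seconds because checks run only every checkIntervalSeconds. A shorter check interval narrows that gap at the cost of more frequent checks.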