Session scheduling

IBM® Spectrum Conductor includes a session scheduler daemon, which schedules resources at the application level.

The session scheduler in IBM Spectrum Conductor works similarly to the Spark standalone mode, where a primary host is used for application and resource management. Unlike the Spark standalone mode, however, the session scheduler does not use a worker to manage a driver or executor. Instead, the resource orchestrator uses its own process execution manager (PEM), an agent that provides execution capabilities, such as starting and stopping processes or containers.

With the session scheduler, an application does not connect to the resource orchestrator directly. Instead, it registers with the primary host and acquires resources from it. The primary host, in turn, holds a connection to the resource orchestrator and consolidates the resource requests of all applications. The primary host then acquires resources from the resource orchestrator and offers them to the different applications.
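As a rough mental model only, the following Python sketch mirrors the flow just described: applications register with the primary host, the primary host consolidates their demand, acquires resources from a stand-in resource orchestrator, and offers those resources back to the applications. All class and function names here are illustrative; they are not IBM Spectrum Conductor interfaces.

# Toy model of the register/consolidate/offer flow described above.
# All names are illustrative; they are not IBM Spectrum Conductor APIs.

class Application:
    def __init__(self, name, demand):
        self.name = name          # application identifier
        self.demand = demand      # number of slots the application wants
        self.allocated = 0        # slots offered to it so far

class PrimaryHost:
    """Registers applications, consolidates demand, and distributes offers."""

    def __init__(self):
        self.applications = []

    def register(self, app):
        # An application registers with the primary host instead of
        # connecting to the resource orchestrator directly.
        self.applications.append(app)

    def total_demand(self):
        # The primary host consolidates the demand of all applications.
        return sum(app.demand for app in self.applications)

    def distribute(self, acquired_slots):
        # Offer the slots acquired from the resource orchestrator to the
        # registered applications (in registration order in this toy model).
        remaining = acquired_slots
        for app in self.applications:
            offer = min(app.demand - app.allocated, remaining)
            app.allocated += offer
            remaining -= offer

def resource_orchestrator(requested, cluster_free_slots):
    # Stand-in for the resource orchestrator: it can grant at most
    # the number of slots that are free in the cluster.
    return min(requested, cluster_free_slots)

primary = PrimaryHost()
primary.register(Application("app-a", demand=4))
primary.register(Application("app-b", demand=6))

granted = resource_orchestrator(primary.total_demand(), cluster_free_slots=8)
primary.distribute(granted)
for app in primary.applications:
    print(app.name, app.allocated)   # app-a 4, app-b 4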

Resources can be scheduled not only within an application, but also between applications. The scheduling policy is defined by the SPARK_EGO_APP_SCHEDULE_POLICY parameter in the Spark version configuration of an instance group. The following scheduling policies are supported (a conceptual sketch of their allocation order follows the policy descriptions):
First-in, first-out scheduling policy
First-in, first-out (FIFO) scheduling satisfies the demands of earlier-submitted applications first. With this policy, before executor resources are offered, applications are sorted by submission time and the scheduler allocates resources to the applications that were submitted earlier. Applications that are submitted earlier get all of the demanded resources that the scheduler can provide; applications that are submitted later get the remaining resources.
Priority scheduling policy
Priority scheduling satisfies the demands of higher-priority applications first, with earlier submission time breaking ties. With this policy, before executor resources are offered, applications are sorted by priority and then by submission time. The scheduler allocates resources first by priority and then by submission time, which means that a higher-priority application gets all of the demanded resources that the scheduler can provide, while lower-priority applications get the remaining resources. The application priority is defined by the SPARK_EGO_PRIORITY parameter.

If all applications have the same priority, the priority scheduling policy behaves as first come, first served (FCFS): applications that are submitted earlier get all of the demanded resources that the scheduler can provide, and applications that are submitted later get the remaining resources.

The number of tasks in the different stages of a job can fluctuate widely, so a high-priority job, or a job with an earlier submission time, might demand more resources while those resources are occupied by lower-priority jobs. To limit this behavior, tune the SPARK_EGO_SLOTS_MAX parameter, which sets the maximum number of slots that a driver can acquire.

Reclaim: When a higher-priority application is submitted, or a higher-priority application's demand increases, the scheduler reclaims resources from other applications to meet the requirements of the higher-priority application.

Fair-share scheduling
Fair-share scheduling shares the entire resource pool among all submitted applications. Each application receives its deserved share of resources according to its priority.

The SPARK_EGO_PRIORITY and SPARK_EGO_SLOTS_MAX parameters also apply to fair-share scheduling.

Reclaim: Under-allocated drivers reclaim resources from over-allocated drivers so that they can run more tasks in parallel.
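The allocation order that the FIFO and priority policies describe can be illustrated with a short sketch. The following Python fragment is a conceptual illustration only, not Conductor's implementation: the policy argument stands in for SPARK_EGO_APP_SCHEDULE_POLICY, the priority field for SPARK_EGO_PRIORITY, and the slots_max cap for SPARK_EGO_SLOTS_MAX; the actual values and behavior are governed by the instance group configuration.

# Conceptual sketch of FIFO and priority allocation order.
# Not IBM Spectrum Conductor code; names and structures are illustrative.

def allocate(apps, free_slots, policy="fifo", slots_max=None):
    """apps: list of dicts with 'name', 'priority', 'submit_time', 'demand'."""
    if policy == "priority":
        # Higher priority first; earlier submission time breaks ties.
        ordered = sorted(apps, key=lambda a: (-a["priority"], a["submit_time"]))
    else:  # "fifo": earlier submission time first
        ordered = sorted(apps, key=lambda a: a["submit_time"])

    allocation = {}
    for app in ordered:
        want = app["demand"]
        if slots_max is not None:           # analogue of SPARK_EGO_SLOTS_MAX
            want = min(want, slots_max)
        given = min(want, free_slots)       # earlier/higher apps take what they need
        allocation[app["name"]] = given
        free_slots -= given                 # later/lower apps get what remains
    return allocation

apps = [
    {"name": "early-low", "priority": 1, "submit_time": 1, "demand": 6},
    {"name": "late-high", "priority": 9, "submit_time": 2, "demand": 6},
]
print(allocate(apps, free_slots=8, policy="fifo"))      # {'early-low': 6, 'late-high': 2}
print(allocate(apps, free_slots=8, policy="priority"))  # {'late-high': 6, 'early-low': 2}
print(allocate(apps, free_slots=8, policy="priority", slots_max=4))
# {'late-high': 4, 'early-low': 4}

In the priority run, the later but higher-priority application is served first; capping the slots per driver leaves resources for the lower-priority application, which is the effect that SPARK_EGO_SLOTS_MAX is intended to have.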

The following figure illustrates the flexibility of the scheduling policies, where the session scheduler partitions resources for each tenant (or instance group) based on the scheduling policy:
Figure: Session scheduler partitions resources for each tenant based on the scheduling policy