Spark on EGO: Spark application parameters

In the spark-submit command for a Spark application, you can configure resource management and session scheduling using the Spark on EGO plug-in.

If you are submitting Spark applications from the spark-submit command line in client mode, ensure that you source the profile before submitting the application; otherwise, you cannot access the Spark driver UI.
  • If you are using BASH, run the following command:
    . $EGO_TOP/profile.platform
  • If you are using CSH, run the following command:
    source $EGO_TOP/cshrc.platform
Note: For Spark parameters whose values must be enclosed within a set of quotation marks, ensure that you use single quotation marks (') instead of double quotation marks ("). For example, use '3', not "3".
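
For example, a minimal client-mode submission that follows this quoting convention might look like the following sketch (the application class and JAR path are placeholders):
  spark-submit --conf spark.ego.slots.max='8' \
    --class org.apache.spark.examples.SparkPi \
    /path/to/spark-examples.jar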

By default, Spark batch applications that are submitted through the spark-submit command run as the consumer execution user for the driver and executor. To run batch applications as the submission user instead, select the Enable impersonation to have Spark applications run as the submission user option in the instance group configuration.

spark.ego.access.namenodes

In cluster mode, specifies the name nodes from which the application obtains delegation tokens.

Default: ""

spark.ego.credential

If authentication and authorization are enabled, the spark.ego.credential parameter provides a resource orchestrator token to the command when a job is submitted to the Spark REST server. The credential value is provided by the cluster management console.

spark.ego.dataconnectors

Specifies the names of the data connectors that the application uses. The value can contain multiple data connector names separated by commas (,). Each name must start with a letter, can have a maximum of 100 characters, and can contain uppercase and lowercase letters, numbers, and hyphens (-).
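
For example, to use two hypothetical data connectors named hdfs-prod and object-store-1:
  --conf spark.ego.dataconnectors=hdfs-prod,object-store-1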

spark.ego.distribute.files

In cluster mode, specifies whether to distribute files through a shared file system. If set to true, resource files (JARs and scripts) are uploaded to the staging directory.

Default: false
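
For example, to have a cluster-mode submission upload its JAR files and scripts to the staging directory:
  --conf spark.ego.distribute.files=true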

spark.ego.driverEnv.[EnvironmentVariableName]

Passes the specified environment variable, and its value, to the driver process when it starts in cluster mode.
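
For example, to pass a hypothetical LOG_LEVEL variable with the value DEBUG to the driver:
  --conf spark.ego.driverEnv.LOG_LEVEL=DEBUG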

spark.ego.executor.idle.timeout

Specifies the interval, in seconds, that an executor stays alive without running any workload. This parameter is used only when the environment variable SPARK_EGO_EXECUTOR_SLOTS_RESERVE (or spark.ego.executor.slots.reserve) is defined.

Default: 60

spark.ego.executor.slots.max

Specifies the maximum number of tasks that can run concurrently in one Spark executor. Define a value only after evaluating Spark executor memory and memory usage per task, to prevent the Spark executor process from running out of memory.

Default: Integer.MAX_VALUE
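
As an illustrative sizing sketch (the memory figures are assumptions, not recommendations): if each executor is configured with 8 GB of memory and each task is expected to use up to 2 GB, capping concurrency at four tasks per executor keeps the executor within its memory budget:
  --conf spark.executor.memory=8g \
  --conf spark.ego.executor.slots.max='4'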

spark.ego.executor.slots.reserve

Specifies the minimum number of slots to keep after the executor is started. The number of executor slots that are kept is the minimum of the reserved slots and the owned slots. The timeout for reserved slots is the value of the environment variable SPARK_EGO_EXECUTOR_IDLE_TIMEOUT (or spark.ego.executor.idle.timeout).

Default: 1
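
For example, to keep at least two slots after each executor starts and release idle executors after 300 seconds (both values are illustrative):
  --conf spark.ego.executor.slots.reserve='2' \
  --conf spark.ego.executor.idle.timeout='300'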

spark.ego.gpu.slots.max

Specifies the maximum number of slots that an application can get for GPU tasks in Spark master mode.

Default: Integer.MAX_VALUE

spark.ego.gpu.slots.per.task

For GPU scheduling, specifies the number of slots that are allocated to Spark tasks at the Spark application level:
  • To enable one GPU Spark task to run with multiple EGO slots, specify a positive integer that is greater than or equal to one (for instance, 1, 2, or 3 are valid values). As an example, setting --conf spark.ego.gpu.slots.per.task=2 means that each task can run on a maximum of two EGO slots.
  • To enable multiple GPU Spark tasks to run with a single EGO slot, specify a negative integer that is less than -1 (for instance, -2, -3, or -4 are valid values). As an example, setting --conf spark.ego.gpu.slots.per.task=-2 means that two tasks share a single EGO slot.

For troubleshooting exceptions when using this parameter, see the log files under the $DEPLOY_HOME/spark-version-hadoop-version/logs directory. Note also that if SPARK_EGO_GPU_SLOTS_PER_TASK has been set at the instance group level with a value greater than 1 (as per Spark on EGO instance group parameters), you cannot set spark.ego.gpu.slots.per.task. The instance group level configuration takes precedence.

For CPU scheduling, use spark.ego.slots.per.task, instead of this parameter.

Default: 1

spark.ego.keytab

Specifies the location of the user's keytab file.
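
For example, to authenticate with a keytab file (the user name and file path are placeholders):
  --conf spark.ego.uname=sparkuser \
  --conf spark.ego.keytab=/home/sparkuser/sparkuser.keytab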

spark.ego.master.ui.retained.executor.num

Specifies the number of retained finished executors per application to show in the Spark UI.

Default: 1000
Tip: As a best practice, to control Spark master memory usage, use a smaller value. To change this number, specify --conf spark.ego.master.ui.retained.executor.num=number_of_executors when submitting Spark applications (for example, --conf spark.ego.master.ui.retained.executor.num=500).

spark.ego.passwd

Specifies the password to authenticate the spark.ego.uname user.

Default: ""

spark.ego.priority

Specifies the priority of Spark driver and executor scheduling for submitted drivers and applications. The valid range for scheduling priority is 1 to 10000, with 10000 being the highest priority.

Default: 5000
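
For example, to schedule an application above the default priority (the value is illustrative):
  --conf spark.ego.priority='8000'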

spark.ego.slots.max

Specifies the maximum number of slots that an application can get in Spark master mode.

Default: Integer.MAX_VALUE

spark.ego.slots.per.task

For CPU scheduling, specifies the number of slots that are allocated to Spark tasks at the Spark application level:
  • To enable one CPU Spark task to run with multiple EGO slots, specify a positive integer that is greater than or equal to one (for instance, 1, 2, or 3 are valid values). As an example, setting --conf spark.ego.slots.per.task=2 means that each task can run on a maximum of two EGO slots.
  • To enable multiple CPU Spark tasks to run with a single EGO slot, specify a negative integer that is less than -1 (for instance, -2, -3, or -4 are valid values). As an example, setting --conf spark.ego.slots.per.task=-2 means that two tasks share a single EGO slot.

For troubleshooting exceptions when using this parameter, see the log files under the $DEPLOY_HOME/spark-version-hadoop-version/logs directory. Note also that if SPARK_EGO_SLOTS_PER_TASK has been set at the instance group level with a value greater than 1 (as per Spark on EGO instance group parameters), you cannot set spark.ego.slots.per.task. The instance group level configuration takes precedence.

For GPU scheduling, use spark.ego.gpu.slots.per.task, instead of this parameter.

Default: 1

spark.ego.slots.required

Sets the number of slots, including CPU and GPU slots, that a Spark application requires before it starts launching tasks.
Note: If you use Spark version 2.4.3 or later and Barrier Execution Mode is enabled in your Spark application, do not configure the spark.ego.slots.required parameter.

Default: 0

spark.ego.slots.required.timeout

The time, in seconds, to wait for a Spark application to get the required number of slots, including CPU and GPU slots, before launching tasks. After this time, any slots that are held are released and the application fails.

Default: Integer.MAX_VALUE
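
For example, to wait up to 600 seconds for 16 slots before launching tasks (both values are illustrative):
  --conf spark.ego.slots.required='16' \
  --conf spark.ego.slots.required.timeout='600'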

spark.ego.submit.file.replication

In cluster mode, specifies the replication factor for the uploaded JAR file.

Default: the value of dfs.replication in the Hadoop hdfs-site.xml configuration file.
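
For example, to keep three replicas of the uploaded JAR file (the value is illustrative):
  --conf spark.ego.submit.file.replication='3'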

spark.ego.uname

Specifies the user name to log on to the resource orchestrator.
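
For example, to log on with an explicit user name and its matching spark.ego.passwd value (both values are placeholders):
  --conf spark.ego.uname=Admin \
  --conf spark.ego.passwd=Admin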

spark.ssl.ego.gui.enabled

Specifies that the configured instance group uses HTTPS for its web UIs. This parameter is only used with the RESTful APIs. In the cluster management console, this parameter is configured as a checkbox.

Default: false

spark.ssl.ego.workload.enabled

Specifies that the configured instance group uses SSL communication between the following Spark processes: Spark master and Spark driver, Spark driver and Spark executor, and Spark executor and the shuffle service. This parameter is only used with the RESTful APIs. In the cluster management console, this parameter is configured as a checkbox.

Default: false