Configuring Spark application security settings for users

You can define Spark security settings that enable authentication, authorization, and impersonation for the instance group. By default, the security settings are unselected.

Before you begin

Restriction: If security settings are enforced at the cluster level, you cannot change these settings for the instance group. Talk to your cluster administrator for more information.

Ensure that you meet the requirements to create an instance group. See Prerequisites for an instance group.

About this task

When you create the instance group, you can configure the Spark master to authenticate and then authorize the user who is submitting Spark applications. This limits and controls who can submit Spark applications to the Spark master.

You can also enable impersonation to have Spark applications, notebook services, or both, run as the submission user. Impersonation means that the system runs executables under a designated OS account.
Notes:
  • Enabling impersonation without authentication and authorization allows Spark applications to run as any user.
  • If you submit Spark applications in client mode from a host inside the cluster, the user who is logged in to the client host must be the same user that the Spark executors run as: the submission user if impersonation is enabled, or the consumer execution user if impersonation is disabled. Otherwise, you encounter a permission error.
  • If you select Enable impersonation to have Spark applications run as the submission user when creating the instance group, the user that is specified by either the spark.ego.uname or the spark.ego.credential parameter must be the LDAP or OS execution user, rather than a built-in user such as Admin.
  • If you are configuring notebooks for your instance group and you select Enable impersonation to have Spark applications run as the submission user when creating the instance group, the user that logs in to the notebook must be the LDAP or OS execution user, rather than a built-in user, such as Admin or Guest. Spark applications will run as the submission user. Additionally, the built-in Jupyter notebook supports notebook user impersonation; to indicate that notebook services and Spark workload should run as the notebook owner OS user, select the Supports user impersonation option when adding the notebook.
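
The client-mode note above can be checked before you submit an application. The following is a minimal sketch, assuming a POSIX shell on the client host; the function name check_submission_user is hypothetical and is not part of the product. It compares the logged-in OS user with the user that the Spark executors are expected to run as.

```shell
#!/bin/sh
# Hypothetical helper (not a product command).
# $1 = user that the Spark executors run as: the submission user when
#      impersonation is enabled, or the consumer execution user when it is not.
# $2 = optional override for the logged-in user; defaults to `id -un`.
check_submission_user() {
    expected_user="${1:-}"
    current_user="${2:-$(id -un)}"
    if [ -z "$expected_user" ]; then
        echo "usage: check_submission_user EXPECTED_USER [CURRENT_USER]" >&2
        return 2
    fi
    if [ "$current_user" = "$expected_user" ]; then
        echo "OK: logged in as $current_user"
        return 0
    fi
    echo "MISMATCH: logged in as $current_user, executors run as $expected_user" >&2
    return 1
}
```

A nonzero return code indicates that the two users differ, which is the condition the note describes as leading to a permission error in client mode.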

Procedure

  1. From the cluster management console, click Workload > Instance Groups to view a list of existing instance groups, select the instance group to work with, and click Configure.
  2. In the Spark tab, select Enable authentication and authorization for the submission user.
    • Selected, the Spark master authenticates and authorizes the specified user. When you run a Spark application from the spark-submit command, you can add the parameters spark.ego.uname and spark.ego.passwd. For example:
      ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://MasterHost:7077 --conf spark.ego.uname=UserName \
      --conf spark.ego.passwd=Password $SPARK_HOME/lib/spark-examples-1.6.1-hadoop2.6.0.jar 100
    • Unselected, the Spark master trusts all specified users. No password is required. When you run a Spark application from the spark-submit command, you can add the spark.ego.uname parameter with the user name. For example:
      ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://MasterHost:7077 --conf spark.ego.uname=UserName \
      $SPARK_HOME/lib/spark-examples-1.6.1-hadoop2.6.0.jar 100
  3. Select Enable impersonation to have Spark applications run as the submission user.
    • Selected, Spark applications run as the submission user. Additionally, the built-in Jupyter notebook supports notebook user impersonation; to indicate that notebook services and Spark workload should run as the notebook owner OS user, select the Supports user impersonation option when adding the notebook.
    • Unselected, Spark applications run as the consumer execution user for the driver and executor.
  4. Click Modify Instance Group.
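
The two spark-submit variants in step 2 differ only in whether spark.ego.passwd is passed. The following is a minimal sketch of a wrapper that builds the matching --conf options, assuming a POSIX shell; the function name ego_conf_args is hypothetical, and only the spark.ego.uname and spark.ego.passwd parameter names come from the procedure above.

```shell
#!/bin/sh
# Hypothetical helper (not a product command): print the --conf options
# for spark-submit.
# $1 = user name for spark.ego.uname
# $2 = password for spark.ego.passwd; omit it when authentication and
#      authorization is not enabled for the instance group.
ego_conf_args() {
    uname_value="${1:-}"
    passwd_value="${2:-}"
    if [ -z "$uname_value" ]; then
        echo "usage: ego_conf_args USER [PASSWORD]" >&2
        return 1
    fi
    if [ -n "$passwd_value" ]; then
        printf '%s\n' "--conf spark.ego.uname=$uname_value --conf spark.ego.passwd=$passwd_value"
    else
        printf '%s\n' "--conf spark.ego.uname=$uname_value"
    fi
}

# Example use (left unquoted on purpose so that the options split into words;
# this sketch does not handle values that contain spaces):
#   ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
#       --master spark://MasterHost:7077 $(ego_conf_args UserName Password) \
#       $SPARK_HOME/lib/spark-examples-1.6.1-hadoop2.6.0.jar 100
```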

What to do next

  1. Finish configuring the basic settings for the instance group. See Defining basic settings for an instance group.