Defining basic settings for an instance group

Follow these steps to create and deploy an instance group with basic settings.

Before you begin

Ensure that your environment meets the prerequisites for creating an instance group. See Prerequisites for an instance group.

About this task

You can create an instance group with a minimal configuration, defining only a few required settings and using default values for the rest. While you can modify all settings to meet your requirements, the default configuration lets you quickly create and deploy an instance group.

Procedure

  1. From the cluster management console, click Workload > Instance Groups.
  2. Optional: To use an instance group template, click the Template icon and then click Use.
    Note: To view the full list of final and draft templates, click Show drafts.

    For more information on instance group templates, see Instance group templates.

  3. In the Basic Settings tab, specify the required settings for the instance group.
    1. Enter a name for the instance group. The instance group name must start with a letter and can contain uppercase and lowercase letters, numbers, and hyphens (-), up to a maximum of 80 characters.
    2. Specify the deployment directory for the instance group. Consider the following requirements when you specify the deployment directory:
      • The deployment directory must be unique for each instance group. Otherwise, deployment fails.
      • Ensure that the deployment directory is large enough to accommodate all of the application logs that are stored.
      • For instances (instance groups, Anaconda distribution instances, and application instances) deployed to a shared file system, specify an instance group deployment directory on the shared file system.
      • For notebook deployment, the instance group execution user should be the same as the Anaconda execution user. Otherwise, you must ensure that the instance group execution user has write permission to the Anaconda deployment directory used for deploying notebooks.
    3. Specify the execution user who has permission to run workloads. The execution user owns all files that are created when the instance group is deployed. The file group is the execution user's primary group.
    4. Optional: Specify the administrator user group for the instance group. The administrator user group is assigned permission to all the directories and files within the instance group deployment directory tree.

      The execution user of the instance group must be a member of the specified administrator user group. If this condition is not met, or if the specified administrator user group is not valid, deployment fails.

      If you do not provide the administrator user group here, the system assigns the primary user group of the instance group execution OS user to all the directories and files within the instance group deployment directory tree.
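The naming and deployment-directory rules in the steps above can be pre-checked from a shell before you fill in the form. The following is an illustrative sketch only, not part of the product; the instance group name and deployment directory values are example assumptions you would replace with your own.

```shell
#!/bin/sh
# Preflight checks for instance group basic settings (illustrative sketch).
# IG_NAME and DEPLOY_DIR are example values, not defaults from the product.
IG_NAME="analytics-ig-01"
DEPLOY_DIR="/sharedfs/instancegroups/analytics-ig-01"

# Name rule from the documentation: must start with a letter; may contain
# letters, digits, and hyphens; at most 80 characters total.
if printf '%s' "$IG_NAME" | grep -Eq '^[A-Za-z][A-Za-z0-9-]{0,79}$'; then
    echo "instance group name OK"
else
    echo "invalid instance group name: $IG_NAME" >&2
    exit 1
fi

# The deployment directory must be unique per instance group; an existing,
# non-empty directory suggests another instance group is already using it.
if [ -d "$DEPLOY_DIR" ] && [ -n "$(ls -A "$DEPLOY_DIR" 2>/dev/null)" ]; then
    echo "deployment directory already in use: $DEPLOY_DIR" >&2
    exit 1
fi
echo "deployment directory OK"
```

These checks only mirror the documented rules; the cluster management console performs its own validation when you create the instance group.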

  4. Optional: Change the default Spark version that the instance group must use. For more details, see Configuring Spark versions for an instance group.
  5. Optional: Create a conda environment or link an existing conda environment for the instance group to use. For more details, see Create a conda environment for an instance group.
  6. Optional: Select the notebook to deploy with the instance group. For more details, see Enabling notebooks for an instance group.
  7. Optional: Select security settings that enable authentication, authorization, and impersonation for the instance group. For more details, see Configuring security settings for an instance group.
  8. Optional: Enable high availability for the Spark master service by specifying the directory to store the recovery state. If high availability is not enabled, when the Spark master service goes down, running applications are interrupted and cannot recover after the Spark master restarts.

    The recovery directory is set as the value of the spark.deploy.recoveryDirectory parameter for the instance group. If your configuration defines a default recovery directory for all instance groups (CONDUCTOR_SPARK_DEFAULT_RECOVERY_DIR in ascd.conf), the recovery state for the instance group is stored in a subdirectory under this default directory.

    If you change the default directory, the recovery directory that you specify must be an existing shared directory, and the execution user for the instance group must have read, write, and execute (rwx) permissions on that directory. If you change the default consumer for each component, the execution users for the Spark batch master service and the Spark notebook master service must also have rwx permissions on this directory.
    Note: If you set this directory to a shared NFS directory, you must manually clean up the data under this location when the instance group is removed.

    With high availability enabled, when the Spark master goes down, any allocations for the Spark master are retained for the duration specified by SPARK_EGO_CLIENT_TTL (30 minutes by default). If the Spark master restarts within this duration, it retrieves those allocations for reuse.

  9. Optional: Change the default number of days to retain monitoring data for applications that run in the instance group.
    Monitoring data is stored by default for 14 days after which it is automatically deleted to avoid performance issues as the data grows over time.
  10. Optional: If you have the required permissions, select or search for an EGO user account from the drop-down list to request resources for applications in the instance group. Set this field when you want instance group services to use the credentials of this EGO user account instead of those of the user who creates the instance group. By default, the credentials of the EGO user logged in to the cluster management console are used.

What to do next

  1. Optionally, define other settings for the instance group:
    • Consumers and resources: The default top-level consumer is a consumer with the same name as the instance group (for example, if your instance group name is ABC, then the default top-level consumer is /ABC). The top-level consumer represents the entire cluster and all its resources. The default resource group is ComputeHosts. To modify either of these, see Setting consumers and resource groups for an instance group.
    • Spark Shuffle service: To use the shuffle service when your instances (instance groups, Anaconda distribution instances, and application instances) are deployed to a shared file system, see Enabling and configuring the Spark shuffle service.
    • Containers: To set up the instance group to run within Docker or cgroup containers, see Setting Docker/cgroup container definitions for an instance group. The Containers tab is available only with certain Spark versions; it is not supported with Spark 1.5.2.
    • Packages: To add any extra packages that applications in the instance group require, see Adding dependent packages.
    • Data Connectors: To add data connectors to the instance group, see Adding data connectors.
  2. Create and deploy the instance group.
    • Click Create and Deploy Instance Group to create the instance group and deploy its packages simultaneously. In this case, the new instance group appears on the Instance Groups page in the Ready state. Verify your deployment and then start the instance group.
    • Click Create Only to create the instance group but manually deploy its packages later. In this case, the new instance group appears on the Instance Groups page in the Registered state. When you are ready to deploy packages, deploy the instance group and verify the deployment. Then, start the instance group.