Enabling instance groups for cloud bursting

Enable cloud bursting for instance groups in your cluster so that compute hosts can be provisioned from the cloud when applications associated with those instance groups require extra resources to run their workload.

Before you begin

About this task

When an instance group is enabled for cloud bursting through host factory, the built-in cws requestor monitors the instance group's workload. Whenever the instance group meets the conditions defined for cloud bursting in your cluster, the cws requestor triggers scale-out requests to add cloud hosts and scale-in requests to release those cloud hosts.

An instance group is considered for cloud bursting only when it is in the STARTED, PROCESSING, or ERROR state.

Note: Take into account the following considerations when you enable instance groups in your cluster for cloud bursting:
  • Notebook workloads are not considered in bursting calculations. Do not enable cloud bursting for an instance group that has notebooks associated with it.
  • Long-running Spark workload that is not expected to terminate, such as streaming applications, can reduce the accuracy of workload profiling. Do not enable cloud bursting for an instance group that processes this kind of workload.
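
The state rule and the two considerations above can be sketched as a simple pre-check. This is only an illustration: the InstanceGroup class, its fields, and both function names are hypothetical and are not part of the product or of any host factory API. Only the three state names come from this topic.

```python
from dataclasses import dataclass

# States in which an instance group is considered for cloud bursting
# (taken from this topic).
BURSTABLE_STATES = {"STARTED", "PROCESSING", "ERROR"}


@dataclass
class InstanceGroup:
    """Hypothetical model of an instance group, for illustration only."""
    name: str
    state: str
    bursting_enabled: bool
    has_notebooks: bool = False
    has_long_running_spark: bool = False


def is_considered_for_bursting(group: InstanceGroup) -> bool:
    """Return True if the group is enabled and in a burstable state."""
    return group.bursting_enabled and group.state in BURSTABLE_STATES


def bursting_warnings(group: InstanceGroup) -> list:
    """List reasons why enabling cloud bursting is not recommended."""
    warnings = []
    if group.has_notebooks:
        warnings.append("notebook workloads are not considered in bursting calculations")
    if group.has_long_running_spark:
        warnings.append("long-running Spark workload reduces workload profiling accuracy")
    return warnings


if __name__ == "__main__":
    batch = InstanceGroup("batch-ig", "STARTED", bursting_enabled=True)
    print(is_considered_for_bursting(batch), bursting_warnings(batch))
```

A check of this kind would run before the check box is selected, so that instance groups with notebooks or streaming applications are flagged early.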

Procedure

  1. When you create an instance group, select the Enable cloud bursting with host factory check box. See Setting consumers and resource groups for an instance group.
  2. Ensure that the instance group specifies one of the hybrid resource groups in its Spark drivers, Spark executors, or Spark shuffle service resource group parameters. A hybrid resource group is a resource group to which hosts provisioned from the cloud are assigned through a post-provisioning script. A hybrid resource group can include either both local and cloud hosts or only cloud hosts. Because different tenants might own different cloud accounts, a hybrid resource group can be dedicated to a tenant, and so can a post-provisioning script. See Adding cloud hosts to resource groups.
    Note:
    • Ensure that the Spark master for the instance group does not run in the ManagementHosts resource group. This works around a Spark limitation that prevents the Spark master from binding to public IPs.
    • Ensure that the Spark master for the instance group is assigned to a resource group that has enough resources available to run it. Only a Spark master process that is running can trigger cloud resource provisioning.
    • Ensure that the Spark drivers of the instance group are assigned to run in a resource group whose cloud hosts cannot be reclaimed while drivers are running on these cloud hosts.
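
The resource group requirements in step 2 can be sketched as a small validation helper. This is a minimal illustration under stated assumptions: the function name, its parameters, and the resource group names HybridRG and LocalRG are hypothetical. Only the ManagementHosts name comes from this topic. The sketch does not check available resources or reclaim behavior, which depend on your cluster configuration.

```python
# Resource group in which the Spark master must not run (from this topic).
MANAGEMENT_RG = "ManagementHosts"


def check_resource_groups(hybrid_groups, drivers_rg, executors_rg, shuffle_rg, master_rg):
    """Return a list of problems found in a hypothetical instance group setup.

    hybrid_groups is the set of resource group names that receive cloud hosts
    through a post-provisioning script (assumed to be known to the caller).
    """
    problems = []
    # At least one of the three Spark parameters must use a hybrid resource group.
    if not ({drivers_rg, executors_rg, shuffle_rg} & set(hybrid_groups)):
        problems.append("no Spark parameter specifies a hybrid resource group")
    # The Spark master cannot bind to public IPs when it runs in ManagementHosts.
    if master_rg == MANAGEMENT_RG:
        problems.append("Spark master must not run in " + MANAGEMENT_RG)
    return problems


if __name__ == "__main__":
    # Hypothetical names: HybridRG receives cloud hosts, LocalRG is on premises only.
    print(check_resource_groups({"HybridRG"}, "LocalRG", "HybridRG", "LocalRG", "LocalRG"))
```

An empty list means that the two checks that can be verified from names alone have passed.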

What to do next

  1. Submit workload to instance groups that are enabled for cloud bursting. See Submitting Spark batch applications.
  2. Monitor cloud resource requests. See Monitoring cloud bursting in your cluster.