IBM Support

Introducing hybrid cloud capability in IBM Spectrum Conductor with Spark 2.2.1

Technical Blog Post


IBM Spectrum Conductor with Spark 2.2.1 introduces a new cloud bursting feature that enables you to dynamically burst workloads from an IBM Spectrum Conductor with Spark cluster to cloud hosts. When workload resource demand exceeds the capacity of resources in the cluster, additional cloud hosts are provisioned and added to the cluster to meet the resource demand. When there is excess capacity in allocated cloud hosts, this excess capacity is returned to the cloud providers.

The cloud bursting capability introduced in IBM Spectrum Conductor with Spark 2.2.1 provides several benefits:

  • Cost savings: Instead of building and maintaining infrastructure to accommodate occasional spikes in resource usage, you can offload workload from the on-premises infrastructure to public clouds. You pay for the additional infrastructure only when it is needed, which reduces the total cost of ownership.
  • Flexibility: With the cloud bursting mechanism, you have the flexibility to use multiple cloud providers and different types of infrastructures and resources, depending on considerations such as workload requirements and resource costs.
  • Improved security: Cloud providers can offer increased security, isolation of servers, and communication over a private network. Combining on-premises and cloud resources can address security and compliance aspects.
  • Scalability: The cloud bursting mechanism provides scalability by extending the cluster dynamically to cloud providers. In addition, drawing on the resources of cloud providers can reduce exposure to outages and downtime.

IBM Spectrum Conductor with Spark 2.2.1 uses a system service called HostFactory to connect with cloud providers. This service runs within an IBM Spectrum Conductor with Spark cluster, and uses plug-ins to communicate with IBM Spectrum Conductor with Spark and with cloud providers.

The IBM Spectrum Conductor with Spark requestor plug-in monitors the workloads of Spark instance groups that are enabled for cloud bursting, and monitors the states and utilization of the cloud hosts in the cluster. Based on the information gathered, the plug-in calculates scale-out and scale-in requests, and provides these requests to the HostFactory service. Scale-out requests add cloud hosts to the cluster, and are generated when workload demand exceeds resource capacity in the cluster. Scale-in requests return cloud hosts to the cloud providers, and are generated when there is excess resource capacity in the cluster. The provider plug-ins work with cloud provider interfaces to provision and return hosts. In IBM Spectrum Conductor with Spark 2.2.1, the supported cloud providers are Amazon Web Services and IBM Cloud (SoftLayer).
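The scale-out and scale-in rule described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: demand and capacity are measured in slots, and the names ScaleRequest and plan_scaling are hypothetical and not part of the product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScaleRequest:
    kind: str   # "scale-out" or "scale-in"
    slots: int  # number of slots to add or to release


def plan_scaling(demand_slots: int, local_slots: int, cloud_slots: int) -> Optional[ScaleRequest]:
    """Decide whether to request more cloud capacity or to release some.

    Hypothetical helper: the real plug-in uses richer workload and host data.
    """
    capacity = local_slots + cloud_slots
    if demand_slots > capacity:
        # Demand exceeds cluster capacity: ask for the shortfall.
        return ScaleRequest("scale-out", demand_slots - capacity)
    # Only cloud capacity can be returned, so cap the release at cloud_slots.
    releasable = min(capacity - demand_slots, cloud_slots)
    if releasable > 0:
        return ScaleRequest("scale-in", releasable)
    return None
```

For example, with 100 local slots, no cloud slots, and a demand of 120 slots, the sketch produces a scale-out request for 20 slots. Excess capacity that sits only on local hosts produces no request, because there is nothing to return to a cloud provider.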

High level architecture of the IBM Spectrum Conductor with Spark requestor plug-in

The HostFactory service uses cloud provider plug-ins to communicate with Infrastructure as a Service (IaaS) cloud services, provision cloud hosts, and return cloud hosts.

The responsibilities of the HostFactory service are:

  1. Allocating hosts from cloud providers.
  2. Returning named hosts to cloud providers.

To obtain requests for provisioning of cloud hosts and returning of cloud hosts, the HostFactory service activates the IBM Spectrum Conductor with Spark requestor plug-in (every 30 seconds by default).
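The periodic activation can be pictured as a simple polling loop. The sketch below is an assumption about the general shape of such a loop, not the HostFactory implementation; run_activation_loop and its parameters are hypothetical, and only the 30-second default comes from the text above.

```python
import time
from typing import Any, Callable, List

DEFAULT_INTERVAL_SECONDS = 30  # default activation interval mentioned above


def run_activation_loop(
    activate: Callable[[], Any],
    cycles: int,
    interval: float = DEFAULT_INTERVAL_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Any]:
    """Call the requestor `cycles` times, pausing `interval` seconds between calls."""
    results = []
    for i in range(cycles):
        results.append(activate())
        if i < cycles - 1:
            # No pause is needed after the final activation.
            sleep(interval)
    return results
```

Passing the sleep function as a parameter keeps the loop easy to exercise without waiting for real time to elapse.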

The responsibilities of the IBM Spectrum Conductor with Spark requestor plug-in are:

  1. Calculating the requirements for additional resources for waiting and running workloads to meet their given completion times.
  2. Calculating which cloud hosts to return to the cloud providers, based on resource utilization information and the expected completion of workloads.
  3. Performing the procedure to detach cloud hosts from the cluster prior to returning the hosts to the cloud providers.
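As a rough illustration of the first responsibility, the following sketch estimates how many extra slots a workload would need to finish by its required completion time. It assumes the remaining work is perfectly parallel and can be expressed in slot-seconds; the function name and this model are assumptions for illustration, not the plug-in's actual calculation.

```python
import math


def additional_slots_needed(
    remaining_slot_seconds: float,
    seconds_until_deadline: float,
    allocated_slots: int,
) -> int:
    """Estimate the extra slots needed to finish the remaining work on time.

    Assumes the work divides evenly across slots (a simplification).
    """
    if seconds_until_deadline <= 0:
        raise ValueError("the completion time has already passed")
    required = math.ceil(remaining_slot_seconds / seconds_until_deadline)
    return max(0, required - allocated_slots)
```

With 3600 slot-seconds of work left, 600 seconds until the deadline, and 4 slots already allocated, the estimate is 6 required slots, so 2 additional slots.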

The IBM Spectrum Conductor with Spark requestor plug-in communicates with the IBM Spectrum Conductor with Spark cluster services. It gets a list of Spark master services running in the cluster from the ASCD service of the IBM Spectrum Conductor with Spark cluster, and it gets lists and attributes of waiting and running Spark applications from the Spark master services running in the cluster. It also gets host level statistics from the resource management layer (EGO) of the IBM Spectrum Conductor with Spark cluster.

The IBM Spectrum Conductor with Spark requestor plug-in maintains workload profiles that facilitate the resource requirements calculations, and maintains host level information that facilitates the host return operations. This information is maintained in a metadata store.

A configuration file for the IBM Spectrum Conductor with Spark requestor plug-in lets you customize the parameters that control the scale-out and scale-in calculations, along with other administrative parameters. This configuration can be modified online.

Configuring and activating the cloud bursting feature

  1. Create an account with a supported cloud service provider, and obtain the credentials for API access for the cloud account.
  2. Prepare a VM image and obtain the image ID. You can prepare either a VM image that is preinstalled with IBM Spectrum Conductor with Spark, or a base VM image (without IBM Spectrum Conductor with Spark).
  3. Prepare a post-provisioning script that runs on a cloud host after it is provisioned, by customizing a post-provisioning script that is included with IBM Spectrum Conductor with Spark. Specifically, you can customize the resource groups that provisioned hosts join. These resource groups are called “hybrid resource groups”, and can include both local and cloud hosts.
  4. Configure cloud provider records in the HostFactory providers configuration file hostProviders.json. A record specifies the name of the provider and the paths to the provider’s plug-in executables and configuration.
  5. Configure the cloud providers’ specific configurations, in configuration files that are associated with each cloud provider. These configuration files include parameters such as authentication information and host template details.
  6. Configure the IBM Spectrum Conductor with Spark requestor record in the HostFactory requestors configuration file hostRequestors.json. The record includes the name of the requestor, the paths to the requestor’s plug-in executables and configuration, and a list of cloud providers to be used for this requestor.
  7. Configure the parameters that control the operation of the IBM Spectrum Conductor with Spark requestor plug-in, using the cws_config.json configuration file. An overview of these parameters is provided in the Scale-out and scale-in configurations section of this blog. These parameters can be modified online.
  8. Start the HostFactory service.
  9. Enable cloud bursting for Spark instance groups from the cluster management console:
    1. Enable the Spark instance group for cloud bursting by selecting the Enable cloud bursting with host factory check box in the Spark instance group properties.
    2. To utilize cloud hosts, a Spark instance group that is enabled for cloud bursting must specify a hybrid resource group in its Spark drivers, Spark executors, and Spark shuffle service resource group parameters.
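
Steps 4 and 6 tie the two HostFactory configuration files together: each requestor record lists the cloud providers it uses, and each of those should have a provider record. The sketch below shows one way to cross-check that relationship. The key names ("providers", "requestors", "name", "scriptPath", "confPath"), the record names "aws" and "cws", and the paths are all illustrative assumptions; consult the product documentation for the actual schema of hostProviders.json and hostRequestors.json.

```python
import json

# Hypothetical record layouts: the key names and values below are
# illustrative assumptions, not the documented schema of either file.
HOST_PROVIDERS_JSON = """
{
  "providers": [
    {"name": "aws", "scriptPath": "/path/to/aws/scripts", "confPath": "/path/to/aws/conf"}
  ]
}
"""

HOST_REQUESTORS_JSON = """
{
  "requestors": [
    {"name": "cws", "scriptPath": "/path/to/cws/scripts", "confPath": "/path/to/cws/conf",
     "providers": ["aws"]}
  ]
}
"""


def unknown_providers(providers_text: str, requestors_text: str) -> dict:
    """Map each requestor name to the listed providers that have no provider record."""
    known = {p["name"] for p in json.loads(providers_text)["providers"]}
    problems = {}
    for requestor in json.loads(requestors_text)["requestors"]:
        missing = [n for n in requestor["providers"] if n not in known]
        if missing:
            problems[requestor["name"]] = missing
    return problems
```

An empty result means every provider named by a requestor has a matching provider record.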

Scale-out and scale-in configurations

The method for selecting workloads for cloud bursting based on their state can be configured with the following options:

  1. Workloads that are waiting for processing and workloads that are running are considered for bursting.
  2. Workloads that are waiting for processing are considered for bursting.
  3. Workloads that have been waiting for processing for longer than a specified amount of time are considered for bursting.

For each workload class, the following can be configured:

  1. A required duration for a workload of the specified class to complete its processing from its submission time.
  2. A duration beyond which a waiting workload of the specified class is considered for bursting.

These parameters can be specified either for a workload class or globally for any workload that does not have these parameters specifically configured.
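
The fallback from class-specific values to global values can be sketched as a two-level lookup. The parameter names, class name, and values here are hypothetical placeholders, not the actual configuration keys.

```python
# Hypothetical parameter names and values, for illustration only.
GLOBAL_DEFAULTS = {
    "completion_seconds": 3600,        # required duration from submission to completion
    "wait_before_burst_seconds": 300,  # waiting time beyond which bursting is considered
}

CLASS_OVERRIDES = {
    "nightly-etl": {"completion_seconds": 1800},
}


def resolve_parameter(workload_class: str, parameter: str) -> int:
    """Use the class-specific value when configured, otherwise the global value."""
    return CLASS_OVERRIDES.get(workload_class, {}).get(parameter, GLOBAL_DEFAULTS[parameter])
```

In this sketch, a class can override one parameter and still inherit the other from the global settings.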

A workload class is a defined group of workloads that have similar characteristics and processing behaviors. The IBM Spectrum Conductor with Spark requestor plug-in maintains a profiling record for each workload class, where the record aggregates statistics based on samples that are collected for workloads within that workload class. This information, which includes an average processing duration and an average compute slot consumption for the workload class, is used to estimate resource requirements in the bursting calculations.
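
A profiling record of this kind can be kept as running averages, so that individual samples do not need to be stored. The sketch below assumes a plain arithmetic mean; the class and field names are hypothetical, and the plug-in may aggregate its samples differently.

```python
from dataclasses import dataclass


@dataclass
class WorkloadClassProfile:
    samples: int = 0
    avg_duration_seconds: float = 0.0
    avg_slots: float = 0.0

    def add_sample(self, duration_seconds: float, slots: float) -> None:
        """Fold one completed workload into the running averages."""
        self.samples += 1
        # Incremental mean: new_avg = old_avg + (x - old_avg) / n
        self.avg_duration_seconds += (duration_seconds - self.avg_duration_seconds) / self.samples
        self.avg_slots += (slots - self.avg_slots) / self.samples


profile = WorkloadClassProfile()
profile.add_sample(duration_seconds=100, slots=4)
profile.add_sample(duration_seconds=200, slots=8)
# The averages are now 150 seconds and 6 slots over 2 samples.
```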

You can configure billing cycle parameters for specific cloud providers, along with default values. This information is used in the host return calculations to optimize resource utilization. It includes the duration of the billing cycle, and the start and end times of the return window relative to the billing cycle duration. For cloud providers for which no billing cycle information is specified, the default information is used.
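
One way to read the return window is as a pair of offsets within each billing cycle, counted from the time the host was provisioned. Under that assumption, which is an interpretation and not a statement of how the plug-in works, checking whether a host is currently inside its return window looks like this (the function and parameter names are hypothetical):

```python
def in_return_window(
    uptime_seconds: float,
    cycle_seconds: float,
    window_start_seconds: float,
    window_end_seconds: float,
) -> bool:
    """Check whether a host's position in its current billing cycle is inside the window."""
    if not (0 <= window_start_seconds < window_end_seconds <= cycle_seconds):
        raise ValueError("the return window must lie within one billing cycle")
    # Position of the host within its current billing cycle.
    position = uptime_seconds % cycle_seconds
    return window_start_seconds <= position < window_end_seconds
```

For an hourly cycle (3600 seconds) with a window from 3000 to 3480 seconds, a host that has been up for 3100 seconds is inside the window, while a host that has been up for 600 seconds is not. Returning a host late in a cycle that has already been paid for is the kind of optimization that billing cycle information makes possible.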

Monitoring and configuration from the cluster management console

You can view information on the cloud bursting activity and update configuration parameters by using the cluster management console:

  1. Click Resources > Cloud > Requests to access the cloud requests page, which shows a list of cloud requests sorted by submission time.
  2. Click Resources > Cloud > Hosts to access the cloud hosts page, which lists cloud hosts, both those provisioned from cloud providers and those returned to the cloud providers.
  3. Click Resources > Cloud > Configuration to access the cloud configuration page, where you can update the HostFactory service configuration and the cloud providers’ host templates.

Additional resources

For further information on the cloud bursting capabilities of IBM Spectrum Conductor with Spark 2.2.1, refer to the Knowledge Center documentation.

If you want to try out IBM Spectrum Conductor with Spark 2.2.1, you can download the evaluation version here! If you have any questions about cloud bursting with host factory, or for any general inquiries, post them in our forum or join us on Slack!
