Overview

IBM Spectrum Conductor lets you confidently deploy modern computing frameworks and services in a multitenant enterprise environment, both on-premises and in the cloud.

As an end-to-end solution, IBM Spectrum Conductor offers the following benefits:
  • Provides multitenancy through instance groups. You can deploy modern computing frameworks and services, such as Spark and Anaconda, efficiently and effectively, supporting multiple versions and instances of each framework and service.
  • Increases performance and scale through granular and dynamic resource allocation for instance groups that share a resource pool.
  • Maximizes usage of resources and eliminates silos of resources that would otherwise each be tied to separate application implementations.
  • Integrates with Docker so that you can run instance groups in Docker containers.
  • Integrates with IBM Data Science Experience (DSX) Local or Desktop so that you can use notebooks from DSX with instance groups to submit Spark workload.
  • Provides flexible and efficient data management for shared storage and high availability by connecting to existing storage infrastructure, such as NFS mounts to a file system or IBM Spectrum Scale. IBM Spectrum Scale specifically is a cluster file system that provides concurrent access to a single file system or set of file systems from multiple hosts. It enables high-performance access to this common set of data to support a scale-out solution and provide a high-availability platform.

Instance group workflow

Within IBM Spectrum Conductor, an instance group is an installation of Apache Spark that can run Spark core services (Spark master, shuffle, and history), Anaconda distribution instances, and notebooks as configured. You can create and run multiple instance groups, associating each instance group with different Spark version packages as required.

Instance groups are comparable to the Spark notion of tenants and provide multitenancy in IBM Spectrum Conductor.

The following diagram illustrates at a high level how instance groups and Spark applications are created and managed in IBM Spectrum Conductor:

Diagram: tasks that are associated with instance groups, namely creating, verifying, deploying, starting, controlling, stopping, modifying, and removing Spark instance groups; creating a consumer; creating resource groups; creating notebooks and assigning notebook owners; adding Spark versions; creating dependent packages; deploying Anaconda distributions; and the Spark application, notebook, and Anaconda workflows.

  1. Create and verify an instance group.

    Optionally, you can create consumers, resource groups, and notebooks, add Spark versions, or create dependent packages.

  2. Deploy Spark (including any dependent packages and notebooks) to hosts in the instance group.
  3. After deployment is complete, you can manage the instance group from the Instance Groups page in the cluster management console. You can also remove the instance group, map notebooks to users, or add users.
  4. Next, you can submit Spark applications to the instance group, monitor Spark applications, manage Anaconda distribution instances, and launch notebooks.
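
The steps above can be read as a small lifecycle. The following Python sketch models that ordering as a state machine. It is only an illustration of the workflow described here, not the product's API: the `InstanceGroup` class, the state names, the transition table, and the rule that applications are accepted only while the group is started are all assumptions made for this example.

```python
from enum import Enum


class State(Enum):
    """Hypothetical lifecycle states; the product may name them differently."""
    CREATED = "created"
    DEPLOYED = "deployed"
    STARTED = "started"
    STOPPED = "stopped"
    REMOVED = "removed"


# Hypothetical transition table: action -> {state before: state after}.
TRANSITIONS = {
    "deploy": {State.CREATED: State.DEPLOYED},
    "start": {State.DEPLOYED: State.STARTED, State.STOPPED: State.STARTED},
    "stop": {State.STARTED: State.STOPPED},
    "remove": {
        State.CREATED: State.REMOVED,
        State.DEPLOYED: State.REMOVED,
        State.STOPPED: State.REMOVED,
    },
}


class InstanceGroup:
    """Toy model of one instance group tied to one Spark version package."""

    def __init__(self, name, spark_version):
        self.name = name
        self.spark_version = spark_version
        self.state = State.CREATED
        self.applications = []

    def apply(self, action):
        """Move to the next state, or fail if the action is out of order."""
        allowed = TRANSITIONS.get(action, {})
        if self.state not in allowed:
            raise ValueError(f"cannot {action} while {self.state.value}")
        self.state = allowed[self.state]
        return self.state

    def submit(self, application):
        """Record a Spark application (assumed to need a started group)."""
        if self.state is not State.STARTED:
            raise ValueError(f"cannot submit while {self.state.value}")
        self.applications.append(application)
```

For example, calling `apply("deploy")`, then `apply("start")`, then `submit("wordcount")` on a new `InstanceGroup` succeeds, while calling `submit` before `start` raises `ValueError`. Keeping the allowed transitions in one table makes the ordering constraints easy to inspect and change.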

Application instance workflow

IBM Spectrum Conductor provides resource management and service orchestration capabilities for a broad set of cloud-native application frameworks. As Spark applications are deployed to production environments, they increasingly require integration with other frameworks. IBM Spectrum Conductor provides generalized service control capabilities that support concurrent management of multiple instances and versions of application frameworks, such as Kafka, MongoDB, and Cassandra.

With IBM Spectrum Conductor, you can create application instances for long-running services that support your Spark applications by using an application template. Through application instances, you can enable these long-running services to share resources and coexist on the same infrastructure. You can also configure the application instance to include Dockerized services that run in Docker containers.

The following high-level diagram illustrates the basic tasks that are typically associated with creating and using application instances:

Diagram: tasks that are associated with application instances, namely creating an application template; creating service packages; creating resource groups; creating a consumer; adding service packages to the repository for an application instance; and registering, verifying, controlling, deploying, unregistering, and modifying application instances.

  1. Create an application template.
  2. Create the packages that are based on the application template.

    When you register an application instance, add the created packages to the service package repository and specify the consumers that you want to use.

    Alternatively, to define your own resource groups and consumers for use by multiple application instances, create the resource groups and consumers and add the packages to the repository ahead of time.

  3. Register the application instance and verify that it was registered correctly.
    • If the application instance includes packages, deploy the application instance and then manage it.
    • If the application instance does not include packages, start managing the instance immediately.
  4. If you want to update definitions for your application template, modify the application instance.
  5. If required, unregister the application instance.
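
The branching in step 3 can be sketched as a small planning function. This is a hedged illustration of the task ordering only, not a product interface: the function name `plan_application_instance`, its parameters, the step strings, and the example names `kafka-template` and `kafka-service` are all hypothetical. For simplicity, the sketch assumes that packages are added to the repository before registration.

```python
def plan_application_instance(template, packages=(), modify=False, unregister=False):
    """Return the ordered task list for one application instance.

    All names and step strings are illustrative; they do not correspond
    to commands or API calls in the product.
    """
    steps = [f"create application template {template}"]
    for package in packages:
        steps.append(f"create package {package}")
        steps.append(f"add package {package} to repository")
    steps.append("register application instance")
    steps.append("verify application instance")
    # An instance with packages must be deployed before it is managed;
    # an instance without packages can be managed immediately.
    if packages:
        steps.append("deploy application instance")
    steps.append("manage application instance")
    if modify:
        steps.append("modify application instance")
    if unregister:
        steps.append("unregister application instance")
    return steps


if __name__ == "__main__":
    plan = plan_application_instance(
        "kafka-template", ["kafka-service"], unregister=True
    )
    for step in plan:
        print(step)
```

Calling the function without packages yields a plan with no deploy step, which mirrors the second bullet under step 3.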