Prerequisites for an instance group

Depending on your requirements, you must meet certain prerequisites before you create an instance group.

Permissions prerequisites

  • You must be a cluster administrator, consumer administrator, or have the Instance Groups Configure permission.
  • Assign the appropriate write permissions for the deployment directories for your instance groups and notebooks, so that the instance group and notebook administrators and workload execution users can write to these directories.
  • All users who submit workload to the instance group must have access to the $EGO_TOP/jre directory. This requirement does not apply if you customized the JAVA_HOME configuration for an instance group.
  • When you use client computers, all users must have access to the hosts on which Spark workload runs.

    Spark workload runs on non-management hosts in your cluster. As a result, the Apache Spark UI and RESTful APIs that are available from Spark applications and the Spark history server must be accessible to your end users. This access is also required for any notebooks that you configure for use with IBM® Spectrum Conductor.

    If the hosts and ports that are used are not accessible from your client machines, you can encounter errors when you access notebooks and Spark UIs. The management hosts must also be able to access these hosts and ports. (A verification sketch follows this list.)
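The following sketch is a minimal, hedged example of checking these prerequisites from a client or management host; the deployment directory, installation directory, group name, user names, host name, and port are placeholders, not product defaults.

    #!/bin/sh
    # Placeholder values; substitute your own paths, group, users, hosts, and ports.
    DEPLOY_DIR=/var/ig/deploy            # instance group deployment directory
    EGO_TOP=/opt/ibm/spectrumcomputing   # cluster installation directory

    # Give the instance group's user group write access to the deployment directory.
    chgrp igadmins "$DEPLOY_DIR"
    chmod g+w "$DEPLOY_DIR"

    # Confirm that a workload submission user can read the bundled JRE
    # (not required if you customized JAVA_HOME for the instance group).
    sudo -u sparkuser test -r "$EGO_TOP/jre" && echo "JRE accessible"

    # Confirm that a Spark UI port on a compute host is reachable from this client.
    curl -s -o /dev/null -w "%{http_code}\n" http://computehost01:8080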

Execution user prerequisites

The instance group by default runs workload as its execution user. This user owns all directories and files that are created when the instance group is deployed. The owning group is, by default, the execution user's primary group.

To specify a valid execution user, you must meet the following prerequisites:
  • The execution user must have an operating system user account with write (w) permission to create the deployment directory on the hosts of the instance group.
  • If you change the default instance group consumers, the consumer execution user for each service component in the instance group is used to run workload for that service. For Spark workload specifically, the execution user for the Spark driver and executor consumer is used. If you customize the consumer execution users, the execution user for the driver and executor consumers must be the same.

    The execution user for the instance group and the execution users for the consumers in the instance group must belong to the same primary group: the consumer execution user for each instance group service must belong to the same OS user group and use the same group ID (GID) as the instance group execution user.

    If you add notebooks to the instance group, the notebook services run as the execution user of the consumer associated with that notebook type. These consumer execution users must also use the same GID as the instance group execution user.

  • When you set up the execution user on your hosts (both management and compute hosts), the user must use the same user ID (UID) and GID on all hosts. Otherwise, the instance group fails to start. (A verification sketch follows this list.)
  • If your installation supports root squash deployments, the cluster administrator OS user must be a member of the user group of every instance group execution user.
  • If you specify an administrator user group for an instance group or notebook deployment, the execution user of the instance group or notebook must be a member of that administrator user group. If this condition is not met, or if the specified administrator user group is not valid, the deployment fails.
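The following sketch shows one way to verify the UID and GID consistency described above; it assumes passwordless SSH to each host, and the user and host names are placeholders.

    #!/bin/sh
    EXEC_USER=sparkuser                  # your instance group execution user
    HOSTS="mgmt01 compute01 compute02"   # your management and compute hosts

    # The uid/gid pair printed by 'id' must be identical on every host;
    # otherwise, the instance group fails to start.
    for h in $HOSTS; do
        echo "$h: $(ssh "$h" id "$EXEC_USER")"
    done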

Resource request prerequisites

When an instance group requests resources for its applications, by default, it uses the credentials of the EGO user logged in to the cluster management console (or the user that submits the REST request if you use the API to create an instance group). If this user does not have access to the required resources, you can request resources as another user who has permission to access those resources.

To assign another user whose credentials are used to request resources, you must be a cluster administrator or have the Services Assign Impersonation (Any User), Users View, Consumers View, and Roles View permissions. Also, the user account that you specify must meet the following prerequisites:
  • The user account must exist. See Creating a user account.
  • The user account must have the Consumers View, My Activities, My Allocations, and My Clients permissions.

Notebook prerequisites

Notebooks provide a web-based interface for interactive queries and data manipulation. By default, you can use the built-in Jupyter notebook with an instance group.

To use notebooks with an instance group, you must meet the following prerequisites:
  • The notebook administrator must have an operating system user account with write (w) permission to create the deployment directory on the hosts of the instance group.
  • cURL 7.28.0 or higher must be installed on all hosts that run the notebooks.
  • Some functions of the Jupyter notebook require that the notebook workload execution user have an existing home directory. Ensure that this home directory exists before you run Jupyter notebook workload (a check is included in the sketch after this list).
  • If you have a local environment with a mixed cluster that includes both Linux and Linux on POWER hosts, the Jupyter notebook packages for Linux must be in a different resource group from the packages for Linux on POWER, because the packages are architecture-specific.
  • To use the built-in Jupyter notebook, the resource group for this notebook must contain only Linux® hosts.
  • Notebooks other than the built-in notebooks must be added to the cluster, along with any notebook packages.
  • To run notebooks inside Docker containers, a Dockerized notebook must be added to the cluster. The Docker image that you provide for the notebook must support the iproute package (which provides the ss utility). To Dockerize the Zeppelin or Jupyter notebook, the Docker image must also support cURL 7.28.0 or higher.
  • If an instance group uses different Anaconda or Miniconda distribution instances, the instances must use the same deployment directory, and the environments within them must have the same environment name.
For information on creating notebook packages and adding notebooks to the cluster, see Notebooks.
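The following sketch is a minimal per-host check of the cURL and home directory prerequisites; the notebook execution user name is a placeholder.

    #!/bin/sh
    NB_USER=nbuser   # your notebook workload execution user

    # Notebooks require cURL 7.28.0 or higher on every host that runs them.
    curl --version | head -1

    # Some Jupyter functions fail if the execution user has no home directory,
    # so confirm that it exists before you run Jupyter notebook workload.
    HOME_DIR=$(getent passwd "$NB_USER" | cut -d: -f6)
    if [ -d "$HOME_DIR" ]; then
        echo "home directory $HOME_DIR exists"
    else
        echo "create $HOME_DIR before running Jupyter workload"
    fi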

Consumer prerequisites

Each instance group is assigned a top-level consumer. When the instance group is created, new consumers are created by default under this top-level consumer for the core components of the instance group: Spark drivers, executors, and the batch master. Depending on your configuration, consumers might also be created for the shuffle, notebook, and history services.

The default top-level consumer is a consumer with the same name as the instance group (for example, if your instance group name is ABC, then the default top-level consumer is /ABC). The top-level consumer represents the entire cluster and all its resources. The default resource group for an instance group is the ComputeHosts resource group.
Remember: To create an instance group in LSF mode, your cluster must have the ManagementHosts and ComputeHosts resource groups defined.
To change the default consumer, you must meet the following prerequisites:
  • The consumer or its parent must be associated with resource groups (or resource plans for multidimensional scheduling).
  • To use different consumers for each component in the instance group, the execution users for those consumers must belong to the primary group of the instance group execution user.
  • If you want to use different resource groups for Spark executors and the Spark shuffle service, both resource groups must use the same host list. Otherwise, applications that are submitted to the instance group fail.
  • Dedicated resource groups are recommended for the Spark batch master service and the Spark notebook master service. If required, create new resource groups so that you can specify different resources for each of these components. This setup helps avoid resource competition within the instance group.
If you choose to use multidimensional scheduling, you create multidimensional resource plans, each of which is associated with one or more resource groups. The multidimensional resource plan that you select must meet all these requirements.

Multidimensional scheduling prerequisites

When an instance group is created, its allocations are based on the resource distribution plan for slot-based scheduling.

To use multidimensional scheduling, where each resource allocation can request different amounts of physical resource types such as CPU, memory, and number of disks, you must create multidimensional resource plans. The multidimensional resource plan must be configured for slot mapping.

To create resource plans for multidimensional scheduling, see Flow to configure multidimensional scheduling.

Docker prerequisites

You can enable Spark drivers, executors, and services in an instance group to run within Docker containers.

To Dockerize the instance group, you must meet the following prerequisites:
  • You can use Docker only with certain Spark versions. Spark 1.5.2 and 3.0.0 are not supported.
  • Docker must be installed on a subset of your compute hosts. For a list of supported Docker versions, see Supported Docker versions.

    When the Docker daemon is running on a host, the host is considered a Docker active host.

  • If you customized the JAVA_HOME setting to use the operating system Java, the destinations of soft link files in the JRE lib directory must be mounted in the Docker service.
  • A suitable Docker image for the instance group must be available.
    Note: IBM Spectrum Conductor does not provide Docker images. While you can use default Docker images (such as ubuntu), you use Docker images at your own risk, and they must meet the following requirements:
    • The Docker image must be compatible with the Docker version that is installed on your hosts to avoid unexpected Docker issues.
    • The Docker image must have OpenSSL 1.0.1 or higher installed.
    • The Docker image must have the net-tools package.
    • If you want to Dockerize a notebook, the Docker image must support the iproute package (which provides the ss utility). To Dockerize the Zeppelin or Jupyter notebook, the Docker image must also support cURL 7.28.0 or higher.

    If you provide your own Docker image from a local directory, you must load the Docker image by using the docker load command. Ensure that you load the Docker image to all hosts on which you want the Docker container to run.

    You can upload your Docker image as an instance group package. In this case, the Docker image is deployed to Docker hosts when you deploy the instance group, rather than when you start it. You can place the Docker image in a package, which depending on the Docker operation you use, loads or imports the image from a .tar file. Alternatively, you can use a package install script to pull the image from a source. Find Docker images through a Docker registry (for example, the public Docker registry at https://hub.docker.com/).
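The following sketch shows loading a locally supplied image with the docker load command and spot-checking the image requirements above; the image archive and tag are placeholders.

    #!/bin/sh
    IMAGE_TAR=my-notebook-image.tar    # placeholder image archive
    IMAGE=my-notebook-image:latest     # placeholder image tag

    # Load the image; repeat on every host that runs the Docker container.
    docker load -i "$IMAGE_TAR"

    # Spot-check the in-image requirements.
    docker run --rm "$IMAGE" openssl version           # OpenSSL 1.0.1 or higher
    docker run --rm "$IMAGE" ifconfig -a > /dev/null \
        && echo "net-tools present"                    # net-tools package
    docker run --rm "$IMAGE" ss -V                     # ss utility (iproute)
    docker run --rm "$IMAGE" curl --version | head -1  # cURL 7.28.0 or higher (notebooks)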

Dependent package prerequisites

If applications that run in the instance group require dependent packages, you must create those packages and upload them to the instance group.

When you upload a package during the process of creating an instance group, the package is added to the repository when the instance group is deployed. To reuse this package with other instance groups, you can select it from the repository when you create those instance groups. Note, however, that the package name is tied to the name of the instance group with which it was first created. Alternatively, you can upload a dependent package directly to the package repository and select the package when you create an instance group. A minimal packaging sketch follows.
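The following sketch illustrates one way to build a dependent package, assuming that the package is a gzipped tarball of the files your applications need; the file names are examples only.

    #!/bin/sh
    # Stage the dependency files (example names).
    mkdir -p mypkg/lib
    cp mylib.jar mypkg/lib/

    # Create the archive that you upload to the instance group or directly
    # to the package repository.
    tar -czf mypkg.tar.gz -C mypkg .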


GPU prerequisites

To allocate GPU resources for applications in the instance group, you must enable instance groups to use GPU resources and meet other prerequisites. For more information, see GPUs.

Dask on GPU support is provided by conda packages that are developed by RAPIDS. For RAPIDS prerequisites, see https://rapids.ai/start.html.