Docker overview

Docker is used to build, ship, and run distributed applications. You can configure your instance group to include Dockerized services. For the supported versions of Docker with IBM® Spectrum Conductor, see Supported Docker versions.

A Dockerized service contains a Docker pod. A Docker pod can consist of one or more collocated containers. For more information on Docker, see What is Docker.

Before you can configure Dockerized services

Docker must be installed and running on a subset of compute hosts. The Docker daemon must be listening at the default Linux socket unix:///var/run/docker.sock.
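
As a quick pre-check on a compute host, you might confirm that something is accepting connections on the default socket. The following is a minimal sketch, not part of IBM Spectrum Conductor; the function name docker_socket_is_listening is hypothetical, and the check only shows that a process accepts connections on the socket, not that the process is a supported Docker version.

```python
import os
import socket
import stat

# Path portion of the default endpoint unix:///var/run/docker.sock
DEFAULT_DOCKER_SOCKET = "/var/run/docker.sock"


def docker_socket_is_listening(path=DEFAULT_DOCKER_SOCKET, timeout=2.0):
    """Return True if a process accepts connections on the Unix socket at path."""
    try:
        # The path must exist and must be a socket, not a regular file.
        if not stat.S_ISSOCK(os.stat(path).st_mode):
            return False
    except OSError:
        return False

    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return True
    except OSError:
        # Covers refused connections, timeouts, and permission errors.
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    print("listening" if docker_socket_is_listening() else "not listening")
```

A False result can also mean that the current user lacks permission to open the socket, so run the check as a user that is allowed to talk to the Docker daemon.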

The Docker controller launches and controls groups of tightly coupled Docker containers, called pods, by providing pod management logic and abstracting communication with the Docker daemon. You can find the Docker controller source code and README file at $EGO_SERVERDIR/egodocker.

Compute hosts that have a supported version of Docker installed and the Docker daemon running are considered Docker active hosts. Docker active hosts can process both Docker and non-Docker workloads. To show the status of Docker hosts, use the docker_active resource attribute with the egosh resource list command. For more information, see Viewing Docker host status.
Note: Docker active host status is a dynamic property. Docker should generally be running on the Docker hosts; the system determines whether the Docker daemon is available on each host, and therefore which hosts are Docker active hosts.

Configuring Dockerized services when you develop your instance group

When you develop an instance group and want to include Dockerized services, configure Docker in the service profile. For a list of Docker parameters and attributes, see the Service profile reference.

Docker image requirements and limitations

Using an instance group package, you can deploy the Docker image to your Docker hosts when you deploy the instance group, rather than when you start it. You can either put the Docker image into a package, which loads or imports the image from a .tar file (depending on the Docker operation that you use), or use a package install script to pull the image from a source. You can find Docker images through a Docker registry (for example, the public Docker registry at https://hub.docker.com/).

For more information, see Service packages.
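
The three ways of getting an image onto a host that are described above correspond to the standard docker load, docker import, and docker pull commands. The following sketch shows how a package install script might choose between them. The function image_deploy_command and the example paths and image names are hypothetical; how IBM Spectrum Conductor invokes these operations internally is not shown here.

```python
import subprocess


def image_deploy_command(source, operation="load", name=None):
    """Build the docker CLI argument list for one image deployment operation.

    load   - restore an image archive, such as one written by `docker save`
    import - create an image from a filesystem tarball, such as one
             written by `docker export`
    pull   - fetch an image by name from a registry
    """
    if operation == "load":
        return ["docker", "load", "--input", source]
    if operation == "import":
        command = ["docker", "import", source]
        if name:
            # Optional REPOSITORY[:TAG] to assign to the imported image.
            command.append(name)
        return command
    if operation == "pull":
        return ["docker", "pull", source]
    raise ValueError("unsupported operation: %r" % (operation,))


def deploy_image(source, operation="load", name=None):
    """Run the command and raise CalledProcessError if docker reports failure."""
    subprocess.run(image_deploy_command(source, operation, name), check=True)
```

For example, image_deploy_command("/tmp/myimage.tar") yields the arguments for docker load --input /tmp/myimage.tar. Run such a script at deployment time only, so that the image is already present on the Docker hosts when the instance group starts.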

The following requirements apply to Docker images for specific operations in IBM Spectrum Conductor:
  • Within an IBM Spectrum Conductor environment, Docker does not support root squashed installations and root squash instance group deployments.
  • You must provide the Docker image for the Spark instance group. IBM Spectrum Conductor does not provide Docker images. Any Docker image that you provide is used at your own risk.
  • The Docker image must be compatible with the Docker version that is installed on your hosts to avoid unexpected Docker issues.
  • The Docker image must have the net-tools package and OpenSSL 1.0.1 or higher installed.
  • If you want to Dockerize a notebook, the Docker image must support the SS utility library. To Dockerize the Zeppelin or Jupyter notebook, the Docker image must also support cURL 7.28.0.
  • If Kerberos is installed on the instance group hosts, you must complete the following steps:
    1. Install Kerberos on the Docker image.
    2. Configure the krb5.conf file so that it matches the krb5.conf that is on the primary host.
    3. Ensure that the kinit command is located in the same directory that is specified by KINITDIR in $EGO_TOP/kernel/conf/sec_ego_gsskrb.conf on the primary host.
Note: If you enter a registry URL into the registry URL field when your system is not connected to the Internet, the Docker controller will hang and eventually time out while trying to pull the Docker image from the online registry. Reestablish an Internet connection before entering the registry URL.
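
To verify the version requirements in the list above before you package an image, you can compare the version strings that the tools in the image report against the stated versions. The following is a minimal sketch; the function names are hypothetical, the input strings are assumed to be the output of openssl version and curl --version run inside the image, and treating cURL 7.28.0 as a minimum is an assumption. Letter suffixes such as the "k" in 1.0.2k are ignored.

```python
import re


def parse_version(text):
    """Extract the first dotted numeric version in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % (text,))
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(reported, minimum):
    """Return True if the reported version is at least the minimum version."""
    return parse_version(reported) >= parse_version(minimum)


if __name__ == "__main__":
    # Sample strings in the format that the tools typically print.
    print(meets_minimum("OpenSSL 1.0.2k-fips  26 Jan 2017", "1.0.1"))
    print(meets_minimum("curl 7.29.0 (x86_64-redhat-linux-gnu)", "7.28.0"))
```

The comparison is done on integer tuples rather than on strings, so that a version such as 1.0.10 is correctly treated as higher than 1.0.9.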

After Docker is installed and the instance group is deployed

Once you have Docker installed and the instance group is deployed, you can define, start, stop, and remove services with one or more Docker containers on all Docker active hosts. EGO passes the user-defined Docker container information from the EGO service controller (egosc) to the EGO kernel daemon (vemkd). The process execution daemon (pem) runs the Docker controller that controls all of the Docker containers that are defined for a service instance. The Docker controller uses a remote API client module to interact with the Docker daemon.

When a service with Docker containers is started, a service instance is created on one Docker host. A service instance with Docker containers is also known as a Docker pod instance. In a service instance, there can be one or more collocated Docker containers, running on the same host.

If a job monitor is not defined, the service instance enters the RUN state after the Docker controller successfully starts the Docker pod, and enters the FINISH state after the Docker controller shuts down the Docker pod.

Pod monitoring takes effect after pod startup is complete. By default, the Docker controller logs an error and initiates pod shutdown if any container stops running, unless the container is configured as short-running and has completed its job successfully. If a job monitor script is defined for the service, the Docker controller passes pod monitoring responsibility to the job monitor. In this case, the Docker controller attempts to shut down the pod only if all working containers in the pod stop running. For details on pod monitoring, see the pod monitoring section in the README file in $EGO_SERVERDIR/egodocker. For more information, and for a Docker job monitor script to print out the container names and states, see Job monitor.
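
The shutdown rule described above can be summarized as a small decision function. The following sketch is an illustration only and is not the Docker controller's code. The ContainerStatus type and the function name are hypothetical, an exit code of 0 is assumed to mean that a short-running container completed successfully, and every container in the pod is assumed to count as a working container.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ContainerStatus:
    name: str
    running: bool
    short_running: bool = False      # configured as a short-running container
    exit_code: Optional[int] = None  # set once the container has stopped


def should_shut_down_pod(containers: List[ContainerStatus],
                         job_monitor_defined: bool) -> bool:
    """Decide whether the pod should be shut down, per the rules above."""
    if job_monitor_defined:
        # The job monitor owns pod monitoring; shut down only when
        # every container in the pod has stopped.
        return bool(containers) and all(not c.running for c in containers)

    # Default behavior: any stopped container triggers shutdown, unless it
    # is short-running and finished successfully (assumed: exit code 0).
    for container in containers:
        if container.running:
            continue
        if container.short_running and container.exit_code == 0:
            continue
        return True
    return False
```

For example, a pod with one running service container and one short-running setup container that exited with code 0 is left running in both modes.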

Important: Take note of the following conditions when using Docker:
  • It is your responsibility to install, activate, and maintain the Docker daemon and client. The Docker controller and its dependencies are shipped with IBM Spectrum Conductor and installed with the product. The exception is Python, which you must install and maintain on Docker hosts.
  • Avoid using the Docker daemon directly to apply container operations, such as run and stop, on a host in an EGO cluster. IBM Spectrum Conductor is not aware of such operations or their results, which can lead to unexpected behavior. Instead, follow the instructions for creating Dockerized services to configure Docker properly for IBM Spectrum Conductor.

Environment variables for Docker containers

Environment variables can be specified as key-value pairs in the Docker pod specifications in IBM Spectrum Conductor. For more information, see Environment variables for Docker containers.
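
As an illustration of the key-value form, the following sketch converts a mapping of variables into the KEY=VALUE strings that the Docker Engine API accepts in the Env field of a container configuration. The function name to_docker_env and the sample variables are hypothetical, and the sketch does not show the pod specification syntax that IBM Spectrum Conductor uses; see the linked topic for that syntax.

```python
import re

# Conventional shell-style variable names: letters, digits, and underscores,
# not starting with a digit.
_ENV_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def to_docker_env(pairs):
    """Convert a dict of environment variables to a list of KEY=VALUE strings."""
    env = []
    for key, value in pairs.items():
        if not _ENV_NAME.match(key):
            raise ValueError("invalid environment variable name: %r" % (key,))
        env.append("%s=%s" % (key, value))
    return env


if __name__ == "__main__":
    print(to_docker_env({"SPARK_HOME": "/opt/spark", "LOG_LEVEL": "INFO"}))
```

Validating names up front catches typos such as a key that contains a space or an equals sign before the container is created.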