Docker containers are nothing short of a game changer! Their rapid, widespread adoption shows how powerful and useful they can be. They allow users to deploy applications in any environment (e.g., cloud, cluster, virtualized, bare metal) much faster and more efficiently than virtual machines. IBM has just released the publicly available Container Cloud, a Docker-based technology on Bluemix that enables users to run Docker images and applications on bare metal hardware. The release of this technology brings a new set of interesting problems, which keeps IBM Researchers (like me) challenged and motivated!
As more companies look to adopt this technology, a pressing need arises to run these containers across multiple, distributed infrastructures. For example, a single-site solution cannot:
- Accommodate cloud-bursting (scaling out when on-premises resources are exhausted)
- Provide resilience against zone or data center outages
- Ensure privacy (keeping private data in house while running other services elsewhere)
- Allow for in-situ data analysis (minimizing data transfers and keeping container services near data sources)
In addition, relying on a single provider can lead to "provider lock-in" and restrict users from taking advantage of a competitive pricing market. Current efforts focus on deploying containers at a single site/cloud level (e.g., Kubernetes, Mesos and Marathon, Amazon's EC2 Container Service, and IBM Bluemix). We here at IBM Research believe that deploying containers across federated infrastructures is the future, and we are not alone in seeing the potential of this concept. Google has proposed a similar idea in the Ubernetes project, an extension to Kubernetes that would enable the deployment of containers across Kubernetes clusters. The concept is interesting and takes advantage of Kubernetes's power, but it is currently still at the proposal stage.
Our summer intern Moustafa AbdelBaky (a PhD candidate at the Rutgers Discovery Informatics Institute at Rutgers University) was able to turn the concept of deploying containers across multiple sites/cloud levels into a reality. He successfully designed and implemented the first platform of its kind that enables the deployment of Docker containers across multiple clouds and hybrid infrastructure!
C-Ports (pronounced seaports), as we call it, has already been demonstrated to effectively deploy containers across 5 clouds (Bluemix, Amazon AWS, Google Cloud, Chameleon, and FutureSystems) and 2 clusters (one at IBM and another at Rutgers University) in order to create a dynamic federation. Additionally, C-Ports is not tied to a specific container scheduler, i.e., it can work with any local container scheduler, such as Kubernetes or Bluemix, or directly deploy containers on the given resource/cloud, thereby increasing its portability and flexibility. In the rest of this blog, we will present C-Ports while highlighting the challenges associated with running containers in a multi-cloud/multi-datacenter environment.
The research challenges associated with achieving this vision fall into three categories: i) resource discovery and coordination, ii) container scheduling and placement, and iii) dynamic adaptation. We address these challenges by separating resource selection from container placement. Moustafa and I worked together to develop a constraint-programming model that can be used for dynamic resource discovery and selection. This gives both users and resource providers better control over the selection process and lets them define fine-grained workload objectives and requirements. For example, the constraints we have been able to accommodate include availability, cost, performance, security, and power consumption. Constraints can also be added or removed easily without modifying the rest of the system. Once the resources that satisfy all constraints at a given time are selected, they are exposed to the scheduler, which then decides where containers should be deployed. The entire process is repeated continuously to allow for dynamic adaptation.
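To make this separation concrete, here is a minimal sketch of constraint-based resource selection. This is not the actual C-Ports solver; the resource names, attribute values, and thresholds below are invented for illustration.

```python
# Minimal sketch of constraint-based resource selection, not the actual
# C-Ports solver; resource names, attributes, and thresholds are invented.

# Each resource advertises its currently monitored properties.
resources = {
    "local-cluster": {"utilization": 85, "cost": 0.0},
    "bluemix":       {"utilization": 10, "cost": 0.08},
    "aws":           {"utilization": 20, "cost": 0.12},
}

# Constraints are plain predicates, so they can be added or removed
# without touching the selection logic.
constraints = [
    lambda r: r["utilization"] < 80,   # availability
    lambda r: r["cost"] < 0.1,         # cost ceiling, $/hour
]

def select(resources, constraints):
    """Expose to the scheduler only the resources that satisfy every
    active constraint at this moment."""
    return [name for name, props in resources.items()
            if all(c(props) for c in constraints)]

print(select(resources, constraints))   # ['bluemix']
```

Because each constraint is an independent predicate, dropping one or appending a new one (say, a security requirement) changes the selection without any change to `select` itself.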
Powered by CometCloud
C-Ports was built on top of the open source project CometCloud. CometCloud is software designed and developed at Rutgers University that enables software-defined federation of cyberinfrastructure. This summer, Moustafa extended his previous work on CometCloud to allow for the use of Docker containers. The overall architecture of CometCloud is shown below:
Central to the CometCloud architecture, and the key mechanism used to coordinate the different aspects of federation and application execution, is the Comet coordination space. Specifically, we define two types of spaces. First, a single federation management space is used for creating the actual federation and orchestrating the different resources. This space exchanges the operational messages used to discover resources, announce changes at a site, route users' requests to appropriate sites, and initiate negotiations to create ad hoc execution spaces. Second, multiple shared execution spaces are created on demand to satisfy the computational needs of applications. A federation agent creates and controls each shared execution space and coordinates the resources that execute a particular set of tasks on that site. An agent can act as master of the execution or delegate this duty to a dedicated master (M) when more complex workflows are executed. Agents also deploy workers to perform the actual execution of tasks. These workers can run in a trusted network, where they are part of the shared execution space and can store data, or on external resources in a non-trusted network. The first type is called a secure worker (W) and can pull tasks directly from the space. The second type is called an isolated worker (IW) and cannot interact with the shared space directly; instead, isolated workers depend on a proxy (P) and a request handler (R) to obtain tasks from the space.
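The coordination-space pattern can be sketched as a toy in-memory space with put/take semantics and proxy-mediated access for isolated workers. This is a simplification for exposition only, not CometCloud's actual API.

```python
# Toy illustration of a coordination space: masters put tasks in, secure
# workers take them directly, isolated workers go through a proxy.
# This is a simplification for exposition, not CometCloud's actual API.
from collections import deque

class CoordinationSpace:
    """A shared space into which masters insert tasks and from which
    workers pull them."""
    def __init__(self):
        self.tasks = deque()

    def put(self, task):
        self.tasks.append(task)

    def take(self):
        return self.tasks.popleft() if self.tasks else None

class Proxy:
    """Isolated workers (IW) cannot touch the space directly; the proxy
    (standing in here for both P and the request handler R) mediates."""
    def __init__(self, space):
        self.space = space

    def request_task(self):
        return self.space.take()

space = CoordinationSpace()
space.put({"container": "web", "site": "bluemix"})
space.put({"container": "db", "site": "aws"})

secure_worker_task = space.take()                    # W: direct access
isolated_worker_task = Proxy(space).request_task()   # IW: via the proxy
print(secure_worker_task["container"], isolated_worker_task["container"])
```

The indirection through the proxy is what lets untrusted external resources participate in the federation without ever holding a direct reference to the shared space.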
Architecture of C-Ports
The overall architecture of C-Ports is shown below. The constraint programming solver is responsible for enforcing constraints and selecting resources that meet these constraints. The scheduler is responsible for deploying the current workload of containers on a set or subset of available resources based on different optimization objectives.
We can use any scheduler for the actual placement of containers; for example, we can use the hybrid cloud placement algorithm that I published a while ago. The solver is exposed through a RESTful API to allow easy integration with different schedulers. The scheduler communicates with CometCloud through the CometCloud Web Service, which provides a second RESTful API for operating the federation. The CometCloud Federation Execution Engine manages the different CometCloud agents, and the Container Deployment Enactor is composed of CometCloud workers that can deploy containers in different environments. Workers can deploy containers directly on any allocated resource or interact with the local scheduler to submit container jobs.
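As a rough illustration of the placement step, the sketch below shows a toy cost-minimizing scheduler choosing among the resources the solver reported. The resource list and hourly prices are made up, and a real scheduler (such as the hybrid cloud placement algorithm mentioned above) would be far richer.

```python
# Toy cost-minimizing scheduler standing in for the placement step;
# the resource list and hourly prices below are invented.

available = {"local-cluster": 0.0, "bluemix": 0.08}  # $/hour, per the solver

def place(containers, available):
    """Deploy every container on the cheapest resource that the solver
    reported as satisfying all constraints."""
    cheapest = min(available, key=available.get)
    return {c: cheapest for c in containers}

placement = place(["web", "db", "cache"], available)
print(placement)   # every container lands on the free local cluster
```

Swapping in a different objective (throughput, load balancing, locality) only changes `place`; the solver's RESTful contract with the scheduler stays the same.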
Putting it all together
To demonstrate the effectiveness of this approach, let us take two use cases and see how they would work in C-Ports. In the first use case, consider cloud-bursting: a user would like to deploy containers on a local cluster as long as its utilization is below a certain threshold (e.g., 80%) and burst to a cloud (e.g., IBM Bluemix) when that threshold is exceeded. The constraint given to the solver is (utilization < 80). The actual utilization of the local cluster is continuously monitored and supplied to the solver, while the capacity of the cloud is assumed to be infinite. Therefore, as long as the utilization of the local cluster is below 80%, the solver returns both the local cluster and Bluemix as available resources. Given this information, the scheduler (optimizing for cost and considering the local cluster's cost to be zero) picks the local cluster to deploy containers. The scheduler notifies the CometCloud Federation Engine, which starts an agent on the local cluster and deploys containers there. As soon as the utilization of the system exceeds 80%, the solver returns only Bluemix in the set of available resources. The scheduler then notifies the federation engine to stop using the local cluster and to use Bluemix instead. The engine disables the local cluster agent, terminates any running containers there, starts an agent on Bluemix, and begins deploying containers there. Note that any containers that have not yet been deployed are automatically moved to Bluemix.
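The cloud-bursting decision loop can be sketched as follows; the threshold, resource names, and monitored utilization values are illustrative, not taken from the real system.

```python
# Sketch of the cloud-bursting loop from the first use case; threshold,
# resource names, and utilization readings are illustrative only.

THRESHOLD = 80  # percent utilization on the local cluster

def solve(cluster_utilization):
    """Solver: the cloud's capacity is treated as unbounded, so Bluemix
    always satisfies the constraint; the cluster only while below 80%."""
    available = ["bluemix"]
    if cluster_utilization < THRESHOLD:
        available.insert(0, "local-cluster")
    return available

def schedule(available):
    """Scheduler optimizing for cost: the local cluster costs nothing,
    so prefer it whenever the solver reports it as available."""
    return "local-cluster" if "local-cluster" in available else "bluemix"

# As monitored utilization rises past the threshold, we burst to the cloud.
for util in (40, 75, 85):
    print(util, "->", schedule(solve(util)))
```

Running the loop shows the handoff: at 40% and 75% utilization containers go to the local cluster, and at 85% the scheduler switches to Bluemix.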
In the second use case, consider a scenario where the user would like to run across multiple clouds and datacenters. The selection criterion for clouds is cost per hour, which can be static (e.g., reserved price) or dynamic (e.g., spot price), whereas the selection criterion for datacenters is a power consumption threshold (e.g., 60%). The objective of the scheduler in this case is to maximize throughput (i.e., deploy as many containers as possible). The constraints given to the solver are (cost < 0.1 and power < 60). The actual cost of the different clouds and the power consumption rate of the datacenters are continuously monitored and provided to the solver. The solver constantly evaluates the constraints against the properties of the available systems and reports to the scheduler which resources satisfy them. Note that the constraints can be altered at any time during execution (e.g., the cost constraint can be changed to cost < 0.2). As in the first use case, the scheduler notifies the federation engine to start/stop agents on the proper resources, and the containers are deployed accordingly.
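This second use case, including altering a constraint mid-run, can be sketched like so; all system names, prices, and power readings here are invented.

```python
# Sketch of the second use case: clouds are filtered by price, datacenters
# by power draw, and a constraint can be relaxed mid-run. All names,
# prices, and power readings are invented for illustration.

systems = {
    "aws":    {"type": "cloud", "cost": 0.12},       # $/hour
    "google": {"type": "cloud", "cost": 0.09},
    "dc-1":   {"type": "datacenter", "power": 70},   # % of power budget
    "dc-2":   {"type": "datacenter", "power": 45},
}

def satisfies(props, cost_limit=0.1, power_limit=60):
    """Apply the selection criterion that matches the resource type."""
    if props["type"] == "cloud":
        return props["cost"] < cost_limit
    return props["power"] < power_limit

matches = [n for n, p in systems.items() if satisfies(p)]
print(matches)    # ['google', 'dc-2']

# The cost constraint is relaxed during execution (cost < 0.2).
relaxed = [n for n, p in systems.items() if satisfies(p, cost_limit=0.2)]
print(relaxed)    # ['aws', 'google', 'dc-2']
```

Relaxing the cost limit immediately widens the set the scheduler can use, without restarting anything, which is the point of re-evaluating constraints continuously.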
In conclusion, we have presented our approach for deploying containers in a federated environment. C-Ports is an initial prototype, and there is a lot more work to be done! For example, we plan to extend this work to support container migration and traffic and data rerouting. We hope C-Ports opens the door to community collaboration, with CometCloud as its underlying open source foundation. This project has been an exciting and insightful experience, and we hope it generates a much-needed discussion and drives the state of the art forward.
—Merve and Moustafa
We would like to thank Dr. Malgorzata Steinder from IBM Research and Prof. Manish Parashar from Rutgers Discovery Informatics Institute for their contribution. We also would like to thank the CometCloud team for their support.