An “operator pattern” experience report
IBM Cloud Databases, the recently-launched, next-generation DBaaS platform on IBM Cloud, has been built from lessons learned running tens of thousands of databases worldwide on IBM Compose. IBM Cloud Databases has been architected top-to-bottom as a set of “operators” running in IBM Cloud Kubernetes Service clusters. This document is a report of the experience of building a production-quality platform for managing stateful distributed systems using the so-called operator pattern.
IBM Cloud Databases
IBM Cloud Databases (ICD) is a fully-managed collection of open-source databases available through consistent consumption, pricing, and interaction models. While each database is delivered and consumed independently of the others in the portfolio, the platform is built and managed by a single team, which is key to delivering a consistent experience across a diverse set of databases. This is not the place for an in-depth overview of the ICD product; to learn more, see the IBM Cloud Databases documentation.
The IBM Cloud Databases data plane
Abstractly, the functional requirements for the ICD data plane are simple: deploy X instances of Y open-source databases across P hosts in an always-on, always-secure fashion. The ICD data plane must manage the full lifecycle of these databases—deployment, backup, update, failover, etc.
The ICD architects decided early on to design the system to run on Kubernetes, turning this abstract set of requirements into a concrete one: create and manage a gaggle of Kubernetes resources (e.g., StatefulSets, Services, and Secrets).
Some readers familiar with Kubernetes may be tempted to reach for Helm here—after all, Helm is designed for managing Kubernetes resources—but anyone with Helm experience will know that it struggles when it comes to managing the lifecycle of those resources. Trying to use Helm to solve these lifecycle problems would be akin to writing a database in BASIC: it could certainly be done, but not without productivity-crushing impedance mismatches between the abstractions.
Instead, ICD goes “direct,” managing Kubernetes resources directly by communicating with the Kubernetes API. This provides far more control over the lifecycle of the resources under our purview, which is ultimately critical for the success of our platform.
The decision to interact directly with the Kubernetes API does not logically necessitate the operator pattern, however. At this point, it would have been perfectly reasonable to build a traditional RPC-style architecture to solve our problems. Operators clearly meshed nicely with our meta-goals though (e.g., building and maintaining less software), so we decided to take the plunge during initial prototyping.
A reader unfamiliar with this “operator pattern” may at this point be wondering what all the fuss is about, so let’s briefly review.
What is an Operator?
The term “Operator” originated as a bit of software marketing from the team at CoreOS. It’s a useful term to capture a particular pattern, but it can be misleading. There’s nothing really novel in an “Operator.” Resources (i.e., system specifications) and controllers (i.e., control loops) are standard ideas in industrial control systems, and they’re fundamental to how Kubernetes works. The term “Operator” really encompasses just that—standard patterns from Kubernetes itself, exposed to users through built-in mechanisms, wrapped up in a catchy name.
Resources and controllers
Most Kubernetes users know what a resource is—Pods, Services, and Deployments are all resources. The state of a Kubernetes cluster is fully defined by the state of the resources it contains. A controller is a process that reifies a resource; that is, given a particular resource, the controller responsible for it continually works to make the state of “the world” (i.e., the Kubernetes cluster) match the state declared in the resource.
If a resource changes, its controller, in turn, will mutate “the world” to reflect those changes. Most Kubernetes resources work this way. The Deployment controller, for example, is responsible for ensuring that a Deployment created by a user results in the desired number of containers running a particular image in the cluster.
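To make the reconcile idea concrete, here is a minimal, self-contained sketch in Go. The types and names are illustrative only—they are not Kubernetes APIs—but the shape is the same: a resource declares desired state, and a controller repeatedly converges the world toward it.

```go
package main

import "fmt"

// Resource declares desired state: how many replicas of an image should run.
type Resource struct {
	Name     string
	Image    string
	Replicas int
}

// World is a stand-in for the cluster: a count of running containers per resource.
type World map[string]int

// Reconcile drives the world toward the resource's declared state.
// Real controllers run this continually, reacting to change events.
func Reconcile(r Resource, w World) {
	for w[r.Name] < r.Replicas {
		w[r.Name]++ // start a container
	}
	for w[r.Name] > r.Replicas {
		w[r.Name]-- // stop a container
	}
}

func main() {
	world := World{}
	res := Resource{Name: "db-1", Image: "postgres:11", Replicas: 3}

	Reconcile(res, world)
	fmt.Println(world["db-1"]) // 3

	res.Replicas = 1 // the user updates the resource...
	Reconcile(res, world)
	fmt.Println(world["db-1"]) // ...and the controller converges: 1
}
```

Note that Reconcile never cares how the world got into its current state; it only computes the difference between declared and actual. That property is what makes reconciliation safe to re-run at any time.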
In addition to the “traditional” resources referenced above, Kubernetes supports a “special” resource—CustomResourceDefinition (CRD)—that allows users to extend the Kubernetes API with objects from their application domain. With a CRD in place, users gain access to a significant subset of Kubernetes API functionality, such as CRUD, RBAC, lifecycle hooks, and garbage collection.
The operator pattern pairs a CRD with a custom controller, thereby mimicking the architecture of Kubernetes itself. Clients interact with the system by CRUDing custom resources; the application “watches” for these interactions and takes action against the system.
ICD data plane operators
The IBM Cloud Databases (ICD) data plane is 100% “operators.” There’s no database in the data plane; all system state is stored in the Kubernetes-managed etcd. There’s no direct API in the data plane; all interaction works through CRUDing custom resources in the Kubernetes API.
ICD does, of course, expose an API to customers, and it does not, of course, allow customers direct access to Kubernetes. This interaction exists within the ICD control plane, which does a “translation” from imperative/RPC interactions with the outside world to declarative/CRUD interactions with the data plane’s Kubernetes clusters.
Resource and controller diversity
The ICD data plane is currently composed of six custom resources and ten custom controllers. Some of the resources have no controller—they exist to manage application state. Some controllers watch “upstream” resources that ICD doesn’t own. This is an important design point. The “operator pattern” moniker is an oversimplification; traditional software engineering practices (e.g., the single responsibility principle) still apply here. Implementing application logic in a resource + controller architecture isn’t always going to result in a simple pairing between the two. This is true, too, for Kubernetes itself—many of the upstream controllers watch several resources, including resources they don’t “own” directly. The Deployment controller is, again, exemplary here.
Furthermore, the behavior implemented by the controllers is diverse. The simplest sort of controller is one that exists as part of a graph of Kubernetes resources. The application logic in the controller translates CRUD operations on its resources to CRUD operations on other resources, which perhaps result in CRUD operations on other resources, and so on. Some ICD controllers behave this way, but most do not. For example, ICD includes controllers responsible for translating resource CRUD into host-level cgroup mutations and managing the lifecycle of Cloud Object Storage buckets and their access policies.
Again, it’s important to think about decomposition when designing an application built with the “operator pattern.” Doing so effectively requires a solid grasp of both Kubernetes fundamentals and a declarative model of thinking. The vast majority of software engineers, unfortunately, have neither.
IBM Cloud Databases operators: what has worked well
The constraints of the operator pattern and the declarative model have forced the team to think carefully about how the layers of the ICD platform are decomposed and how responsibilities are distributed amongst them. This has resulted in a more robust set of abstractions than would have been realized without these constraints. In other words, it’s a lot harder to accidentally build a ball of mud when there are guard rails around how your system must operate.
Role-based access controls
Kubernetes ships with a sophisticated role-based access control mechanism that, on its own, is a powerful tool. When application logic is embedded in Kubernetes via resources and controllers, it’s invaluable. ICD administrators have fine-grained control over nearly every aspect of the running system. This is a tremendous benefit for a service that takes runtime security seriously. Replicating this level of access controls from scratch would be a nontrivial—and risky—development effort.
Resource lifecycle hooks
Custom resources in Kubernetes integrate natively with the advanced resource lifecycle events used by the upstream controllers. Owner References, for example, eliminate an entire class of complex and error-prone resource deletion functions—think free in a garbage collector—and again, the less software ICD needs to build, maintain, and operate, the better.
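The owner-reference mechanic is easy to model. In the toy sketch below (names illustrative, not the Kubernetes API), deleting an owner cascades to everything that references it, the way the Kubernetes garbage collector walks ownerReferences—no bespoke cleanup code per resource type.

```go
package main

import "fmt"

// Object is a toy cluster object that may point at an owner.
type Object struct {
	Name  string
	Owner string // "" means no owner
}

// Store is a toy stand-in for the cluster's object store.
type Store map[string]Object

// Delete removes the named object, then cascades to anything it owns,
// mimicking the garbage collector's treatment of ownerReferences.
func (s Store) Delete(name string) {
	delete(s, name)
	for n, o := range s {
		if o.Owner == name {
			s.Delete(n)
		}
	}
}

func main() {
	s := Store{
		"database-a":  {Name: "database-a"},
		"statefulset": {Name: "statefulset", Owner: "database-a"},
		"pod-0":       {Name: "pod-0", Owner: "statefulset"},
	}
	s.Delete("database-a") // one delete; the dependents go with it
	fmt.Println(len(s))    // 0
}
```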
Consistency semantics
It’s not easy to get concurrent distributed systems correct. Kubernetes’ core persistence architecture—a strongly consistent store with reasonably high-performance APIs layered on top—allows controller authors to build correct distributed behavior without much fuss. The MVCC semantics of the resourceVersion field in Kubernetes resources (including custom resources) have paid dividends for the ICD team, both in reasoning about local semantics around concurrent object mutations and in constructing higher-level concurrency primitives.
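The resourceVersion mechanic is optimistic concurrency control: a write must carry the version it read, and the API server rejects it if the object has changed in the meantime. A toy model of that behavior (illustrative types, not the real API server):

```go
package main

import (
	"errors"
	"fmt"
)

// Object carries the version it was read at, as Kubernetes objects do.
type Object struct {
	ResourceVersion int
	Data            string
}

// Store is a toy single-object API server.
type Store struct{ obj Object }

var ErrConflict = errors.New("conflict: resourceVersion mismatch")

func (s *Store) Get() Object { return s.obj }

// Update succeeds only if the caller's version matches the stored version,
// then bumps the version so concurrent stale writers are rejected.
func (s *Store) Update(o Object) error {
	if o.ResourceVersion != s.obj.ResourceVersion {
		return ErrConflict // another writer got there first
	}
	o.ResourceVersion++
	s.obj = o
	return nil
}

func main() {
	s := &Store{obj: Object{ResourceVersion: 1, Data: "a"}}

	stale := s.Get() // reader A
	fresh := s.Get() // reader B

	fresh.Data = "b"
	fmt.Println(s.Update(fresh)) // <nil>: B's write lands and bumps the version

	stale.Data = "c"
	fmt.Println(s.Update(stale)) // conflict: A must re-read and retry
}
```

The retry loop this forces on writers is exactly the “local semantics around concurrent object mutations” mentioned above: last-write-wins races simply can’t happen silently.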
client-go
The robustness and maturity of the default Kubernetes API client—client-go—makes writing new resources and controllers relatively easy. Users get access to the same code generation functionality, APIs, and behaviors that are used in the upstream controller manager, so it’s easy to trust that they’re rock-solid and well-tested, and that there are plenty of proven usage examples available in Kubernetes itself.
This is something that can’t be taken lightly; the Kubernetes API is complex, and it’s not always obvious how to use it in a way that is both correct and high-performance. client-go makes this relatively straightforward. Other Kubernetes clients—even “official” ones such as client-python—don’t come close.
Upstream controller examples
Controllers aren’t always simple, even with good building blocks like client-go. Fortunately for controller authors, the Kubernetes source is a gold mine of well-documented, battle-hardened controllers. Some (e.g., CronJob) are in fact simple, while others (e.g., StatefulSet) are far more complex than any custom controller. When in doubt, the ICD team looks upstream for guidance.
IBM Cloud Databases operators: what hasn’t worked well
Too much Kubernetes
Kubernetes is a complex system. Embedding application logic in Kubernetes pushes that complexity onto the ICD team’s day-to-day work. Ultimately, this means that ICD developers need to know quite a lot about Kubernetes in order to work on the system. This isn’t an issue per se—in fact, it might be a boon in the long run—but it’s a hidden development cost that can’t be taken lightly.
As discussed above, Kubernetes’ consistency semantics and API clients are powerful tools, but they don’t mask the essential complexity of the problems they exist to solve. When an application is embedded in Kubernetes via the operator pattern, that complexity leaks into everyday development.
Beyond development costs, this presents several further issues:

- As with any distributed systems development, it’s easy to get things wrong and not trivial to ensure their correctness. Concurrency bugs in production can, and likely will, result.
- It’s difficult to estimate the performance cost of a given Kubernetes API operation. As the ICD control plane evolves, care must be taken not to overload the Kubernetes masters; without an understanding of the cost of operations, this is very difficult. Overload conditions in production can, and likely will, result.
- Kubernetes moves quickly, with new major releases on a quarterly basis. While the community is very good at managing change in a mature fashion, change does happen. By increasing the coupling between application and Kubernetes, operators increase the risk and burden of managing that change over time.
Declarative vs. imperative semantics
Kubernetes is fundamentally declarative; that is, API consumers declare desired state and the system works to “converge” the world to match the declaration. In order to use it effectively, developers must “think declaratively”—i.e., they must think in terms of “what” rather than “how.” By contrast, most developers seem to default to an imperative mindset. It takes some time to wrap one’s head around the declarative model and work effectively with it.
This challenge compounds in an operator-based architecture. Instead of merely consuming declarative APIs, developers must take on the much more difficult task of writing them. At best, this represents another hidden development cost as engineers take the time to understand the problem space; at worst, it can result in major mis-features and technical debt.
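The imperative/declarative contrast is easiest to see side by side. In this toy sketch (function names are illustrative), the imperative API prescribes a step, so an accidental retry compounds the change, while the declarative API states a goal, so re-running it is harmless:

```go
package main

import "fmt"

// Imperative: the caller prescribes a step ("add n members"). Re-running it
// after a timeout or partial failure over-applies the change.
func AddMembers(current, n int) int {
	return current + n
}

// Declarative: the caller states the goal ("there should be desired members")
// and the system computes whatever change is needed. It is idempotent.
func Converge(current, desired int) int {
	return desired
}

func main() {
	// An accidental retry of the imperative call over-scales the cluster.
	members := AddMembers(1, 2)
	members = AddMembers(members, 2)
	fmt.Println(members) // 5

	// The same retry against the declarative call is a no-op.
	members = Converge(1, 3)
	members = Converge(members, 3)
	fmt.Println(members) // 3
}
```

Writing the Converge side well—deciding what “desired” means, and how to diff it against a messy world—is the hard part that operator authors sign up for.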
Where to from here?
The IBM Cloud Databases team is, on the whole, happy with the decision to implement an operator-based architecture and would make the decision again. As the product evolves, we’re investing further: more decomposition, more resources, more controllers, and more declarative APIs.
Done right—and to be clear, it’s not easy to do it right—operators support a software delivery model heavy on ephemeralization, which in turn allows a team to focus more on the problems that matter to customers. And that’s a win, in the end.