Software is eating the world, as the saying goes, and in the cloud, Kubernetes is eating software.
IBM is heavily invested in Kubernetes and databases are no exception to that, despite the common opinion that Kubernetes is unsuitable for stateful and/or distributed workloads.
At IBM Cloud, we run a number of Database-as-a-Service (DBaaS) products directly on top of Kubernetes; for example, Databases for PostgreSQL, Databases for Redis and the forthcoming Databases for EnterpriseDB and Databases for Datastax Enterprise. To do this in a scalable and reliable way, we've built a control plane based on the so-called "operator" pattern. This pattern has been critical to our success and, at a minimum, worth understanding for anyone considering managing stateful workloads on top of Kubernetes.
In order to understand this approach, its motivations, and its implications, let's take a quick walk through the history of stateful services.
A brief genealogy of operating stateful distributed systems
In some ways, the early days of computing—Harvard Mark I, ENIAC, etc.—were much simpler than the world of the 21st century. The machines were isolated and executed serially. The scope of an individual fault (and there were a lot of faults!) was limited to a single room. However, with the advent of online, data-driven, networked systems such as SABRE (airline reservations) and BASE (credit card processing), this began to change. State was at the core of these new systems, and the systems in turn were the core of the businesses that ran them.
The direct descendants of the SABRE and BASE systems can be clearly seen in the IBM Z mainframes and associated software, such as TPF and IMS. These are the pinnacle of the "make this computer fast and reliable" approach to engineering. Mean time between failures for the hardware is measured in years and the software stack is designed top-to-bottom to support the specialized use case at hand. They're fascinating, but specialized; your average Linux hacker is not going to be at home with a TPF-based system.
On the other hand, the early data processing systems led to databases such as System R and Db2, then PostgreSQL and MySQL, and eventually the wide range of databases available to application developers in 2020. These systems—particularly the open source variants—are vastly more approachable than the mainframe world, but in many ways, their ancestry is clear—the best way to make a PostgreSQL-based system reliable is to run it on fast, reliable hardware—perhaps on a Z mainframe! By the time of the Internet boom in the late 1990s, most of the business world depended on databases like these.
In the 1990s, the consumer Internet changed everything. Perhaps most importantly, Google enters the scene and revolutionizes system architecture by playing out the implications of attempting to deliver Internet-scale software while refusing to buy reliable hardware. Bargain-basement computer hardware is slow—everything needs to be horizontally scalable—and it fails constantly, so systems need to be designed to run in limp mode. This results in the classic systems papers on Borg, Omega, the Google File System, and Bigtable, and their open source "mirrors," such as Kubernetes and Cassandra.
By 2020, commodity computing—"cattle, not pets"—is the dominant paradigm in software architecture. It's a rare startup that will actually purchase a computer to run production services, and fighter jets run Kubernetes. But, the vast majority of business applications on the Internet are still built on top of databases like Db2, PostgreSQL, and MySQL—systems fundamentally designed for reliable hardware, not commodity computing.
This is a problem—businesses want "the Cloud," whatever they think that means, but their software decidedly does not want the cloud.
Kubernetes operators are a technology purpose-built to solve this problem. Even "modern" databases like Cassandra will crumple under the abuse levied against them by Kubernetes, without careful orchestration. Operators provide a powerful set of abstractions, inspired by control theory and backed by the hundreds (thousands?) of human-years of engineering effort baked into Kubernetes itself, that can be used to implement the application-specific logic and behaviors necessary to make a "pre-cloud" stateful system behave in a commodity world.
The term "operator" originated as a bit of software marketing from the team at CoreOS. It’s a useful way to capture a particular pattern, but it can be misleading. There’s nothing really novel in the idea. Resources (i.e., system specifications) and controllers (i.e., control loops) are standard concepts in industrial control systems, and they’re fundamental to how Kubernetes works. The term "operator" really encompasses just that—standard patterns from Kubernetes itself, exposed to users through built-in mechanisms, wrapped up in a catchy name.
Most Kubernetes users know what a resource is: ConfigMap, Pod, and Deployment are all resources. The state of a Kubernetes cluster is fully defined by the state of the resources it contains. A controller is a process that reifies a resource; given a particular resource, the controller responsible for it continually works to make the state of "the world" (i.e., the Kubernetes cluster) match the state declared in the resource.
If a resource changes, its controller, in turn, will mutate “the world” to reflect those changes. Most Kubernetes resources work this way. The deployment controller, for example, is responsible for ensuring that a Deployment created by a user results in the desired number of containers running a particular image in the cluster.
More abstractly, consider a thermostat. You set the temperature to 20 degrees ("desired state"), then the thermostat periodically measures ambient temperature ("actual state") and either turns on or off the furnace ("taking action") as appropriate. It may take a bit of imagination, but this pattern is enormously generalizable and powerful. Your distributed system is not all that different from a thermostat!
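The thermostat loop above maps directly onto the resource/controller pattern, and can be sketched in a few lines of Go. Everything here (the `World`, `Thermostat`, and `Reconcile` names) is an illustrative stand-in for the pattern, not any real Kubernetes API:

```go
package main

import "fmt"

// World is the "actual state" the controller observes and mutates.
type World struct {
	Temperature float64 // ambient temperature, in degrees
	FurnaceOn   bool
}

// Thermostat is the "resource": a declaration of desired state.
type Thermostat struct {
	DesiredTemperature float64
}

// Reconcile is one pass of the control loop: observe actual state,
// compare it against desired state, and take action to converge the two.
func Reconcile(desired Thermostat, actual *World) {
	if actual.Temperature < desired.DesiredTemperature {
		actual.FurnaceOn = true // too cold: turn the furnace on
	} else {
		actual.FurnaceOn = false // at or above the set point: turn it off
	}
}

func main() {
	world := &World{Temperature: 17.5}
	spec := Thermostat{DesiredTemperature: 20}

	// A real controller runs this in response to watch events and on a
	// periodic resync; here we just call it once.
	Reconcile(spec, world)
	fmt.Println(world.FurnaceOn) // prints "true": the room is below the set point
}
```

A Kubernetes controller is this same shape, with a custom resource in place of `Thermostat` and the cluster in place of `World`.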
IBM Cloud Databases Operators
In the Databases group at IBM Cloud, a significant part of our job is making these "old-school" databases—PostgreSQL, Redis, MongoDB—run reliably on modern-day commodity hardware. To make this work at scale we've made a heavy investment in Kubernetes operators. The core of the control plane for the Databases portfolio is 100% operators. There's no database, even—all system state is stored directly in Kubernetes itself. There's no API, either—all interaction is through CRUD operations against custom resources in the Kubernetes API.
Practically speaking, doing this right involves a lot of operators, all over the stack. As of this writing, the control plane is implemented with well over a dozen resources and nearly as many controllers, handling everything from resource allocations and pod evictions to database backups and reconfigurations. It's important to note, too, that some of these resources have no controller—they're just representations of state—and some controllers are watching "upstream" resources in Kubernetes. In practice, the operator pattern is not as simple as it's made out to be; it's powerful, but "just write an operator" is an amusingly naive idea.
For just a taste, the IBM Cloud Databases control plane includes operators for the following:
- Mutating cgroups on hosts for custom resource allocations
- Managing the lifecycle of Cloud Object Storage buckets
- Migrating database workloads for maintenance
- Rollout of new software
- Managing TLS certificates
- Configuring firewalls
As the system evolves to meet new requirements or support more databases, we always reach for operators first—adding a new resource here, adding a new controller there, etc. In the long run, if you buy in to the pattern, you'll start to see the whole world in terms of convergence and declarative resources, and that might be a good thing.
How does this help manage stateful distributed systems?
At this point, you—the thoughtful reader—might be asking where the punch line is. Operators are clearly a tool that can be used, but why should anyone take them seriously? Are they any good? Why not just use Helm, or Ansible, or any of the myriad other tools out there for putting software on Kubernetes?
The key here is in the various technologies' approach to managing state. Most tools designed for working with Kubernetes focus exclusively on defining the initial state of the system, leaving runtime management entirely to Kubernetes. That works for software that can be managed with a policy as simple as "turn it off and back on again," but anything more sophisticated needs real runtime management. And pre-cloud databases are very complicated—remember, they were designed for environments that basically never go down. Helm won't help when you need to force a leader switchover or resync a replica in response to a flaky network.
The operator pattern allows programmers to take direct action in the face of the chaos of a large-scale Kubernetes system. Workers crash, networks are partitioned, disks disappear, IP addresses change—and the operators are constantly working to converge the system to a known good state.
Good Kubernetes, bad Kubernetes
The ideas behind the operator pattern are straightforward, and they aren't new. What makes it a powerful option for systems running on Kubernetes is Kubernetes itself. On the other hand, what makes it a daunting option is, well, Kubernetes itself.
The core architecture of the Kubernetes control plane is based on resources and controllers, just like operators. This means that there are a number of powerful features, tools, and patterns baked into Kubernetes that are accessible to operator developers. For example:
- Role-based access controls: Kubernetes' role-based access control mechanism is at the heart of the security posture of any application on the platform, and it extends seamlessly into the world of operators. At IBM Cloud Databases, this gives our administrators fine-grained control over permissions to nearly every aspect of our control plane, without writing any custom code.
- ObjectMeta: The custom resources owned by an operator include the same metadata as any other Kubernetes resource. This provides access to powerful tools like label selection and garbage collection, right out of the box.
- MVCC: It's not easy to write correct distributed systems in any environment, but Kubernetes helps—every resource, including custom resources, supports MVCC semantics via resource versioning.
- Client libraries: The robustness and maturity of client-go makes writing new resources and controllers relatively easy. Operator developers have access to the same tools that are used upstream, so it’s easy to trust that they’re rock-solid and well-tested and that there are plenty of proven usage examples available in Kubernetes itself.
On the other hand, Kubernetes is a big and complicated system, and taking operators seriously—that is, not just pulling a couple of dependencies from operatorhub.io but actually using operators to run your distributed systems—means embedding part of your business in Kubernetes itself. And that's a big commitment.
Kubernetes moves fast. A new minor version of Kubernetes ships every three months and is supported for only nine months after that. For most users, this isn't a tremendous burden, but when your application is embedded in Kubernetes, the costs can be high. Your operators will be tightly coupled to a small range of Kubernetes versions, and as those versions change, you'll have no choice but to invest time in changing along with them.
Furthermore, deep down, Kubernetes has a fundamentally declarative view of the world. API consumers declare desired state and the system converges the world to match the declaration. In order to work effectively "inside" Kubernetes—on operators—developers must internalize this model. They must think in terms of "what" rather than an imperative "how." And this is not trivial; many developers, for whatever reason, seem to default to an imperative mental model and as a result struggle to implement operators effectively.
Don't use operators if...
As with any technology hype cycle, operators warrant a healthy dose of skepticism. They're not for every organization, or every problem. They're especially effective at managing "pre-cloud" software in a cloud native world, but many organizations don't have those problems. Asking "what problem do I have that this technology is trying to solve" is never a bad idea.
The bottom line: if you're willing to pay the price—particularly the high expectations of Kubernetes expertise and declarative thinking for developers working on your system—you can probably make operators work. But if you're developing a brand-new, shiny, microservices-and-service-mesh-based application on top of Kubernetes, you might want to take a harder look at other options before you reach for a tool like operators.
Would we do it again?
For Databases at IBM Cloud, absolutely. We took a risk from the start (CustomResourceDefinitions were still ThirdPartyResources when we started!) and it has paid off. We have a team of platform developers who deeply understand Kubernetes and operators, and this pays dividends to our business both in delivering a stable and secure platform and in our ability to match the needs of the databases we offer to the capabilities of Kubernetes itself.
As our products evolve, we're investing further in the operators ecosystem: more resources, more controllers, and more declarative APIs. We do this because we've found that operators help our team focus more on problems that matter to our customers, and in the end, that's a win for everyone involved.