Part 2 of the blog series on Kubernetes operators.

In Part 1 of this blog series, we introduced the idea that Kubernetes operators, when deployed at significant scale, can consume substantial resources, both in actual resource usage and in schedulable capacity. We also introduced the idea that serverless technology may help reduce this impact on a Kubernetes cluster by scaling down active controller deployments when they go idle. In this post, we introduce a technology that can reduce the resource overhead of existing controllers without source modification, based simply on the idea of scaling the number of pod instances to zero when idle.

For a quick review, see also “Kubernetes Operators Explained.”

Scaling controllers to zero

Beyond the core Kubernetes platform, most operators and other general-purpose controllers are deployed using either Deployments or StatefulSets. Both of these constructs include the ability to set the scale to a specific value; the Kubernetes platform then adds or removes pods to achieve the desired value. However, scaling controllers beyond one instance often provides only redundancy. This is due to built-in consensus checking (typically leader election), which ensures that controller pods do not interfere with each other. The following represents a deployment typical of many controllers and operators:

If we were to set the scale on such a deployment to 0, the Kubernetes controller manager would terminate any running pods, leaving us without any active controller instances to process resource events. In making this scale change, we are, in effect, disabling event processing for the current controller.

In the simplest case, no resource modifications occur while the controller is stopped, and the controller scale is restored before the watched resources are modified. In this case, simply setting the deployment scale to a value greater than zero will restore the controller to its prior state. But what about the case where resource modifications occur while the controller is stopped?

Reconciliation in Kubernetes is built around a concept called “level triggering.” In a level-triggered system, reconciliation occurs against the entire state, rather than depending on the individual events, or the ordering of those events, that occurred since the last reconciliation. When scaled back up, our controller will simply look at the resources it intends to watch and reconcile their state into the target resources, regardless of how many individual changes occurred in the interim. To learn more about level triggering in Kubernetes, check out James Bowes’ post, “Level Triggering and Reconciliation in Kubernetes.”
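
To make the level-triggered idea concrete, here is a minimal Python sketch. It is not taken from any Kubernetes library: the reconcile function and the dictionary-based state are hypothetical stand-ins for a real controller and real API objects. It illustrates that a controller working from the current state reaches the same result no matter how many intermediate changes it missed while scaled to zero.

```python
# Hypothetical illustration of level-triggered reconciliation.
# "desired" stands in for the watched resources and "actual" for the
# target resources; neither is a real Kubernetes API object.

def reconcile(desired: dict, actual: dict) -> dict:
    """Return a new actual state that matches the desired state."""
    result = dict(actual)
    # Create or update anything that differs from the desired state.
    for name, spec in desired.items():
        if result.get(name) != spec:
            result[name] = spec
    # Remove anything that is no longer desired.
    for name in list(result):
        if name not in desired:
            del result[name]
    return result


# State the controller last reconciled before it was scaled to zero.
actual = {"mesh-a": {"replicas": 1}}

# Several edits happen while the controller is stopped. Only the final
# state matters; the intermediate versions are never observed.
desired = {"mesh-a": {"replicas": 1}}
desired["mesh-a"] = {"replicas": 2}   # missed edit 1
desired["mesh-a"] = {"replicas": 3}   # missed edit 2
desired["mesh-b"] = {"replicas": 1}   # missed edit 3

# On scale-up, a single pass over the current state catches up.
actual = reconcile(desired, actual)
print(actual)
```

Running reconcile a second time against the same desired state changes nothing, which is the property that makes stopping and restarting a controller safe.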

Scaling to zero automatically

If Kubernetes controller deployments tolerate both being scaled to zero and being scaled back up, can this be done automatically based upon real activity? Absolutely, and this is the goal of the controller-zero-scaler.

The controller-zero-scaler is, itself, a Kubernetes controller that watches for Kubernetes API activity, automatically scales down controllers once they become idle, and later restores scale once relevant resource modifications occur. Since it is driven fully by annotations on individual controller deployments, it is possible to enable the controller-zero-scaler in existing Kubernetes deployments without source modifications.

Figure 2 shows how the controller-zero-scaler works against a running controller deployment.

When it starts, the controller-zero-scaler begins watching for deployments that carry a specific set of annotations. These annotations identify the deployment as a controller that the controller-zero-scaler should act upon. Once a deployment has been identified as being under management, the controller-zero-scaler begins watching for API server activity that is relevant to that controller. Once no resource modifications have occurred for the configured idle timeout, that individual controller is determined to be idle, and its scale is set to zero.

In the meantime, the controller-zero-scaler continues watching for any Kubernetes API server activity that needs to be processed by the controller. If a resource change does occur, the scale is restored, which will reactivate the controller pods. The net effect is that within a few moments after an action such as ‘kubectl apply’ occurs, the downstream resource modification will be completed.
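
The idle-then-restore cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the controller-zero-scaler's actual implementation: the ManagedController class, its fields, and its methods are hypothetical, and timestamps are passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass


@dataclass
class ManagedController:
    """Hypothetical bookkeeping for one annotated controller."""
    idle_timeout: float          # seconds, e.g. from the idleTimeout annotation
    last_activity: float         # timestamp of the last relevant API change
    replicas: int = 1            # current scale
    saved_replicas: int = 1      # scale to restore on wake-up

    def on_resource_change(self, now: float) -> None:
        """A watched resource changed: record it and restore scale if needed."""
        self.last_activity = now
        if self.replicas == 0:
            self.replicas = self.saved_replicas

    def tick(self, now: float) -> None:
        """Periodic check: scale to zero once the idle timeout has elapsed."""
        if self.replicas > 0 and now - self.last_activity >= self.idle_timeout:
            self.saved_replicas = self.replicas
            self.replicas = 0


ctl = ManagedController(idle_timeout=30.0, last_activity=0.0)
ctl.tick(now=10.0)                 # still within the timeout
print(ctl.replicas)                # 1
ctl.tick(now=30.0)                 # idle for 30 seconds
print(ctl.replicas)                # 0
ctl.on_resource_change(now=45.0)   # e.g. after a kubectl apply
print(ctl.replicas)                # 1
```

Remembering the previous replica count before scaling down is a design choice in this sketch; it lets a controller that normally runs more than one instance return to its original scale.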

Let’s take a look at this in action with an example using the Istio Operator from Banzai Cloud. We will perform the following sequence:

  1. Install the Istio Operator.
  2. Install the controller-zero-scaler.
  3. Annotate and observe the Istio Operator at zero scale.
  4. Create an Istio resource and observe the Istio Operator scale up and process the resource modification.

First, the Istio Operator will be installed by cloning the project and utilizing the makefile to install the related Custom Resource Definitions (CRDs) and the StatefulSet that deploys the controller pods:

git clone
cd istio-operator
make deploy

Let’s verify that this was deployed successfully by looking at the running instance count (which should be 1). You may need to wait a few moments for the controller to activate since the images need to be pulled first.

kubectl get statefulsets -n istio-system istio-operator-controller-manager
NAME                                DESIRED   CURRENT   AGE
istio-operator-controller-manager   1         1         8s

Now, let’s deploy the controller-zero-scaler. Since the Docker image is not yet publicly available, we will need to build the image first.

Once again, we will verify that the controller deployment has, in fact, started:

kubectl get deployments -n controller-zero-scaler controller-zero-scaler
NAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
controller-zero-scaler   1         1         1            1           21s

Now, let’s see the automatic scaling in action. First, we need to enable zero scaling on this particular controller, which we will do using a set of annotations.

kubectl annotate -n istio-system statefulset -l \'1.0' \
    controller-zero-scaler/idleTimeout='30s' \
    controller-zero-scaler/watchedKinds='[{"apiVersion": "", "Kind": "Istio"}]'
statefulset.apps/istio-operator-controller-manager annotated

This adds two annotations that are relevant to the zero scale activity:

  • idleTimeout: Defines how quickly the controller will be determined to be idle. In this case, we will need to wait at least 30 seconds before observing the current state of the Istio controller.
  • watchedKinds: Indicates which API objects are meaningful to this controller. The Istio Operator is interested in custom resources of kind ‘Istio.’
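
As a rough illustration of how these two annotation values could be interpreted, the following Python sketch parses them into a timeout in seconds and a list of (apiVersion, kind) pairs. The helper names are hypothetical, the duration parser is an assumption that handles only a single integer followed by s, m, or h, and "example.com/v1" is a placeholder rather than the Istio Operator's real API group.

```python
import json
import re

# Annotation keys as they appear in the kubectl annotate command above.
IDLE_TIMEOUT_KEY = "controller-zero-scaler/idleTimeout"
WATCHED_KINDS_KEY = "controller-zero-scaler/watchedKinds"

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> int:
    """Parse a simple duration such as '30s' or '5m' into seconds.

    Assumption: only a single integer followed by s, m, or h is handled.
    """
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def parse_annotations(annotations: dict) -> tuple:
    """Return (idle_timeout_seconds, [(apiVersion, kind), ...])."""
    timeout = parse_duration(annotations[IDLE_TIMEOUT_KEY])
    kinds = [
        (entry["apiVersion"], entry["Kind"])
        for entry in json.loads(annotations[WATCHED_KINDS_KEY])
    ]
    return timeout, kinds


# "example.com/v1" is a placeholder, not the Istio Operator's real API group.
example = {
    IDLE_TIMEOUT_KEY: "30s",
    WATCHED_KINDS_KEY: '[{"apiVersion": "example.com/v1", "Kind": "Istio"}]',
}
print(parse_annotations(example))
```

Note that the JSON key is spelled "Kind" with a capital K, matching the annotation value shown in the command.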

After waiting at least 30 seconds, you should see that the Istio controller pods have now stopped:

sleep 30 && kubectl get statefulsets --all-namespaces
NAMESPACE      NAME                                DESIRED   CURRENT   AGE
istio-system   istio-operator-controller-manager   0         0         2m36s

So far, we have successfully scaled the Istio controller to zero. Now let’s see what happens when an Istio resource is changed. Let’s use the provided sample from the istio-operator directory:

kubectl apply -f config/samples/istio_v1beta1_istio.yaml

We can now verify that the scale-up worked by looking at the pod count for the Istio controller. We can also check that the downstream operator actions have occurred. In the case of the Istio Operator, some CRDs will be installed, as well as several deployments.

# Here we should see 1 available as long as we do not wait too long!
kubectl get deployments -n controller-zero-scaler controller-zero-scaler
NAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
controller-zero-scaler   1         1         1            1           7m33s

# There should be 55 CRDs as well as 8 deployments
kubectl get crd | grep -i | wc -l
kubectl get deployments --all-namespaces | grep istio | wc -l

Part 3 of the blog series

The controller zero scale solution is great for existing controller implementations since it can be enabled on an individual cluster without any source modifications. This means that you can take off-the-shelf operators and, with the correct annotations, see an immediate benefit.

Another serverless technology that has broad appeal well beyond operators and Kubernetes controllers is Knative. In the final blog post in this series, we will explore how Knative events—which use the Kubernetes API Server as an event source—can be used as a basis for building Kubernetes controllers and operators.

