July 23, 2019 By Shikha Srivastava 7 min read

While the 12-factor application guidelines are spot-on, another 7 factors are equally essential for a production environment.

The 12-factor application provides a well-defined guideline for developing microservices and is a commonly used pattern to run, scale, and deploy applications. In the IBM Cloud Private platform, we follow the same 12-factor application guidelines for developing containerized applications. See Michael Elder’s blog post “Kubernetes & 12-factor apps” for a walkthrough of how we apply the 12 factors to the Kubernetes model of container orchestration.

As we reflected on the principles of developing containerized microservices running in Kubernetes, we found that while the 12-factor application guidelines are spot-on, the following 7 factors are equally essential for a production environment:

  • Factor XIII: Observable
  • Factor XIV: Schedulable
  • Factor XV: Upgradable
  • Factor XVI: Least privilege
  • Factor XVII: Auditable
  • Factor XVIII: Securable
  • Factor XIX: Measurable

Let’s discuss what each means and why it is necessary to consider the additional factors.

Factor XIII: Observable

Apps should provide visibility about current health and metrics

Distributed systems can be a challenge to manage because multiple microservices work together to build an application. Essentially, many moving parts need to work together for a system to function. If one microservice fails, the system needs to detect it and fix it automatically. Kubernetes comes to the rescue with capabilities such as readiness and liveness probes.

Readiness probe

Kubernetes uses readiness probes to ensure the application is ready to accept traffic. If a readiness probe starts to fail, Kubernetes stops sending traffic to the pod until the readiness probe returns a success status.

For example, suppose you have an application composed of three microservices: frontend, business logic, and database. The frontend should have a readiness probe that checks whether the business logic and database are ready before it accepts traffic.

See in the following animated image that no request is sent to the application instance until the readiness probe returns success:

You can use an HTTP, command, or TCP probe, and you can control the probe configuration. For instance, you can specify how often the probe runs, what the success and failure thresholds are, and how long to wait for a response. One very important setting to configure is initialDelaySeconds, which keeps the probe from starting until the app has had time to start. If the delay is too short, a readiness probe keeps the pod out of service while it fails, and a liveness probe (described next) restarts the application constantly. See the following YAML snippet:

readinessProbe:
  # an http probe
  httpGet:
    path: /readiness
    port: 8080
  initialDelaySeconds: 20
  periodSeconds: 5
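The other settings mentioned above can be added to the same probe. The following is a minimal sketch rather than a recommended configuration: the path and port are carried over from the snippet above, while the timeout and threshold values are assumptions chosen for readability.

```yaml
readinessProbe:
  httpGet:
    path: /readiness
    port: 8080
  initialDelaySeconds: 20   # wait 20s after the container starts before the first probe
  periodSeconds: 5          # probe every 5s
  timeoutSeconds: 2         # assumed value: no response within 2s counts as a failure
  successThreshold: 1       # assumed value: one success marks the pod ready again
  failureThreshold: 3       # assumed value: three consecutive failures mark the pod unready
```

With these values, a pod that stops responding is taken out of rotation after about three probe periods, that is, within roughly 15 seconds. For a command or TCP probe, the httpGet block is replaced by an exec block (with a command list) or a tcpSocket block (with a port).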

Liveness probe

Kubernetes uses liveness probes to check whether your application is alive or dead. If your application is alive, Kubernetes leaves it alone. If your application is dead, Kubernetes kills the container and starts a new one to replace it. This validates the need for microservices to be stateless (Factor VI) and disposable (Factor IX). See the following animated image where Kubernetes restarts the pods once the liveness probe fails:

A great benefit to using these probes is that you can deploy your applications in any order, without worrying about dependencies.
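A liveness probe is declared the same way as a readiness probe. The sketch below is illustrative only: the /liveness path is a hypothetical endpoint that the article does not name, the port is assumed to match the readiness example, and the timing values are assumptions.

```yaml
livenessProbe:
  httpGet:
    path: /liveness        # hypothetical endpoint; the article does not name one
    port: 8080             # assumed to match the readiness example
  initialDelaySeconds: 30  # assumed value: longer than the app's worst-case startup time
  periodSeconds: 10        # assumed value
  failureThreshold: 3      # assumed value: restart after three consecutive failures
```

Because a failing liveness probe triggers a restart, an initial delay that is shorter than the application's startup time is what keeps an application restarting in a loop.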

Custom metrics

However, we found that the probes are not enough for a production environment. Applications usually have application-specific metrics that need to be monitored, and users set up thresholds and alerts for these metrics (e.g., transactions per second).

IBM Cloud Private fills this gap with a completely secure monitoring stack composed of Prometheus and Grafana, enabled with a role-based access control model.

Prometheus scrapes metrics from each target’s metrics endpoint. Your application advertises its metrics endpoint by using the following annotation:

prometheus.io/scrape: 'true'
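In context, the annotation belongs in the pod metadata. The sketch below assumes a Deployment's pod template and a Prometheus scrape configuration that honors the conventional prometheus.io/* annotations. The port and path annotations, and their values, are assumptions that take effect only if the scrape configuration relabels on them.

```yaml
template:
  metadata:
    annotations:
      prometheus.io/scrape: 'true'    # opt this pod in to scraping
      prometheus.io/port: '8080'      # assumed port that serves metrics
      prometheus.io/path: '/metrics'  # assumed path; /metrics is the usual default
```

Annotation values must be strings, which is why true and the port number are quoted.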

Prometheus then discovers the endpoint automatically and scrapes metrics from it as shown in the following animated image:
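Once metrics are being scraped, a threshold on an application-specific metric can be expressed as a Prometheus alerting rule. The following is a minimal sketch in the standard Prometheus rule-file format. The group name, alert name, and the app_transactions_total counter are hypothetical, and how rules are loaded in IBM Cloud Private is not covered here.

```yaml
groups:
  - name: app-thresholds              # hypothetical group name
    rules:
      - alert: LowTransactionRate     # hypothetical alert name
        # app_transactions_total is a hypothetical counter exposed by the application
        expr: sum(rate(app_transactions_total[5m])) < 10
        for: 10m                      # condition must hold for 10 minutes before firing
        labels:
          severity: warning
        annotations:
          summary: Transaction rate has been below 10 per second for 10 minutes
```

The rate() function converts the counter into a per-second value, which matches a transactions-per-second threshold.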

Factor XIV: Schedulable

Applications should provide guidance on expected resource constraints

Let’s say that management picks your team to experiment with a project on Kubernetes. Your team works hard setting up the environment, and you end up with an application that is running with exemplary response time and performance. Another team then follows your lead, creates their application, and hosts it in the same environment. When the second application goes live, the original application starts experiencing performance degradation. When you start to troubleshoot, the first place to look is the compute resources (CPU and memory) assigned to your containers. It’s very likely that your containers are starving for compute resources, which leads to the question of how you can ensure compute resources for your applications.

Kubernetes has a great capability that allows you to set requests and limits for containers. Requests are guaranteed: if a container requests a resource, Kubernetes only schedules it on a node that can provide that resource. Limits, on the other hand, cap what a container can use: CPU usage above the limit is throttled, and a container that exceeds its memory limit can be terminated.

See the YAML snippet below for setting compute resources:

resources:
  requests:
    memory: "64Mi"
    cpu: "150m"
  limits:
    memory: "64Mi"
    cpu: "200m"
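For orientation, the resources block is set per container inside the pod spec. The following is a minimal sketch in which the pod, container, and image names are hypothetical; the request and limit values are the ones used above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: frontend                  # hypothetical name
spec:
  containers:
    - name: frontend              # hypothetical name
      image: example/frontend:1.0 # hypothetical image
      resources:
        requests:
          memory: "64Mi"          # 64 mebibytes, used by the scheduler for placement
          cpu: "150m"             # 150 millicores, i.e. 0.15 of a CPU core
        limits:
          memory: "64Mi"          # exceeding this can get the container OOM-killed
          cpu: "200m"             # CPU use above this is throttled
```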

Another effective capability for administrators in a production environment is setting quotas for namespaces. If a compute resource quota is set for a namespace, Kubernetes does not provision containers in that namespace unless they have requests and limits set. In the following image, resource quota is set for namespaces:
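A namespace quota of this kind can be sketched as a ResourceQuota object. The quota name, namespace, and amounts below are assumptions for illustration only.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota     # hypothetical name
  namespace: team-a       # hypothetical namespace
spec:
  hard:
    requests.cpu: "2"     # cap on the sum of CPU requests in the namespace
    requests.memory: 4Gi  # cap on the sum of memory requests
    limits.cpu: "4"       # cap on the sum of CPU limits
    limits.memory: 8Gi    # cap on the sum of memory limits
```

With a quota like this in place, a pod that omits CPU or memory requests and limits is rejected at admission unless a LimitRange in the namespace supplies defaults.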

Factor XV: Upgradable

Apps must upgrade data formats from previous generations

Security or feature patches are often needed for applications running in production, and it is important for production applications to upgrade without service disruption. Kubernetes provides rolling updates for applications to upgrade with no service outage. With rolling updates, you can update one pod at a time without taking down the entire service. See the following animated image of a second version of an application, which can be rolled out with no downtime:

See the following YAML snippet:

minReadySeconds: 5
strategy:
  # indicate which strategy
  # we want for rolling update
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 1

Pay attention to maxUnavailable and maxSurge when enabling the rolling update strategy.

maxUnavailable is an optional field that specifies the maximum number of pods that can be unavailable during the update process. Though it’s optional, you want to set the value to ensure service availability.

maxSurge is another optional (but critical) field that tells Kubernetes the maximum number of pods that can be created over the desired number of pods.
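To see how the two fields interact, consider the settings in a complete Deployment. This is a minimal sketch: the names, labels, image, and replica count are assumptions chosen to make the arithmetic in the comments concrete.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend                      # hypothetical name
spec:
  replicas: 4                         # assumed replica count
  minReadySeconds: 5
  selector:
    matchLabels:
      app: frontend
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1                     # at most 4 + 1 = 5 pods exist during the update
      maxUnavailable: 1               # at least 4 - 1 = 3 pods stay available
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: example/frontend:2.0 # hypothetical image for the second version
```

Both fields also accept a percentage of the desired replica count, such as 25%, instead of an absolute number.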

Factor XVI: Least privilege

Containers should be running with the least privilege

Not to sound pessimistic, but you should think of every permission you allow in your container as a potential attack vector, as seen in the next image. For example, if your container is running as root, then anyone with access to your container can inject a malicious process into it. Kubernetes provides Pod Security Policies (PSP) that you can use to restrict access to your filesystem, host ports, Linux capabilities, and more. IBM Cloud Private provides a set of out-of-the-box PSPs that can be associated with a namespace and applied when containers are provisioned in it.

See more details on using namespaces with Pod Security Policies.
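The restrictions that a policy enforces can also be declared by the workload itself. The following container-level securityContext is a minimal sketch of that idea, not one of the IBM Cloud Private policies, and it assumes an application that can run as a non-root user with a read-only root filesystem.

```yaml
securityContext:
  runAsNonRoot: true               # refuse to start if the image would run as UID 0
  allowPrivilegeEscalation: false  # block setuid-style privilege gains
  readOnlyRootFilesystem: true     # assumed acceptable: the app writes only to mounted volumes
  capabilities:
    drop:
      - ALL                        # start with no Linux capabilities; add back only what is needed
```

Declaring these settings in the container spec makes it more likely that the workload is admitted under a restrictive policy without changes.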

Factor XVII: Auditable

Know what, when, who, and where for all critical operations

Auditability is critical for any actions performed on Kubernetes clusters or within applications. For example, if your application handles credit card transactions, you need to enable auditing to keep audit trails of each transaction. IBM Cloud Private uses the cloud-agnostic industry-standard format, Cloud Auditing Data Federation (CADF). See more details at Audit logging in IBM Cloud Private.

A CADF event captures the following information:

  • initiator_id: ID of the user that performed the operation
  • target_uri: CADF-specific target URI (for example, data/security/project)
  • action: The action being performed, typically: operation : resource_type

Factor XVIII: Securable (Identity, Network, Scope, Certificates)

Protect your app and resources from the outsiders

This factor deserves its own article. Suffice it to say that applications need end-to-end security when running in production. IBM Cloud Private addresses the following security requirements, and more, for a production environment:

  • Authentication: Confirm identities
  • Authorization: Validate what authenticated users can access
  • Certificate management: Manage digital certificates, including creation, storage, and renewal
  • Data protection: Security measures for data in transit and at rest
  • Network security and isolation: Prevent unauthorized users and processes from accessing the network
  • Vulnerability Advisor: Identify any security vulnerabilities in the images
  • Mutation Advisor: Identify any mutation in containers

You can learn more from the IBM Cloud Private security guide.

Specifically, let’s talk about Certificate Manager. The IBM Cloud Private Certificate Manager service is based on the open source cert-manager project from Jetstack. Certificate Manager is used to issue and manage certificates for services that run on IBM Cloud Private. It supports both self-signed and public certificates, and it is fully integrated with kubectl and role-based access control.
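With this model, a certificate is requested declaratively as a Kubernetes resource. The following is a minimal sketch that uses the Certificate resource from the current upstream cert-manager API; the release bundled with IBM Cloud Private may use an older API group and version. All names, the namespace, and the issuer are hypothetical.

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: frontend-tls            # hypothetical name
  namespace: team-a             # hypothetical namespace
spec:
  secretName: frontend-tls      # Secret where the issued key pair is stored
  dnsNames:
    - frontend.team-a.svc       # hypothetical service DNS name
  issuerRef:
    name: example-ca-issuer     # hypothetical issuer, which must already exist
    kind: ClusterIssuer
```

Because the certificate is an ordinary resource, it can be created and inspected with kubectl and governed by role-based access control like any other object.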

Factor XIX: Measurable

Application usage should be measurable for quota or chargebacks

At the end of the day, central IT has to handle the cost, as seen in the following image. The compute resources allocated to run the containers should be measurable, and the organizations using the cluster should be accountable. Make sure you follow Factor XIV: Schedulable. IBM Cloud Private provides metering, which collects the allocated compute resources for each container and aggregates them at namespace scope for showback and chargeback.

Conclusion

I hope you found this topic interesting, checked off the factors you already use, and plan to use the others next time.

If you’d like to learn more, check out the talk that Michael Elder and I gave at KubeCon 2019, Shanghai, about the 12 + 7 factors for the Kubernetes model of container orchestration.
