December 13, 2022
By Powell Quiring, Ahmed Osman and Arda Gumusalan
5 min read

Implement a scalable architecture that is resilient to node and availability zone failures.

IBM Cloud has a global network of multizone regions (MZRs) distributed around the world. Each zone has isolated power, cooling and network infrastructures.

This blog post presents an example architecture that utilizes a network load balancer (NLB) and is resilient to a zonal failure:

IBM Cloud Internet Services (CIS) provides security, reliability and performance for external web content. A global load balancer (GLB), as seen in the diagram above, can be configured to provide high availability by spreading load across zones.

IBM Cloud VPC load balancers

IBM Cloud Virtual Private Cloud (VPC) supports two types of load balancers: an application load balancer (ALB) and a network load balancer (NLB).

The right side of the diagram shows a VPC in an MZR with three zones. Health checks will allow the NLB to distribute connections to the healthy servers. In this example, the servers are in the same zone as the NLB, but it is possible to accept members across all zones using multi-zone support.
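As a rough illustration of how health checks steer connections, the sketch below models a pool of backend members and hands out only the healthy ones in round-robin order. The `Member` type, the private addresses and the round-robin policy are assumptions made for this example; they are not a description of how the NLB is implemented internally.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator, List


@dataclass
class Member:
    address: str   # hypothetical private IP of a backend server
    healthy: bool  # result of the most recent health check


def healthy_round_robin(members: List[Member]) -> Iterator[str]:
    """Return an endless iterator over the addresses of healthy members."""
    healthy = [m.address for m in members if m.healthy]
    if not healthy:
        raise RuntimeError("no healthy members in the pool")
    return cycle(healthy)


pool = [
    Member("10.0.0.4", healthy=True),
    Member("10.0.0.5", healthy=False),  # failed its health check
    Member("10.0.0.6", healthy=True),
]
picker = healthy_round_robin(pool)
first_four = [next(picker) for _ in range(4)]
print(first_four)
```

The member that failed its health check never appears in the rotation, which is the behavior the health checks are meant to provide.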

Why use a network load balancer instead of an application load balancer?

A network load balancer (NLB) operates at Layer 4 and is best suited to workloads that require high throughput and low latency.

You may be asking why a separate network load balancer is needed if the application load balancer supports Layer 4 traffic. Often, a client will submit a request that is fairly small in size, with little performance impact on the load balancer; however, the information returned from the backend targets (virtual servers or container workloads) can be significant — perhaps several times larger than the client request. 

With Direct Server Return (DSR), the information processed by the backend targets is sent directly back to the client, thus minimizing latency and optimizing throughput performance.
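A small back-of-the-envelope calculation makes the benefit concrete. The request and response sizes below are hypothetical values chosen for illustration, and the model counts payload bytes only.

```python
def load_balancer_bytes(request_bytes: int, response_bytes: int, dsr: bool) -> int:
    """Bytes that traverse the load balancer for one request/response exchange."""
    # With DSR only the inbound request passes through the load balancer;
    # the response goes from the backend target straight to the client.
    return request_bytes if dsr else request_bytes + response_bytes


# Hypothetical sizes: a 1 KB request that triggers a 50 KB response.
request, response = 1_000, 50_000
without_dsr = load_balancer_bytes(request, response, dsr=False)
with_dsr = load_balancer_bytes(request, response, dsr=True)
print(without_dsr, with_dsr)
```

Under these assumed sizes, the load balancer handles 51,000 bytes per exchange without DSR and only 1,000 bytes with it.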

Additionally, network load balancers have the following unique characteristics when compared to an application load balancer (for more information, see the Load Balancer Comparison Chart):

  • Source IP preservation: Network load balancers don’t NAT the client IP address. Instead, it is forwarded to the target server.
  • Fixed IP address: Network load balancers have a fixed IP address.

IBM Cloud Internet Services (CIS) global load balancer

Global load balancer (GLB) health checks allow for the distribution of requests across healthy NLB/servers:

Each red ‘X’ in the diagram above shows an unhealthy scenario (i.e., an unhealthy server detected by an NLB health check and an NLB or zonal failure that is detected by the CIS GLB health check).

This next diagram shows more concretely how the CIS GLB performs load balancing via DNS name resolution:

  1. The client requests the DNS name.
  2. The client computer has a DNS resolver that contacts a web of DNS resolvers to determine the corresponding IP addresses. The diagram shows the client’s DNS resolver contacting an on-premises DNS resolver that reaches Cloudflare as the authoritative DNS server for IBM Cloud Internet Services and, therefore, the GLB.
  3. A list of the NLB load balancers is returned, and one of those is used by the client. The order and weight of the origin pool members can be adjusted by configuring a global load balancer.
  4. The client uses the IP address to establish a TCP connection directly to a server through the NLB.
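The steps above can be sketched as a simplified model of what the GLB returns to a resolver: only origins that pass the health check, ordered by weight. The `Origin` type, the origin names and the `192.0.2.x` documentation-range addresses are placeholders, and sorting by weight is an assumption made to keep the example deterministic; the real service may apply weights differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Origin:
    name: str      # hypothetical label for a zonal NLB
    address: str   # placeholder public IP of the NLB
    weight: float  # relative preference within the origin pool
    healthy: bool  # result of the GLB health check


def resolve(origins: List[Origin]) -> List[str]:
    """Return the addresses of healthy origins, highest weight first."""
    healthy = [o for o in origins if o.healthy]
    ordered = sorted(healthy, key=lambda o: o.weight, reverse=True)
    return [o.address for o in ordered]


origins = [
    Origin("nlb-zone-1", "192.0.2.10", weight=1.0, healthy=True),
    Origin("nlb-zone-2", "192.0.2.20", weight=1.0, healthy=False),  # zone down
    Origin("nlb-zone-3", "192.0.2.30", weight=0.5, healthy=True),
]
answers = resolve(origins)
print(answers)
```

The client would then open its TCP connection to one of the returned addresses; the unhealthy NLB is never offered.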

Provisioning the VPC instance

The first step is to use the IBM Cloud Console to create a Cloud Internet Services (CIS) instance if one is not available. A number of pricing plans are available, including a free trial. The provisioning process for a new CIS instance explains how to configure your existing DNS registrar (probably outside of IBM) to use the CIS-provided domain name servers. The post uses for the DNS name.

Follow the instructions in the companion GitHub repository to provision the VPC, VSI, NLB and CIS Configuration on your desktop or in IBM Cloud Schematics. After provisioning is complete, the Terraform output will show test curl commands that can be executed to verify load is being balanced across zones via the GLB and across servers via the NLB.
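If you would rather script the check than repeat curl by hand, a minimal sketch like the following tallies the distinct responses seen over several requests. It assumes each server returns a body that identifies it, and the URL `http://glb.example.com` is a placeholder rather than the name used in this post. The demonstration at the bottom swaps in a stand-in for the network call so that it runs offline.

```python
from collections import Counter
from typing import Callable
from urllib.request import urlopen


def fetch(url: str) -> str:
    """Return the response body of a plain HTTP GET."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode().strip()


def tally(url: str, attempts: int, get: Callable[[str], str] = fetch) -> Counter:
    """Count how often each distinct response body is seen."""
    return Counter(get(url) for _ in range(attempts))


# Offline demonstration with a stand-in for the network call.
fake_bodies = iter(["server-zone-1", "server-zone-2", "server-zone-1"])
counts = tally("http://glb.example.com", 3, get=lambda _url: next(fake_bodies))
print(counts)
```

Against a real deployment you would call `tally(url, 20)` with the default `fetch` and expect to see more than one server name in the result. Keep in mind that DNS caching on the client can pin repeated requests to a single NLB for a while.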

Visit the IBM Cloud Console Resource list. Find and click on the name of the Internet Services product to open the CIS instance and navigate to the Reliability section of the CIS instance. Check out the Load balancer, origin pools and health checks. Navigate to the VPC Infrastructure and find the VPC, subnets, NLBs, etc. Verify that the CIS GLB is connected to the IP addresses of the VPC NLBs.

Kubernetes and OpenShift

The same architecture can be used for Red Hat OpenShift on IBM Cloud or IBM Cloud Kubernetes Service. The IBM Cloud Kubernetes Service worker nodes replace the servers in the original diagram:

Follow the instructions in the companion GitHub repository to provision the IKS cluster, NLBs and CIS configuration on your desktop or in IBM Cloud Schematics. While the Kubernetes Service cluster is being provisioned, read on to understand the Kubernetes resources configured.

Kubernetes deployments, by default, will spread pods evenly across worker nodes (and zones). The example configuration uses a nodeSelector to place the pods on zone-specific worker nodes (like those in us-south-1) using an IBM Cloud node attribute, as shown in the abbreviated manifest below:

kind: Deployment
metadata:
  labels:
    app: cogs
  name: cogs-0
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: cogs
  template:
    metadata:
      labels:
        app: cogs
    ... pod code
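To make the effect of a nodeSelector concrete, here is a small model of the matching rule: a node is eligible only if its labels contain every key/value pair in the selector. The worker names are invented, and the label key `zone` is a stand-in for the actual IBM Cloud node attribute, which is not reproduced here.

```python
from typing import Dict, List


def eligible_nodes(node_selector: Dict[str, str], nodes: Dict[str, Dict[str, str]]) -> List[str]:
    """Return names of nodes whose labels include every selector key/value pair."""
    return [
        name
        for name, labels in nodes.items()
        if all(labels.get(key) == value for key, value in node_selector.items())
    ]


# Hypothetical worker nodes; "zone" stands in for the real IBM Cloud zone label.
workers = {
    "worker-a": {"zone": "us-south-1"},
    "worker-b": {"zone": "us-south-2"},
    "worker-c": {"zone": "us-south-1"},
}
selected = eligible_nodes({"zone": "us-south-1"}, workers)
print(selected)
```

Only the two workers labeled with us-south-1 are returned, so pods from this deployment would be scheduled in that zone alone.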

A Kubernetes service is configured to expose applications using load balancers for VPC. Each service is configured with a VPC NLB that can be accessed publicly. A service is created for each zone.

The service ingress is configured to keep the load in the worker node that receives the network request using externalTrafficPolicy: Local. The Kubernetes default policy will balance the load across all selected pods in all workers in all zones. The default may be preferred for your workload:

apiVersion: v1
kind: Service
metadata:
  name: myloadbalancer1
  annotations: "nlb" "public" "us-south-1"
spec:
  type: LoadBalancer
  selector:
    app: cogs
  ports:
    - name: http
      protocol: TCP
      port: 80
  externalTrafficPolicy: Local
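The difference between the two traffic policies can be modeled in a few lines. This is a simplified sketch, not the Kubernetes implementation, and the worker and pod names are invented for illustration.

```python
from typing import Dict, List


def candidate_pods(policy: str, receiving_node: str, pods_by_node: Dict[str, List[str]]) -> List[str]:
    """Return the pods that may receive a connection arriving at receiving_node."""
    if policy == "Local":
        # Only pods on the node that received the traffic are eligible.
        return list(pods_by_node.get(receiving_node, []))
    if policy == "Cluster":
        # Any selected pod on any node is eligible.
        return [pod for pods in pods_by_node.values() for pod in pods]
    raise ValueError(f"unknown externalTrafficPolicy: {policy}")


# Hypothetical placement of pods on two worker nodes.
pods = {"worker-a": ["cogs-0-x"], "worker-b": ["cogs-1-y", "cogs-1-z"]}
local = candidate_pods("Local", "worker-a", pods)
cluster = candidate_pods("Cluster", "worker-a", pods)
print(local, cluster)
```

With `Local`, a connection that lands on worker-a stays on worker-a. With `Cluster`, it may be forwarded to a pod on any worker, including one in another zone.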

After the Terraform provision is complete, visit the IBM Cloud Console Resource list. Find and click on the name of the Internet Services product to open the CIS instance and navigate to the Reliability section of the CIS instance. Check out the Load balancer, origin pools and health checks. Note that the origins contain IP addresses of the Kubernetes Service VPC NLBs.

This can be verified using the CLI. The Terraform output has a test_kubectl output that can be used to initialize the Kubernetes kubectl command-line tool. After initialization, get the services to see output like this:

$ kubectl get services
NAME                       TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)                      AGE
kubernetes                 ClusterIP       <none>           443/TCP                      7d
load-balancer-us-south-1   LoadBalancer   80:30656/TCP,443:31659/TCP   121m
load-balancer-us-south-2   LoadBalancer    80:32152/TCP,443:32683/TCP   121m
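To compare these addresses against the CIS origins programmatically, one option is to read the JSON form of the same listing (`kubectl get services -o json`) and pull out the load balancer endpoints. The sketch below parses a hand-written sample instead of live output; the `192.0.2.10` address is a documentation-range placeholder.

```python
import json
from typing import Dict, List


def load_balancer_endpoints(services_json: str) -> Dict[str, List[str]]:
    """Map each LoadBalancer service name to its external IPs or hostnames."""
    endpoints = {}
    for item in json.loads(services_json).get("items", []):
        if item.get("spec", {}).get("type") != "LoadBalancer":
            continue
        ingress = item.get("status", {}).get("loadBalancer", {}).get("ingress", [])
        endpoints[item["metadata"]["name"]] = [
            entry.get("ip") or entry.get("hostname") for entry in ingress
        ]
    return endpoints


# Hand-written sample in the shape of a Kubernetes service list.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "kubernetes"},
         "spec": {"type": "ClusterIP"},
         "status": {"loadBalancer": {}}},
        {"metadata": {"name": "load-balancer-us-south-1"},
         "spec": {"type": "LoadBalancer"},
         "status": {"loadBalancer": {"ingress": [{"ip": "192.0.2.10"}]}}},
    ]
})
endpoints = load_balancer_endpoints(sample)
print(endpoints)
```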

Summary and next steps

The IBM Cloud Internet Services GLB performs its health checks through the NLB to the servers, so each check follows a path very similar to that of a client accessing the servers. In the unlikely event of a zone failure, this architecture continues to balance load across the remaining zones and workers. Each NLB has a static public IP address that remains fixed for the lifetime of the NLB, so the GLB does not need to be updated.
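For a feel of what such an end-to-end probe involves, the sketch below performs a plain TCP connect test, which succeeds only if the whole path to a listening server is working. It is an illustrative assumption, not the health check the GLB is configured with in this example, and the demonstration probes a socket on the local machine instead of a real NLB.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Local demonstration: probe a listening socket, then probe it again after it closes.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
up = tcp_probe("127.0.0.1", port)
listener.close()
down = tcp_probe("127.0.0.1", port)
print(up, down)
```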

The TCP traffic in the example is not TLS encrypted. The TLS will need to be managed by the worker applications. IBM Cloud Secrets Manager can be used to automate the distribution of TLS certificates.
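As one possible starting point for terminating TLS inside a worker application, the sketch below builds a server-side TLS context with Python's standard `ssl` module. The function name and the idea of passing certificate paths as arguments are assumptions made for this example; how the certificate files reach the worker (for instance, through Secrets Manager) is outside its scope.

```python
import ssl
from typing import Optional


def server_tls_context(cert_file: Optional[str] = None,
                       key_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a server-side TLS context, loading a certificate if one is given."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    if cert_file:
        # Paths are supplied by the caller; none are assumed here.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


context = server_tls_context()
print(context.minimum_version)
```

A server would wrap its listening socket with `context.wrap_socket(sock, server_side=True)` after a certificate has been loaded.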

If you have feedback, suggestions or questions about this post, please email me or reach out to me on Twitter (@powellquiring).
