How to Set Up Multi-Zone Resiliency with IBM Cloud VPC

Investigating a use case for setting up a resilient infrastructure in a Multi-Zone Region (MZR).

Our first blog post covered the resiliency features provided by IBM Cloud VPC. In this post, we continue exploring cloud infrastructure resiliency by walking through a use case: setting up a resilient infrastructure in a Multi-Zone Region (MZR). The IBM Cloud VPC infrastructure uses Intel Xeon CPUs and additional Intel technologies.

An MZR consists of three availability zones. You can use one, two or all three of them. You are not required to use all three, but doing so gives you the most resilient solution. Because there is no extra cost for using all three zones, you get the maximum benefit for one price. It might be easier and quicker to build everything in a single availability zone, but that limits the resiliency of your solution.
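
To make the trade-off concrete, here is a minimal Python sketch of the arithmetic behind zone choice. The zone names and instance counts are made-up illustrations, not values from IBM Cloud; the function only computes what fraction of your instances keeps running if one zone becomes unavailable.

```python
def surviving_capacity(instances_per_zone, failed_zone):
    """Return the fraction of instances still running if failed_zone goes down."""
    total = sum(instances_per_zone.values())
    if total == 0:
        raise ValueError("no instances deployed")
    lost = instances_per_zone.get(failed_zone, 0)
    return (total - lost) / total


# Hypothetical layouts: the same six instances in one zone or spread over three.
single_zone = {"zone-1": 6}
three_zones = {"zone-1": 2, "zone-2": 2, "zone-3": 2}

if __name__ == "__main__":
    print(surviving_capacity(single_zone, "zone-1"))  # everything is lost
    print(surviving_capacity(three_zones, "zone-1"))  # two thirds remain
```

With everything in a single zone, losing that zone leaves nothing running. With the same instances spread evenly over three zones, two thirds of the capacity survives.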

Choose the components and features you need

Let’s look at an n-tier application. By spreading the infrastructure resources across multiple availability zones, you avoid a single point of failure. Furthermore, if there are any events or issues in an availability zone, they are contained in that availability zone. Your resources in the other two availability zones remain protected and available, although at a lower capacity, and you avoid a hard-down situation:
  • Compute: Create multiple virtual server instances (VSIs) with the same capabilities and distribute them across the three zones to avoid a single point of failure. Consider using auto-scale for the compute resources, especially for the web and application tiers. With auto-scale, multiple instances are created from the same image and scale up or down as needed, which helps control costs.
  • Load balancers: Application load balancers (ALBs) work across zones and ensure that requests are distributed across the different zones. If required, you can configure stickiness so that the same requestor reaches the same resources. The load balancers themselves are also deployed in a highly available (HA) fashion.
  • Snapshots: Take snapshots of the database storage to use as backups in case the data is corrupted or accidentally deleted.
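
The compute and load balancer points above can be sketched in a few lines of Python. This is only an illustration of the ideas (even placement across zones, and sticky routing that skips an unhealthy zone); it is not how IBM Cloud ALBs are implemented, and the zone names, instance names and hashing scheme are all assumptions made for the example.

```python
import hashlib

ZONES = ["zone-1", "zone-2", "zone-3"]  # hypothetical zone names


def place_instances(count, zones=ZONES):
    """Assign instances to zones round-robin so no single zone holds them all."""
    placement = {zone: [] for zone in zones}
    for i in range(count):
        placement[zones[i % len(zones)]].append(f"vsi-{i}")
    return placement


def pick_backend(client_id, placement, healthy_zones):
    """Sticky choice: a client maps to the same backend while the healthy set is unchanged."""
    backends = sorted(vsi for zone in healthy_zones for vsi in placement[zone])
    if not backends:
        raise RuntimeError("no healthy backends")
    digest = hashlib.sha256(client_id.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


if __name__ == "__main__":
    placement = place_instances(6)
    print(placement)
    print(pick_backend("client-a", placement, ZONES))
    # Simulate zone-1 being unavailable: traffic only reaches the other zones.
    print(pick_backend("client-a", placement, ["zone-2", "zone-3"]))
```

If one zone is dropped from the healthy set, every client is still served by an instance in the remaining zones, which is the "lower capacity but not hard down" behavior described above.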

Use scripts to automate your deployment

Use Terraform to create a standardized deployment for your resilient infrastructure. This avoids manual creation, which is time-consuming and prone to human error such as missing features, misconfiguration or not following best practices. Coupled with IBM Cloud Schematics, you can create workspaces to manage the different deployment models that your application requirements call for. Automation allows for quick, repeatable deployments across different applications and regions.

We created sample Terraform code for an MZR deployment that you can use as either a learning tool or a starter kit. The code does the following:

  • Creates a new VPC
  • Uses all three availability zones
  • Creates a bastion server for management with load balancers
  • Enables some features to help with resiliency

Within 25 minutes, you can have a three-tier architecture deployed in your IBM Cloud account:

  • Step 1: Download the sample code.
  • Step 2: Edit the inputvar file.
  • Step 3: Execute Terraform commands.
  • Step 4: Apply and deploy.
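
Steps 3 and 4 can be scripted. The following Python sketch wraps the standard `terraform init`, `terraform plan` and `terraform apply` commands. The variable file name and plan file name are placeholders chosen for this example (use whatever the sample code actually ships), and the helper functions are hypothetical, not part of the sample code.

```python
import subprocess


def terraform_commands(var_file, plan_file="mzr.tfplan"):
    """Build the init, plan and apply command lines for one deployment."""
    return [
        ["terraform", "init"],
        ["terraform", "plan", f"-var-file={var_file}", f"-out={plan_file}"],
        ["terraform", "apply", plan_file],
    ]


def deploy(workdir, var_file, dry_run=True):
    """Print the commands (dry run) or execute them in workdir, stopping on the first failure."""
    for cmd in terraform_commands(var_file):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, cwd=workdir, check=True)


if __name__ == "__main__":
    # "input.tfvars" is a placeholder name for the edited input variable file.
    deploy(".", "input.tfvars", dry_run=True)
```

Running with `dry_run=True` only prints the commands, which is a safe way to check them before letting the script create real resources.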

For more information on the script and how to use Terraform, see “Creating a resilient three-tier highly available infrastructure VPC with Auto scale by using Terraform.”

Be prepared

Be aware of upcoming maintenance done by IBM. IBM is always improving, patching or hardening its IaaS offerings. By staying informed, you can anticipate planned maintenance and protect your service level objectives (SLOs). When you know that maintenance is scheduled for a region or availability zone, you can proactively and gracefully route traffic to a non-impacted site ahead of the maintenance window, so that the maintenance does not impact existing users. After the maintenance, you can add the site back into the mix.
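
As a rough illustration of this proactive routing, the Python sketch below decides which zones should receive traffic at a given time. The schedule format, the 30-minute drain lead time and the zone names are assumptions made for the example; in practice you would take maintenance windows from IBM's notifications and apply the result to your own load balancer or DNS configuration.

```python
from datetime import datetime, timedelta


def zones_in_rotation(zones, maintenance, now, drain_lead=timedelta(minutes=30)):
    """Return the zones that should receive traffic at `now`.

    `maintenance` maps a zone name to a (start, end) pair. A zone is taken out
    of rotation `drain_lead` before its window starts and added back at `end`.
    """
    active = []
    for zone in zones:
        window = maintenance.get(zone)
        if window is not None:
            start, end = window
            if start - drain_lead <= now < end:
                continue  # draining or under maintenance
        active.append(zone)
    return active


if __name__ == "__main__":
    zones = ["zone-1", "zone-2", "zone-3"]  # hypothetical zone names
    schedule = {"zone-2": (datetime(2024, 1, 10, 2, 0), datetime(2024, 1, 10, 4, 0))}
    print(zones_in_rotation(zones, schedule, datetime(2024, 1, 10, 1, 45)))
```

Draining a zone shortly before the window opens, rather than at the start time, gives in-flight requests a chance to finish on the zone that is about to be serviced.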
