Using IBM Cloud Internet Services for application high availability and performance
By: Jack Mitchell
You have an Internet-facing application, website, or API running in IBM Cloud that must be highly available and resilient not only to data center or zone-wide failures, but also to region-wide failures. In addition, it must be deployed in a distributed fashion so that clients or users are routed to the closest origin for best performance. In this article we will walk you through a simple configuration with IBM Cloud Internet Services that addresses these concerns and looks like the following when done:
IBM Cloud Internet Services (CIS) is a suite of services that includes DNS, Global Load Balancing (GLB), Web Application Firewall (WAF), DDoS mitigation, and Content Delivery Network (CDN). We’ll focus on a simple, yet very powerful use case with CIS that utilizes GLB functionality, and IBM Cloud’s global footprint, to make applications highly resilient and performant.
CIS GLB is based on advanced authoritative DNS server functionality from Cloudflare™ that includes health checks and geo-steering capabilities. The authoritative DNS servers are implemented using a robust anycast network around the globe that, as of this writing, includes 150+ points of presence and is growing rapidly.
Global Internet-facing applications, whether they are deployed on a public cloud or on-premises, must be prepared to deal with a variety of failures. These failures could occur within the application itself, or in the compute, storage, and network infrastructure components that are used to deliver the application to the end user or business. Some of these factors, like the application itself, may be in the customer’s control. However, many others are so diverse and distributed, and not in the control of a single entity, that it behooves the application owner to anticipate unforeseen events and plan for them, so that the application remains available and performs well when failures inevitably occur. With IBM Cloud’s global footprint and the capabilities of CIS, there is little excuse for incurring avoidable downtime for your applications.
The best practice for running your web application or service on IBM Cloud is to replicate it in multiple data centers (or Zones) within a given location (or Region), as well as across multiple such regions. Of course the application must be configured to behave identically at all locations, and must be designed in such a way that its persistent state (for example, data managed in a distributed database) is consistent and highly available. In addition to carefully designing the application to run across multiple locations with a “single source of truth”, client traffic must be routed to one of the available application locations (“origins”) that (a) is not down, and (b) offers optimal network performance (latency, throughput) for that client’s location.
How to use CIS to achieve the desired traffic distribution
Assume that you own the domain “example.com”, and want to run your Internet-facing website, application, service or API at mysite.example.com. Let’s further assume that you have deployed your application in two different regions (US-South and EU-GB), as recommended above, with two zones in each region (DAL12, DAL13, LON02 and LON04), for a total of four application addresses or origins.
The first step is to add that domain to IBM CIS. Note that CIS is not a DNS Registrar, so we assume that you have procured the domain through your favorite Registrar. First locate “Internet Services” on the IBM Cloud console (https://console.bluemix.net).
Clicking on the service tile redirects you to an informational page that describes the available subscription plans for the service. Pick an appropriate plan and click on Create. This instantiates a CIS resource instance for you and displays the CIS “Overview” page as follows.
Enter your domain “example.com” in the dialog box and click on Connect and continue. At this time, you will be allocated 2 DNS name servers by the service (the hostname for the name servers will be of the format “nsXXX.name.cloud.ibm.com”). You are required to configure these two name servers for example.com at your Registrar.
Head to your Registrar’s portal to perform this operation. Once your Registrar adds the name servers for your domain, the public DNS hierarchy will be able to see that the DNS records for example.com can be found at the two nsXXX.name.cloud.ibm.com servers that were allocated. Until this action is complete, and the information is propagated in the DNS system, the domain is not activated on CIS (you can check with the following command on Linux or macOS: “dig +short NS example.com”). It can be instantaneous or may take up to 24 hours, depending on the Registrar. You can request CIS to recheck the name servers sooner than the default scheduled probe time.
Note that in the above example, we are adding a domain “example.com”. However, if your domain is currently being managed by a different DNS provider, and you would like to delegate only a subdomain to CIS, you can do so. The procedure is almost exactly the same, except that instead of adding the name servers to your domain’s configuration at your DNS Registrar, you would add two “Name Server (NS)” records for your subdomain (for example, “subdomain.example.com”) at your current DNS provider’s management portal. Note that to add these NS records for subdomain.example.com you will need administrative access to the DNS records of example.com (or have a process in place in your organization to request the addition of these records).
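If you would rather script the delegation check than eyeball the dig output, the following is a minimal Python sketch. It assumes dig is installed and on your PATH, and the name server hostnames in the usage note are hypothetical placeholders for the two that CIS actually allocates to you.

```python
import subprocess


def query_ns(domain):
    """Run `dig +short NS <domain>` and return its raw text output.

    Assumes the dig utility is available on PATH.
    """
    result = subprocess.run(
        ["dig", "+short", "NS", domain],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def delegated_to(expected, dig_output):
    """Return True if every expected name server appears in the dig output.

    Names are compared case-insensitively and without the trailing dot
    that dig prints after fully qualified names.
    """
    found = {
        line.strip().rstrip(".").lower()
        for line in dig_output.splitlines()
        if line.strip()
    }
    wanted = {name.strip().rstrip(".").lower() for name in expected}
    return wanted <= found
```

For example, delegated_to(["ns001.name.cloud.ibm.com", "ns002.name.cloud.ibm.com"], query_ns("example.com")) returns True once both (hypothetical) name servers are visible from your resolver.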
Once your domain (or subdomain) has been activated on CIS (the Overview page must show the domain as “Active”), you can configure DNS and GLB from the following screen:
For simplicity, let’s assume all four of your application locations have one IPv4 address each.
NOTE: Alternatively you may already have a DNS name for each of your origins, in which case you can skip this step.
Add DNS “A” records for each of your origins as follows:
Now you have four origins defined: dal12.example.com, dal13.example.com, lon02.example.com, and lon04.example.com.
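To keep track of what was just entered in the console, it can help to write the four records down as data. The sketch below is only an illustration: the IPv4 addresses are made-up values from the reserved documentation ranges, and the helper name a_records is our own, not part of any CIS tooling.

```python
import ipaddress

ZONE = "example.com"

# Hypothetical addresses from the documentation-only ranges
# (192.0.2.0/24 and 198.51.100.0/24); substitute your real origin IPs.
ORIGIN_RECORDS = {
    "dal12": "192.0.2.12",
    "dal13": "192.0.2.13",
    "lon02": "198.51.100.2",
    "lon04": "198.51.100.4",
}


def a_records(zone, records):
    """Return (fqdn, type, address) rows, sorted by name.

    Raises ValueError if an address is not a valid IPv4 address.
    """
    rows = []
    for name, address in sorted(records.items()):
        ipaddress.IPv4Address(address)  # validation only
        rows.append((f"{name}.{zone}", "A", address))
    return rows


for fqdn, record_type, address in a_records(ZONE, ORIGIN_RECORDS):
    print(fqdn, record_type, address)
```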
It’s time to define the load balancer. Navigate to the “Global Load Balancer” section. You will see sections for “Load Balancers”, “Origin Pools”, and “Health Checks”. Start by defining your health checks: click on Create Health Check.
Explore “Advanced options” and “Configure request headers” to make any adjustments to the defaults as needed by your application.
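Before pointing a health check at an origin, it is worth confirming that the origin answers the way you expect. The following Python sketch mimics the general idea of an HTTP health check: fetch a URL, then compare the status code and, optionally, the response body. It is not the CIS implementation; the function names, the default expected status of 200, and the probed path in the usage note are all assumptions for illustration.

```python
import urllib.error
import urllib.request


def evaluate(status, body, expected_codes=(200,), expected_body=""):
    """Return True if the status is acceptable and the body contains the expected text.

    An empty expected_body matches any body.
    """
    return status in expected_codes and expected_body in body


def probe(url, headers=None, timeout=5.0):
    """Fetch url and return (status, body).

    HTTP error responses are reported with their status code and an empty
    body; connection failures and timeouts are reported as status 0.
    """
    request = urllib.request.Request(url, headers=headers or {}, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, ""
    except (urllib.error.URLError, OSError):
        return 0, ""
```

For instance, evaluate(*probe("http://dal12.example.com/health")) tells you whether that origin would pass a check that expects a 200 response on the (hypothetical) /health path.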
Next, navigate to the “Origin Pools” section and click Create Origin Pool.
An origin pool is a collection of origins that are treated as one unit by the load balancer. Health checks are applied to the origin pools. As shown in the screenshot above, define an origin pool for Dallas, and another one for London. Add dal12.example.com and dal13.example.com to the “DAL” pool and lon02.example.com and lon04.example.com to the “LON” pool. You can also specify an email address to get notified of any health check failures.
Once the origin pools have been defined, head to the “Global Load Balancer” section and select Create load balancer to bring up a screen similar to this:
For the load balancer hostname, enter “mysite”. Now, define the default origin pools for this load balancer. These pools would be used if the location of the client attempting to reach mysite.example.com is not included in any of the defined geographic policies (this topic is addressed below).
Here, we are designating the Dallas based pool as the primary, and the London pool as the standby. This policy would be used as default. However, you can define “geo routes” as follows.
Here, we are configuring the GLB to service Western Europe clients primarily from the London pool, with Dallas as a standby.
Next, add one more Geo route as follows.
This is similar to the default, where Dallas is primary and London is the standby. You can define several such Geo routes depending on your needs, and everything else falls through to the default policy.
Finally, a word about SSL. Since your clients will reach you through mysite.example.com, to protect your site using SSL you must install a server certificate signed by a trusted Certificate Authority (CA). This certificate can be host-specific (“mysite.example.com”), or a wildcard (“*.example.com”). Work with your favorite CA (for example, https://letsencrypt.org) to procure a certificate for your site. In fact, you can use CIS’ DNS functionality to procure the certificate from your CA using the Domain Control Validation (DCV) method, in which you will be required to configure a “TXT” record for your domain to prove domain ownership. All you have to do is add the specified TXT record to your domain, and the CA will issue a certificate after verifying the record. One way to do this is to use the “certbot” package on Linux (just run “certbot certonly --manual --preferred-challenges=dns -d <hostname>”) to kick off the DCV process and follow the prompts.
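With the DNS challenge, certbot tells you the exact TXT record name and value to create, so you never need to compute them yourself. Purely as an illustration of where that record lives, here is a small Python sketch based on the ACME DNS-01 convention of prefixing the validated name with “_acme-challenge”; the helper name is our own.

```python
def acme_challenge_record(hostname):
    """Return the TXT record name checked by the ACME DNS-01 challenge.

    A wildcard such as "*.example.com" is validated at the base domain,
    so the leading "*." is dropped before the prefix is added.
    """
    name = hostname.strip().rstrip(".").lower()
    if name.startswith("*."):
        name = name[2:]
    return f"_acme-challenge.{name}"


print(acme_challenge_record("mysite.example.com"))
print(acme_challenge_record("*.example.com"))
```

The first call yields _acme-challenge.mysite.example.com and the second yields _acme-challenge.example.com; in both cases the TXT value is whatever certbot prints during the run.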
We’re done! You now have “mysite.example.com” running in a highly available configuration that is resilient to zone and region failures. For example, if the dal12 origin (dal12.example.com) goes offline, the Dallas pool will be served by the dal13 origin alone. If both dal12 and dal13 origins go down, the Default and Western North America clients would be directed to London until Dallas recovers. And when all the pools are healthy, performance is optimized based on clients’ geo location.
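The failover behavior described above can be sketched as a small Python model. This is a simplified illustration, not how CIS is implemented: it assumes a pool counts as usable as long as at least one of its origins is healthy, it uses the geo route names from this walkthrough as plain dictionary keys, and it simply returns nothing when every pool is down.

```python
# Origin pools as configured in this walkthrough.
POOLS = {
    "DAL": ["dal12.example.com", "dal13.example.com"],
    "LON": ["lon02.example.com", "lon04.example.com"],
}

# Pool order for clients that match no geo route.
DEFAULT_POOLS = ["DAL", "LON"]

# Pool order per geo route (primary first, standby second).
GEO_ROUTES = {
    "Western Europe": ["LON", "DAL"],
    "Western North America": ["DAL", "LON"],
}


def answer(region, down=frozenset()):
    """Return (pool, healthy_origins) for a client region.

    Pools are tried in the order given by the region's geo route, falling
    through to the default order for unlisted regions. `down` is the set
    of origin hostnames currently failing their health check.
    """
    for pool in GEO_ROUTES.get(region, DEFAULT_POOLS):
        healthy = [origin for origin in POOLS[pool] if origin not in down]
        if healthy:
            return pool, healthy
    return None, []


# dal12 offline: Dallas is served by dal13 alone.
print(answer("Western North America", down={"dal12.example.com"}))
# Both Dallas origins offline: clients are steered to London.
print(answer("Western North America",
             down={"dal12.example.com", "dal13.example.com"}))
```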
If you configured email addresses for pool health, you will receive an email if the origin state changes. These events are also reflected in the “Health Check Events” tab on the Global Load Balancers page.
Here’s a pictorial view of the resulting configuration (from the beginning of this document).
This is just one simple and effective example of how IBM CIS can be used to quickly improve your application’s availability and performance, and it barely scratches the surface of what is possible with CIS. The examples above included only DNS and GLB functionality. You may have noticed that the configuration of the load balancer for mysite.example.com had the “Proxy” setting “Off”. Turning the proxy “On” opens up a world of possibilities: CDN, Web Application Firewall (WAF), DDoS Protection, Custom Page Rules, and much more. Those are topics for another set of blogs, so stay tuned.