Best practices to accelerate web content delivery in the cloud

Use the Akamai platform for speed and reliability

Browser-based Software as a Service (SaaS) applications allow companies to connect quickly and easily with users all over the world. However, the ability to deliver these applications consistently, reliably, and with high performance becomes a critical success factor. In this third article in a series on best practices for building multi-tenant applications on the IBM Cloud, the authors address web application performance. They describe a solution that integrates SaaS frameworks with Akamai's global edge-delivery platform for reliable web content delivery.

Christina Lau, Distinguished Engineer, IBM

Christina Lau is a distinguished engineer in WebSphere, experienced in such emerging technologies as cloud and mobile computing. Her current focus is on developing advanced technologies that support the delivery of online cloud services across the BPM, connectivity, and ILOG portfolio.



Valentina Birsan (popescu@ca.ibm.com), Senior Developer, IBM

Valentina Birsan is a senior developer in WebSphere, currently focused on cloud projects. Previously Valentina was a technical lead on Rational Application Developer. Valentina was one of the initial members of the Eclipse TPTP open source project and served as the chair of the TPTP Architecture Group. She was the lead architect for the Cosmos Service Modeling Eclipse open source project and member of the SML open standard.



04 April 2011


The root causes of web application performance issues can be divided into two categories:

  • Your applications are performing poorly. This issue can typically be addressed by the application developers by tuning performance and making code corrections.
  • Your applications are performing poorly because of Internet problems. Due to the unpredictable Internet infrastructure, this situation usually lies outside the control of any single company.

Whether your applications are performing poorly due to internal or external causes, slow response time affects users and this can lead to lower adoption rates and lower satisfaction with your applications.

Figure 1. The Internet is a complex, multipathed network of networks

Figure 1 shows that the Internet is not a single network but more than 12,000 networks connected by peering points. The path from the data center to the user can traverse several networks and peering points. These paths are often congested, and accidents, natural disasters, or business disputes can cause further transmission delays.

Internet peering

Peering is a voluntary interconnection of administratively separate Internet networks for the purpose of exchanging traffic between the customers of each network. The pure definition of peering is settlement-free or "sender keeps all" — this means that neither party pays the other for the exchanged traffic and instead, each derives revenue from its own customers.

One approach to tackling an underperforming web application is to expand capacity by adding more servers, more bandwidth, and perhaps even more data centers for offloading. However, this approach is not cost-effective and does not adequately address the unpredictable performance of the Internet. In addition, building out more infrastructure means that you may be overprovisioned when traffic is low and may still run into bottlenecks when traffic is very high.

This article describes a best practice that uses Akamai's EdgePlatform, an extensive network of more than 70,000 servers deployed in 70 countries and over 1,000 networks. The platform monitors the Internet in real time, gathering information about traffic, congestion, and trouble spots, so that it can eliminate long routes, avoid those trouble spots, and deliver your web content quickly and reliably.

Best practice: Deliver dynamic web content with Akamai Web Application Accelerator

To help solve the congestion and vulnerability problems on the Internet, Akamai developed the Akamai EdgePlatform, which offers products designed for various scenarios to mitigate the shortcomings of the Internet in delivering web sites, content, and applications. In particular, we use the Akamai Web Application Accelerator, which is designed to accelerate the delivery of dynamic web content by:

  • Avoiding Internet traffic congestion: The direct route between the origin server and the Akamai edge servers may be congested because of Internet traffic. SureRoute for Performance overcomes this problem by identifying better alternate routes.
  • Optimizing TCP configuration: Akamai is able to optimize TCP connection windows, tune TCP timeouts, and make full use of persistent connections to maximize the throughput between Akamai edge servers and the origin server.
  • Caching content: Most static content such as images, videos, and zip files can be cached. By caching these objects on the edge server closer to the end users, Akamai can reduce the number of round trips to retrieve static content from the origin server.
  • Pre-fetching content: Akamai provides intelligent pre-fetching so that it can deliver all embedded content in an HTML page to the end user with the fastest response time possible. When the end user's browser requests the embedded content, it is already waiting for the user, in memory, at the nearby edge server because Akamai has requested the content slightly in advance of the actual browser request.
  • Accelerating time to transmit content: Akamai can compress the content on the edge servers using gzip before sending it to the client. This decreases the time it will take to transmit the content. Objects larger than 10KB can benefit from Last Mile Acceleration.
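The compression step in the last bullet can be illustrated with Python's standard gzip module. This is a minimal sketch, not Akamai's implementation: the 10 KB threshold comes from the text above, while the function name, the Accept-Encoding check, and the sample payload are assumptions made for illustration.

```python
import gzip
from typing import Tuple

# Threshold taken from the article text (10 KB). How Akamai applies it
# internally is not described, so this check is an assumption.
MIN_SIZE_BYTES = 10 * 1024


def maybe_compress(body: bytes, accept_encoding: str) -> Tuple[bytes, bool]:
    """Return (payload, was_compressed) for a response body."""
    client_accepts_gzip = "gzip" in accept_encoding.lower()
    if client_accepts_gzip and len(body) > MIN_SIZE_BYTES:
        return gzip.compress(body), True
    return body, False


# Hypothetical, highly repetitive JavaScript payload of about 33 KB.
script = b"function noop() { return null; }\n" * 1000
payload, compressed = maybe_compress(script, "gzip, deflate")
print(len(script), len(payload), compressed)
```

Text-based content such as JavaScript, CSS, and HTML usually compresses well, which is why the time saved on the last hop to the browser can be significant.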

Designing for the Web Application Accelerator consists of two major tasks:

  1. Configure your system to recognize the edge servers.
  2. Design caching rules and policies that can be applied to your SaaS applications.

Task 1: Configure to recognize edge servers

The first task is to get your system configured so that it recognizes the Akamai edge servers. Figure 2 shows the overall architecture.

Figure 2. Configure your system to recognize edge servers

When a user makes a request for an object, the request is first directed to the Akamai edge servers. If the content is available from the edge servers, it is returned to the user from the cache. If the content is not available, Akamai will go back to the origin server (i.e. our data center) and request the content, then cache it in the Akamai edge server network and return the content to the user.
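The request flow just described can be sketched as a toy model. The class and function names below are hypothetical, and a real edge network also handles expiration, many servers, and routing; the sketch only shows the "serve from cache, otherwise go to the origin and remember the result" logic.

```python
from typing import Callable, Dict, List, Tuple


class EdgeCache:
    """Toy model of an edge server: serve from cache, else fetch from origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, bytes] = {}

    def get(self, url: str) -> Tuple[bytes, str]:
        """Return the content and a label saying where it came from."""
        if url in self._store:
            return self._store[url], "TCP_HIT"
        body = self._fetch_from_origin(url)  # round trip to the data center
        self._store[url] = body              # keep a copy for the next user
        return body, "TCP_MISS"


origin_requests: List[str] = []


def data_center(url: str) -> bytes:
    """Hypothetical origin server that records every request it receives."""
    origin_requests.append(url)
    return b"content of " + url.encode("ascii")


edge = EdgeCache(data_center)
first = edge.get("/js/test.js")
second = edge.get("/js/test.js")
print(first[1], second[1], len(origin_requests))
```

Only the first request reaches the origin; the second is answered entirely by the edge.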

To implement this task, a subject matter expert who understands your system topology can work with the Akamai specialist to set up the system. The expert needs to know details such as the IP address and host name of the server that Akamai will talk to, the URLs of your applications, whether they are secure, and the domain names involved.


Task 2: Design caching rules

The second task has to do with designing the caching rules and policies that can be applied to your SaaS applications. To implement this task, you need to understand the content of the website: the directory structure, the type of content, and how the content is generated. Several caching topics are covered in this section.

Caching rules based on file extensions and paths

The simplest caching rules are based on standard file extensions. By default, files with these extensions are cached automatically:

aif aiff au avi bin bmp cab carb cct cdf class css doc dcr dtd
exe flv gcf gff gif grv hdml hqx ico ini jpeg jpg js mov mp3 nc pct
pdf png ppc pws swa swf txt vbs w32 wav wbmp wml wmlc wmls
wmlsc xsd zip

The assumption is that these text, image, stylesheet, and audio files are static and can therefore be cached, and that keeping them in the edge server caches improves performance. The default caching time is one day. In particular, for our rich Internet applications where JavaScript plays a significant role, Akamai can cache a large number of objects for us and significantly accelerate our overall performance.

Based on the requirements of your application, you can easily add more file extensions or remove them from this list. For example, certain HTML or XML files on a particular file path (like /doc/*.html) may also be static and therefore can be cached as well.
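To make the idea concrete, here is a minimal sketch of how extension-based and path-based rules could be evaluated. It is not Akamai's configuration syntax, which is set up through Akamai's own tooling. The extension set is a small subset of the default list above, the /doc/*.html pattern is the example from the text, and the function name is hypothetical.

```python
import fnmatch
import posixpath
from urllib.parse import urlsplit

# Small subset of the default extension list shown above.
CACHEABLE_EXTENSIONS = {"css", "gif", "jpg", "js", "png", "zip"}

# Extra path rules added for this application (example from the text).
# Note that fnmatch's "*" also matches "/" characters, so this pattern
# covers nested folders under /doc/ as well.
CACHEABLE_PATH_PATTERNS = ["/doc/*.html"]


def is_cacheable(url: str) -> bool:
    """Return True if the URL matches an extension rule or a path rule."""
    path = urlsplit(url).path  # ignore the query string and host
    extension = posixpath.splitext(path)[1].lstrip(".").lower()
    if extension in CACHEABLE_EXTENSIONS:
        return True
    return any(
        fnmatch.fnmatchcase(path, pattern)
        for pattern in CACHEABLE_PATH_PATTERNS
    )


for candidate in ("/app/test.js", "/doc/intro.html", "/index.html"):
    print(candidate, is_cacheable(candidate))
```

Keeping the rules in data rather than in code makes it easy to add or remove extensions as the application changes.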

You can also define caching rules for dynamic content based on request parameters or other application metadata. Typically the URL of a resource alone is not sufficient as the cache key. You need to supplement it with one or more cookies to make a composite cache key. The cookie can identify a particular tenant, or a particular user within a tenant. This allows the edge servers to maintain user-specific or tenant-specific copies of the content and serve the correct copy from the cache. This is a more advanced feature that needs to be carefully designed and tested.
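A composite cache key of this kind can be sketched as follows. The cookie name tenant_id, the key format, and the function name are assumptions chosen for illustration; the actual key composition is defined in the Akamai configuration.

```python
from http.cookies import SimpleCookie

# Hypothetical name of the cookie that identifies the tenant.
TENANT_COOKIE = "tenant_id"


def composite_cache_key(url: str, cookie_header: str) -> str:
    """Combine the URL with the tenant cookie so tenants get separate copies."""
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    if TENANT_COOKIE in cookies:
        tenant = cookies[TENANT_COOKIE].value
    else:
        tenant = "anonymous"
    # Cookies that are not part of the key (for example a session id) are
    # deliberately ignored so they do not fragment the cache.
    return f"{url}|{TENANT_COOKIE}={tenant}"


key_a = composite_cache_key("/reports/summary", "tenant_id=acme; session=abc123")
key_b = composite_cache_key("/reports/summary", "tenant_id=globex; session=xyz789")
print(key_a)
print(key_b)
```

Two tenants requesting the same URL now map to different cache entries, while two users of the same tenant share one. Choosing which cookies enter the key is the design decision that needs the careful testing mentioned above: too few and tenants see each other's data, too many and the cache hit rate drops.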

Tools and techniques to test the caching rules

To make sure the caching rules you defined are working as expected, use an HTTP debugging tool such as Fiddler as a transparent proxy so that you can see every request you make and validate each response to confirm that caching is working properly.

The general approach is to request a cacheable object and then use Fiddler's Session Inspector to inspect the headers of the request and its response. You also need to add a rule in Fiddler that adds Akamai-specific Pragma headers to the request, so that the edge server includes debug headers such as X-Check-Cacheable and X-Cache in the response.

Requesting a cacheable object the first time

As an example, one of our caching rules indicates that we will cache JavaScript. So we can make a request for a file, say test.js. Since this is the first time you make the request, you should expect to see the following attributes in the Response Headers panel:

X-Check-Cacheable: YES
X-Cache: TCP_MISS from a96-17-171-7 (AkamaiGHost/6.3.0-7086845)

The first attribute confirms that we are indeed caching JavaScript files with a .js extension. The second attribute indicates that although the object is cacheable, the test.js file was not found in the Akamai cache, so it was delivered from our data center.

Requesting a cacheable object the second time

Now make another request for the same file, test.js. Since this is the second time the same object is being requested, you should expect to see the following attribute in the Response Headers panel:

X-Cache: TCP_HIT from a96-17-171-7 (AkamaiGHost/6.3.0-7086845)

The value TCP_HIT indicates that the Akamai edge server returned the file from the cache and did not need to go back to our data center to fetch it again.
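The same check can be scripted instead of being done by hand in Fiddler. The sketch below is an assumption-laden illustration: the Pragma values are the ones commonly documented for Akamai debug headers and should be confirmed for your own configuration, the function names are hypothetical, and fetch_debug_headers is defined but not called because it needs a real Akamai-served URL.

```python
import re
import urllib.request
from typing import Dict, Tuple

# Pragma values commonly documented for Akamai debug headers. Treat them as
# an assumption and confirm them against your Akamai configuration.
AKAMAI_DEBUG_PRAGMA = "akamai-x-cache-on, akamai-x-check-cacheable"


def fetch_debug_headers(url: str) -> Dict[str, str]:
    """Request url with the debug Pragma header and return response headers."""
    request = urllib.request.Request(url, headers={"Pragma": AKAMAI_DEBUG_PRAGMA})
    with urllib.request.urlopen(request) as response:
        return dict(response.headers.items())


def parse_x_cache(value: str) -> Tuple[str, str]:
    """Split an X-Cache value into (status, edge server name)."""
    match = re.match(r"(\S+) from (\S+)", value)
    if match is None:
        raise ValueError(f"unrecognized X-Cache value: {value!r}")
    return match.group(1), match.group(2)


def served_from_cache(value: str) -> bool:
    """Simple heuristic: any status containing HIT counts as a cache hit."""
    return "HIT" in parse_x_cache(value)[0]


# The two header values shown in this section.
first = "TCP_MISS from a96-17-171-7 (AkamaiGHost/6.3.0-7086845)"
second = "TCP_HIT from a96-17-171-7 (AkamaiGHost/6.3.0-7086845)"
print(parse_x_cache(first), served_from_cache(first))
print(parse_x_cache(second), served_from_cache(second))
```

A script like this can request each cacheable URL twice and flag any object whose second response is still a miss.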

What is cache poisoning?

Edge server cache poisoning is an unintended situation in which the cache stores data that does not represent the resource being cached. Since a resource cached by the edge server is returned in the next request, it is very important to make sure the cached data represents the correct content of that resource.

Following is a scenario that can potentially "poison" the edge server cache with invalid data.

Let's assume the end user is not yet authenticated and makes a request for a protected resource, say foo.gif. Let's assume foo.gif is not yet in the cache. The Akamai edge server sends the request to our data center.

We have a WebSEAL server sitting in front of our back-end application servers. When a user requests a protected resource but has not yet authenticated (or their authentication session has expired), the initial request does not reach the back-end servers. Instead the WebSEAL server returns an HTML login page (so that the user can log in) with an HTTP response code 200 in place of the requested resource.

Akamai now receives the login page. Since the response code is 200, Akamai assumes the request was successful and saves the response into the cache. The Akamai cache is now poisoned with the wrong object. The next time a user asks for foo.gif, Akamai returns the login page from its cache.
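The scenario can be reproduced with a small simulation. Everything here is a simplified stand-in: the origin function imitates the WebSEAL behavior described above, NaiveEdge is a hypothetical cache that stores every 200 response, and the body contents are placeholders.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Response:
    status: int
    server: str
    body: bytes


LOGIN_PAGE = b"<html>Please log in</html>"
IMAGE_BYTES = b"GIF89a-placeholder"  # stands in for the real foo.gif


def origin(url: str, authenticated: bool) -> Response:
    """Imitates the data center: a login page with status 200 if not logged in."""
    if not authenticated:
        return Response(200, "WebSEAL/6.1.0.0", LOGIN_PAGE)
    return Response(200, "IBM WebSphere sMash/1.1.1", IMAGE_BYTES)


class NaiveEdge:
    """Caches every 200 response, which is what makes poisoning possible."""

    def __init__(self) -> None:
        self.cache: Dict[str, bytes] = {}

    def get(self, url: str, authenticated: bool) -> bytes:
        if url in self.cache:
            return self.cache[url]
        response = origin(url, authenticated)
        if response.status == 200:
            self.cache[url] = response.body
        return response.body


edge = NaiveEdge()
edge.get("/foo.gif", authenticated=False)            # login page gets cached
poisoned = edge.get("/foo.gif", authenticated=True)  # logged-in user gets it too
print(poisoned == LOGIN_PAGE)
```

Even a correctly authenticated user receives the login page, because the edge never asks the origin again until the entry expires or is purged.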

Best practices to prevent poisoning the Akamai cache

There are two best practices to keep from poisoning the cache:

  • Identify the servers properly in the response header.
  • Ensure your applications return the proper status code.

Identify the servers properly in the response header. The scenario can be avoided by configuring Akamai so that it does not cache anything served directly from a WebSEAL reverse proxy server that performs redirects.

The Akamai software can check for the type of server by looking at the Server property in the response header. For example, if the response comes from a WebSeal reverse proxy server, the value of the server header is Server: WebSEAL/6.1.0.0. If the response is coming from one of our back-end application servers, say WebSphere sMash, the value of the server header is Server: IBM WebSphere sMash/1.1.1.
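The decision can be expressed as a small predicate. This is a sketch of the logic only, not Akamai configuration: the two Server values are the ones quoted above, and the function and constant names are hypothetical.

```python
from typing import Mapping

# Responses from these servers must never be cached. The WebSEAL prefix
# matches the Server header value quoted in the text.
UNCACHEABLE_SERVER_PREFIXES = ("WebSEAL/",)


def should_cache(status: int, headers: Mapping[str, str]) -> bool:
    """Cache only successful responses that did not come from the proxy."""
    server = ""
    for name, value in headers.items():
        if name.lower() == "server":  # header names are case-insensitive
            server = value
    if server.startswith(UNCACHEABLE_SERVER_PREFIXES):
        return False
    return status == 200


print(should_cache(200, {"Server": "WebSEAL/6.1.0.0"}))
print(should_cache(200, {"Server": "IBM WebSphere sMash/1.1.1"}))
```

With this rule in place, the login page from the poisoning scenario is passed through to the user but never stored at the edge.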

Ensure your applications return the proper status code. Failing to send the proper status code from your applications may seem harmless, but it can turn out to be a real problem when you put an edge server in front of them. Make sure an HTTP 200 response is sent back if and only if the request was successful and the returned data is the expected data. The edge server will automatically cache any cacheable content if the response code is 200.

In general, it is good practice to return accurate HTTP status codes and to create application-specific error pages for your web solutions.
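On the application side, the rule "200 only for the real content" might look like the following minimal WSGI sketch. The resource table, the /login location, the cookie check, and the choice of a 302 redirect for unauthenticated requests are all assumptions for illustration; your application may use a different status code or authentication scheme.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical protected resources: path -> (content type, body).
RESOURCES: Dict[str, Tuple[str, bytes]] = {
    "/foo.gif": ("image/gif", b"GIF89a-placeholder"),
}


def is_authenticated(environ: dict) -> bool:
    """Stand-in check; a real application would validate the session."""
    return "session=" in environ.get("HTTP_COOKIE", "")


def app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """Return 200 only when the requested resource itself is in the body."""
    path = environ.get("PATH_INFO", "")
    if not is_authenticated(environ):
        # Redirect to the login page instead of returning it with a 200.
        start_response("302 Found", [
            ("Location", "/login"),
            ("Cache-Control", "no-store"),
        ])
        return [b""]
    if path not in RESOURCES:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found"]
    content_type, body = RESOURCES[path]
    start_response("200 OK", [("Content-Type", content_type)])
    return [body]
```

The application can be served locally with wsgiref.simple_server.make_server from the standard library to confirm the status codes before the edge server is enabled.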

Clearing the caches

Sometimes it is necessary to refresh the content cached in Akamai edge servers without waiting for the expiration time specified in the edge server configuration. For example, you may need to make a quick patch for an error in a JavaScript file or update an existing image. Or you may want to refresh your entire application with a newer version.

Akamai provides a flexible Content Control Utility that you can use to perform this task. You can simply go into the Akamai Edge Control portal and use the panel to specify the URLs that you want to refresh. Akamai will notify you by email when the content has been refreshed. You can also programmatically refresh the content using their SOAP APIs.


In conclusion

This article covered the lessons we learned as we integrated Akamai with our applications running in the IBM Cloud. Overall, we have seen a performance improvement of more than 50 percent with the Akamai caches in the picture.

Although this article focuses primarily on performance improvements through caching, Akamai provides additional capabilities that we leverage to enable failover and load distribution for our SaaS applications. This is a unique feature because cloud providers may not support specialized hardware for load balancing. The failover and load distribution integration improved the overall availability and performance of our SaaS applications. We will discuss this topic in future articles.


Acknowledgement

Our team enjoys working with the Akamai team. Not only are they very knowledgeable, they are fun to work with. The authors would like to thank Patrick Boucher from Akamai and Tom Mikalsen and Bhadri Madapusi from IBM for their assistance on this article.
