The root causes of web application performance issues can be divided into two categories:
- Your applications are performing poorly. This issue can typically be addressed by the application developers by tuning performance and making code corrections.
- Your applications are performing poorly because of Internet problems. Due to the unpredictable Internet infrastructure, this situation usually lies outside the control of any single company.
Whether your applications are performing poorly due to internal or external causes, slow response time affects users and this can lead to lower adoption rates and lower satisfaction with your applications.
Figure 1. The Internet is a complex, multipathed network of networks
As Figure 1 shows, the Internet is not a single network but more than 12,000 networks connected by peering points. The path from the data center to the user can traverse several networks and peering points. These paths are often congested, and accidents, natural disasters, or business disputes can cause transmission delays.
One approach to tackling an underperforming web application is to expand capacity by adding more servers, bandwidth, and perhaps even more data centers for offloading. However, this approach is not cost-effective and does not adequately address the unpredictable performance of the Internet. In addition, building out more infrastructure means that you may be over-provisioned when traffic is low and may still run into bottlenecks when traffic is very high.
This article describes a best practice that uses Akamai's EdgePlatform, an extensive network of more than 70,000 servers deployed in 70 countries and over 1,000 networks. The platform monitors the Internet in real time, gathering information about traffic, congestion, and trouble spots, and uses that information to avoid long routes and trouble spots so that your web content is delivered quickly and reliably.
To help solve the congestion and vulnerability problems on the Internet, Akamai developed the Akamai EdgePlatform, which offers products designed for various scenarios to mitigate the shortcomings of the Internet in delivering web sites, content, and applications. In particular, we use the Akamai Web Application Accelerator, which is designed to accelerate the delivery of dynamic web content by:
- Avoiding Internet traffic congestion: The direct route between the origin server and the Akamai edge servers may be congested because of Internet traffic. SureRoute for Performance overcomes this problem by identifying faster alternate routes.
- Optimizing TCP configuration: Akamai is able to optimize TCP connection windows, tune TCP timeouts, and maximize the use of persistent connections to maximize the throughput between Akamai edge servers and the origin server.
- Caching content: Most static content such as images, videos, and zip files can be cached. By caching these objects on the edge server closer to the end users, Akamai can reduce the number of round trips to retrieve static content from the origin server.
- Pre-fetching content: Akamai provides intelligent pre-fetching so that it can deliver all embedded content in an HTML page to the end user with the fastest response time possible. When the end user's browser requests the embedded content, it is already waiting for the user, in memory, at the nearby edge server because Akamai has requested the content slightly in advance of the actual browser request.
- Accelerating time to transmit content: Akamai can compress the content on the edge servers using gzip before sending it to the client. This decreases the time it will take to transmit the content. Objects larger than 10KB can benefit from Last Mile Acceleration.
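To give a feel for the last item, the following minimal sketch uses Python's standard gzip module to compress a text payload larger than 10 KB and report how much smaller it becomes. It only illustrates gzip itself; the payload and the `compression_ratio` helper are hypothetical and say nothing about how Akamai implements compression on its edge servers.

```python
import gzip


def compression_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size after gzip compression."""
    compressed = gzip.compress(payload)
    return len(compressed) / len(payload)


# Hypothetical HTML fragment, repeated so that the payload exceeds 10 KB.
html = b"<tr><td class='cell'>tenant report row</td></tr>\n" * 400
ratio = compression_ratio(html)
print(f"original={len(html)} bytes, compressed/original={ratio:.3f}")
```

Repetitive markup like this compresses very well; already-compressed formats such as JPEG images gain little from gzip.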
Designing for the Web Application Accelerator consists of two major tasks:
- Configure your system to recognize the edge servers.
- Design caching rules and policies that can be applied to your SaaS applications.
The first task is to get your system configured so that it recognizes the Akamai edge servers. Figure 2 shows the overall architecture.
Figure 2. Configure your system to recognize edge servers
When a user makes a request for an object, the request is first directed to the Akamai edge servers. If the content is available from the edge servers, it is returned to the user from the cache. If the content is not available, Akamai will go back to the origin server (i.e. our data center) and request the content, then cache it in the Akamai edge server network and return the content to the user.
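The request flow described above can be sketched as a simple lookup-then-fetch cache. This is a toy model under stated assumptions, not Akamai code: the `EdgeCache` class, the `origin` function, and the URL are hypothetical, and the status strings simply mirror the X-Cache values that appear in this article.

```python
from typing import Callable, Dict, List, Tuple


class EdgeCache:
    """Toy stand-in for an edge server cache (hypothetical, not an Akamai API)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, bytes] = {}

    def get(self, url: str) -> Tuple[bytes, str]:
        # Serve from the cache when the object is already stored at the edge.
        if url in self._store:
            return self._store[url], "TCP_HIT"
        # Otherwise go back to the origin, cache the result, then return it.
        body = self._fetch_from_origin(url)
        self._store[url] = body
        return body, "TCP_MISS"


origin_requests: List[str] = []


def origin(url: str) -> bytes:
    """Hypothetical origin server that records every request it receives."""
    origin_requests.append(url)
    return b"console.log('hello');"


edge = EdgeCache(origin)
first = edge.get("/static/test.js")
second = edge.get("/static/test.js")
print(first[1], second[1], len(origin_requests))
```

The second request is answered from the cache, so the origin sees only one request for the object.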
To implement this task, a subject matter expert who understands your system topology can work with the Akamai specialist to set up the system. The details needed include the IP address and host name of the server that Akamai will talk to, the URLs for your applications, whether they are secure, and the domain names.
The second task has to do with designing the caching rules and policies that can be applied to your SaaS applications. To implement this task, you need to understand the content of the website: the directory structure, the type of content, and how the content is generated. Several caching topics are covered in this section.
The simplest caching rules are based on standard file extensions. By default, files with these extensions are cached automatically:
aif aiff au avi bin bmp cab carb cct cdf class css doc dcr dtd exe flv gcf gff gif grv hdml hqx ico ini jpeg jpg js mov mp3 nc pct pdf png ppc pws swa swf txt vbs w32 wav wbmp wml wmlc wmls wmlsc xsd zip
Based on the requirements of your application, you can easily add more file extensions or remove them from this list. For example, certain HTML or XML files on a particular file path (like /doc/*.html) may also be static and therefore can be cached as well.
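As a rough illustration of how extension-based rules combine with an added path rule, the sketch below checks a request path against a subset of the default extensions plus the /doc/*.html pattern from the example above. The names `DEFAULT_EXTENSIONS`, `EXTRA_PATTERNS`, and `is_cacheable` are hypothetical, and the real rules are defined in the Akamai configuration rather than in application code.

```python
import fnmatch
import posixpath

# Subset of the default extension list, for illustration only.
DEFAULT_EXTENSIONS = {"css", "gif", "jpeg", "jpg", "js", "pdf", "png", "zip"}

# Additional path patterns that the application owner has declared static.
EXTRA_PATTERNS = ["/doc/*.html"]


def is_cacheable(path: str) -> bool:
    """Return True if the path (assumed to carry no query string) matches a rule."""
    extension = posixpath.splitext(path)[1].lstrip(".").lower()
    if extension in DEFAULT_EXTENSIONS:
        return True
    # Note: fnmatch's "*" also matches "/", so nested paths under /doc/ match too.
    return any(fnmatch.fnmatchcase(path, pattern) for pattern in EXTRA_PATTERNS)


for candidate in ["/static/app.js", "/doc/guide.html", "/account/home.html"]:
    print(candidate, is_cacheable(candidate))
```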
You can also define rules for caching dynamic content based on request parameters or other application metadata. Typically, the URL of a resource alone may not be sufficient as the cache key; you need to supplement it with one or more cookies to form a composite cache key. A cookie can identify a particular tenant or a user within a tenant. This allows the edge servers to maintain user-specific or tenant-specific copies of the content and serve the correct copy from the cache. This is a more advanced feature that needs to be carefully designed and tested.
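A composite cache key of this kind can be sketched as follows. The cookie name `tenant_id`, the `cache_key` function, and the key layout are assumptions made for illustration; the actual key composition is set up in the Akamai configuration.

```python
from http.cookies import SimpleCookie
from typing import Sequence


def cache_key(url: str, cookie_header: str,
              key_cookies: Sequence[str] = ("tenant_id",)) -> str:
    """Combine the URL with selected cookie values into one cache key."""
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    parts = [url]
    for name in key_cookies:
        morsel = cookies.get(name)
        # A missing cookie contributes an empty value so the key stays well-formed.
        value = morsel.value if morsel is not None else ""
        parts.append(f"{name}={value}")
    return "|".join(parts)


print(cache_key("/reports/summary", "tenant_id=acme; session=abc"))
print(cache_key("/reports/summary", "tenant_id=globex; session=xyz"))
```

Only the cookies named in `key_cookies` affect the key, so two users of the same tenant with different session cookies share one cached copy, while different tenants never do.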
To make sure the caching rules you defined are working as expected, use an HTTP debugging tool such as Fiddler as a transparent proxy so that you can see every request you make and inspect each response to check whether caching is working properly.
The general approach is to request a cacheable object and then use Fiddler's Session Inspector to inspect the headers. You also need to add a rule in Fiddler so that it adds the Akamai-specific Pragma headers to the request; these cause the edge server to return diagnostic headers (e.g. X-Check-Cacheable, X-Cache) in the response.
Requesting a cacheable object the first time
X-Check-Cacheable: YES
X-Cache: TCP_MISS from a96-17-171-7 (AkamaiGHost/6.3.0-7086845)
Requesting a cacheable object the second time
Now make another request to the same file test.js. Since this is the second time the same object is being requested, you should expect to see the following attributes in the Response Headers panel:
X-Cache: TCP_HIT from a96-17-171-7 (AkamaiGHost/6.3.0-7086845)
The value TCP_HIT indicates that the Akamai edge server returned the file from the cache and did not need to go back to our data center to fetch it again.
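If you want to automate this check instead of reading the headers by eye, a small helper can split the X-Cache value into its parts. The sketch below is an assumption-laden convenience, not an Akamai tool: `parse_x_cache` is a hypothetical name, and it recognizes only the two status values shown in this article.

```python
from typing import Dict, Union


def parse_x_cache(value: str) -> Dict[str, Union[str, bool]]:
    """Split an X-Cache header value into its status and edge server parts."""
    status, _, server = value.partition(" from ")
    status = status.strip()
    return {
        "status": status,
        # Only TCP_HIT is treated as a cache hit; other status values that an
        # edge server may report are outside what this article covers.
        "hit": status == "TCP_HIT",
        "server": server.strip(),
    }


first = parse_x_cache("TCP_MISS from a96-17-171-7 (AkamaiGHost/6.3.0-7086845)")
second = parse_x_cache("TCP_HIT from a96-17-171-7 (AkamaiGHost/6.3.0-7086845)")
print(first["status"], first["hit"])
print(second["status"], second["hit"])
```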
Edge server cache poisoning is an unintended situation in which the cache stores data that does not represent the resource being cached. Because a resource cached by the edge server is returned on subsequent requests, it is very important to make sure the cached data represents the correct content of that resource.
Following is a scenario that can potentially "poison" the edge server cache with invalid data.
Let's assume the end user is not yet authenticated and makes a request for a protected resource, say foo.gif. Let's assume foo.gif is not yet in the cache. The Akamai edge server sends the request to our data center.
We have a WebSEAL server sitting in front of our back-end application servers. When a user requests a protected resource but has not yet authenticated (or their authentication session has expired), the initial request does not reach the back-end servers. Instead, the WebSEAL server returns an HTML login page (so that the user can log in) with an HTTP response code 200 in place of the requested resource.
Akamai now receives the login page. Since the response code is 200, Akamai assumes the request was successful and saves the response into the cache. The Akamai cache is now poisoned with a wrong object. Next time a user asks for foo.gif, the response returns the login page from the Akamai cache.
Best practices to prevent poisoning the Akamai cache
There are two best practices to keep from poisoning the cache:
- Identify the servers properly in the response header.
- Ensure your applications return the proper status code.
Identify the servers properly in the response header. The scenario can be avoided by defining an Akamai configuration that does not cache anything served directly from a WebSEAL reverse proxy server that performs redirects.
The Akamai software can check for the type of server by looking at the Server property in the response header. For example, if the response comes from a WebSEAL reverse proxy server, the value of the server header is Server: WebSEAL/18.104.22.168. If the response is coming from one of our back-end application servers, say WebSphere sMash, the value of the server header is Server: IBM WebSphere sMash/1.1.1.
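The decision this rule encodes can be sketched in a few lines. This is only an illustration of the logic; the function name `should_cache` is hypothetical, the headers are assumed to be available as a dictionary keyed by "Server", and the real rule is expressed in the Akamai configuration rather than in Python.

```python
from typing import Mapping


def should_cache(status_code: int, headers: Mapping[str, str]) -> bool:
    """Decide whether a response may be stored in the edge cache."""
    server = headers.get("Server", "")
    # Responses produced by the WebSEAL reverse proxy itself (for example a
    # login page) are never cached, whatever their status code.
    if server.lower().startswith("webseal"):
        return False
    # Responses from the back-end servers are cached only when successful.
    return status_code == 200


print(should_cache(200, {"Server": "WebSEAL/18.104.22.168"}))
print(should_cache(200, {"Server": "IBM WebSphere sMash/1.1.1"}))
```

With this check in place, the login page from the poisoning scenario is passed through to the user but never stored under the URL of foo.gif.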
Ensure your applications return the proper status code. Failing to send the proper status code from your applications may seem harmless, but it can turn out to be a real problem when you put an edge server in front of them. Make sure an HTTP 200 response is sent back if and only if the request was successful and the returned data is what was requested. The edge server will automatically cache any cacheable content if the response code is 200.
In general, it is good practice to return accurate HTTP status codes and to create application-specific error pages for your web solutions.
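As a minimal sketch of this practice, the handler below returns 200 only when it actually serves the requested resource and uses distinct status codes, each with its own error page, otherwise. The function `handle_request`, the resource table, and the choice of 401 for unauthenticated requests are assumptions for illustration; a redirect to a login page would equally avoid sending the login page with a 200.

```python
from http import HTTPStatus
from typing import Dict, Tuple

# Hypothetical table of protected resources served by the application.
RESOURCES: Dict[str, bytes] = {"/images/foo.gif": b"GIF89a..."}


def handle_request(path: str, authenticated: bool) -> Tuple[HTTPStatus, bytes]:
    """Return a status code that matches what the response body really is."""
    if not authenticated:
        # The login page is not the requested resource, so it must not be a 200.
        return HTTPStatus.UNAUTHORIZED, b"<html>Please log in.</html>"
    if path not in RESOURCES:
        return HTTPStatus.NOT_FOUND, b"<html>Resource not found.</html>"
    return HTTPStatus.OK, RESOURCES[path]


status, body = handle_request("/images/foo.gif", authenticated=False)
print(int(status), status.phrase)
```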
Akamai provides a flexible Content Control Utility that you can use to refresh cached content. In the Akamai Edge Control portal, use the panel to specify the URLs that you want to refresh. Akamai will notify you by email when the content has been refreshed. You can also refresh the content programmatically using Akamai's SOAP APIs.
This article covered the lessons we learned as we integrated Akamai with our applications running in the IBM Cloud. Overall, we have seen performance improve by more than 50 percent with the Akamai caches in place.
Although this article focuses primarily on performance improvements through caching, Akamai provides additional capabilities that we leverage to enable failover and load distribution for our SaaS applications. This is a unique feature because cloud providers may not support specialized hardware for load balancing. The failover and load distribution integration improved the overall availability and performance of our SaaS applications. We will discuss this topic in future articles.
Our team enjoys working with the Akamai team. Not only are they very knowledgeable, they are fun to work with. The authors would like to thank Patrick Boucher from Akamai and Tom Mikalsen and Bhadri Madapusi from IBM for their assistance on this article.
Check out the resources mentioned in this article:
- Akamai EdgePlatform
- Akamai Web Application Accelerator
- Fiddler - an HTTP debugging tool
- HTTP Status Code
- Akamai Content Control Interfaces
- IBM Impact 2010: Day 3 - Tom Leighton, Akamai Technologies (part 1 of 2)
- IBM Impact 2010: Day 3 - Tom Leighton, Akamai Technologies (part 2 of 2)
- Learn more about WebSphere Application Acceleration products.
Christina is a Distinguished Engineer in WebSphere, experienced in such emerging technologies as cloud and mobile computing. Her current focus is on developing advanced technologies that support the delivery of online cloud services across the BPM, connectivity, and ILOG portfolio.
Valentina Birsan is a senior developer in WebSphere, currently focused on cloud projects. Previously Valentina was a technical lead on Rational Application Developer. Valentina was one of the initial members of the Eclipse TPTP open source project and served as the chair of the TPTP Architecture Group. She was the lead architect for the Cosmos Service Modeling Eclipse open source project and member of the SML open standard.