In each column, The WebSphere Contrarian answers questions, provides guidance, and otherwise discusses fundamental topics related to the use of WebSphere products, often dispensing field-proven advice that contradicts prevailing wisdom.
I’ve spent a great deal of time over the past several weeks delivering internal training to our technical sales specialists on the just-released IBM® WebSphere® Application Server V7. A portion of that training had been allotted to back-to-basics material to provide background for those who are new to WebSphere Application Server, as well as a refresher for those who are not so new. As it turned out, a good many of the “not so new” specialists found value in the lab and lecture I delivered -- many more than I had anticipated. It’s true: you can never be so experienced that you don’t need to be reminded of the basics.
In that same vein, I decided to cover a basic aspect of WebSphere Application Server in this column: session failover. More precisely: session failover and my recommendations for failover.
Hypertext Transfer Protocol (HTTP), which is the basis for Web applications, is a stateless protocol. State is generally managed by setting cookies, which are tracked by the browser across requests, providing a mechanism for associating a series of unique requests from a given user into a sort of “conversation” or flow. Unfortunately, managing state using cookies is not easy. Fortunately the Java Servlet API provides a solution to this shortcoming with the HttpSession interface for session tracking and state management across multiple request invocations via a session object, which enables servlets to associate a given user to a series of requests.
Unfortunately, session specific data can be stored into session objects. I say “unfortunately” because HttpSession often ends up being used as an application cache, which it really wasn’t intended to be. While there’s nothing wrong with using the session to associate requests from a user, the use of a session as an application cache tends to lend itself to a number of application practices that can stress many aspects of the application environment. For example, one customer had a 25 MB (no misprint) session object that they were persisting to a database, the result was:
- Poor database performance, as the database performed hundreds of updates to the (very) large session objects in database tables.
- Degraded network performance, as the updates were pushed from the application servers to the database server.
- Frequent garbage collections for the application servers, as heap that could otherwise be used for application processing was consumed by session objects associated with each user; for example, 200 users X 25 MB = 500 MB (or 0.5 GB) in session objects! Even a 1GB or 2GB heap is “small” for storing this much application state data.
This inappropriate use of HttpSession leads, in turn, to a concern over loss of what’s in the session object, since it is time consuming to recreate the information stored there.
As you might have guessed, my recommendation is to not use HttpSession as an application cache and instead use HttpSession for associating multiple user requests. It is best to keep your HttpSessions as small as possible. However, in some applications, this advice wasn’t followed and therefore you might need a "quick" way to change the application to reduce what is persisted with minimal impact. In this case, I’d recommend using the techniques outlined in an article called Improving HttpSession Performance with Smart Serialization to minimize what’s actually stored in the session object, which will minimize what the application server has to deal with when you’ve chosen to employ session distribution.
Session distribution: when and how
The Servlet API makes a provision for distributing session objects. Stated another way, you can employ copies of the session objects so that the failure of an application server instance does not result in the loss of application state for a given user -- or more precisely, users, since multiple users are likely associated with a given application server instance using the HttpSession affinity mechanism that WebSphere Application Server provides.
My preferred alternative is to rely not on session distribution, but instead to rely simply on HTTP server plug-in affinity to “pin” a user to an application server, although this does mean that stopping an application server JVM will result in the loss of the HttpSession object. The benefit of doing so is that there is no need to distribute the session objects to provide for HttpSession object failover when an application server fails or is stopped. The obvious down side is that a user will lose any application state and will need to log back in and recreate it, and this may or may not be acceptable for your application or business requirements. I'll mention that I’ve worked with a number of customers that in fact agree with this view and make this their standard practice.
That said, perhaps you really, really can’t tolerate someone having to log back in and recreate the HttpSession object from scratch. Maybe this is because you have, unfortunately, chosen to use HttpSession as an application cache, or maybe it’s for some other reason. In any event, WebSphere Application Server Network Deployment (hereafter referred to as Network Deployment) provides two mechanisms for session distribution: Distributed Replication Service (DRS) and database persistence. My recommendation for a Network Deployment session distribution mechanism remains the same as stated in an earlier column, which is to employ database persistence.
Another alternative for session distribution is WebSphere eXtreme Scale, formerly known as the ObjectGrid component of WebSphere Extended Deployment. WebSphere eXtreme Scale provides a memory-based replication mechanism that is independent of the application server runtime. As a result, WebSphere eXtreme Scale can be used to share application state (or cache) between different applications running on different application server runtimes. Specific to HttpSession, WebSphere eXtreme Scale provides a servlet filter that overrides the HttpSession implementation for any Java Platform, Enterprise Edition EAR file. This filter is easily installed into the EAR via a script provided in WebSphere eXtreme Scale.
I’ll add that if you are in fact using HttpSession as an application cache, WebSphere eXtreme Scale is likely a better alternative than what’s provided in Network Deployment for session distribution. This is because WebSphere eXtreme Scale was designed as a distributed application cache, whereas HttpSession was never intended for this use, although when HttpSession is used efficiently, as described above, the distributed session options in Network Deployment are efficient and scalable.
Related to the decision to use session distribution is another decision to share a session between Network Deployment cells. My recommendation, as I've discussed before is not to do so, since it adds complexity and interdependence between your Network Deployment cells and, by extension, your data centers, which is what you want to avoid if you suffer a catastrophic outage. While a number of customers seem to accept this, they still persist in sharing session between cells of data centers, because they’re unable to properly associate a user with an application server in a given data center (or cell).
Fortunately, there is a solution for this that you can apply at the network layer, provided you have the correct infrastructure in place and the cooperation of all the affected operational entities in your enterprise. If you’ve created multiple independent cells, as shown in Figure 1, you can configure your network switch (or global site selector, as it’s often known as) to correctly maintain affinity for a given user to a specific cell.
Figure 1. Infrastructure with multiple independent cells
How do you achieve this configuration, you ask? Well, most devices of this type rely on IP layer 2 or IP layer 3 mechanisms to maintain affinity from a client to a server; some examples of these mechanisms are an IP address hash, DNS resolution, or something of that nature. These types of affinity mechanisms are fine if all incoming traffic is routed consistently on every request. But if the traffic is routed via proxy servers or other intermediary devices that end up masking the “real” client IP address, these types of affinity mechanisms are not sufficient for ensuring that client requests are consistently routed to the same servers (or cell).
Unfortunately, such masking or indirection occurs all too often, and sometimes it happens to a given user across multiple requests; one time he’s routed via Proxy A and the next time via Proxy B. The result is that the network switch treats these as different user requests (even though that’s not the case), and so the user loses his application state if you have not distributed session between the cells (or data centers).
The solution relies on employing content-based routing at the network switch, which is an IP layer 7 technology. The key to this method is to extract the Server CloneID (below) for each server from the plugin-cfg.xml file that is used by the HTTP server plug-in, and then use that ID to construct affinity rules for each cell. The network switch can then examine the incoming HTTP request header and determine to which cell it should route a request. In this case below, a rule would be constructed to examine the HTTP request header for “13j9n75hm”.
<Server CloneID="13j9n75hm" ConnectTimeout="0" ExtendedHandshake="false" LoadBalanceWeight="2" MaxConnections="-1"
Doing so for every CloneID in the plugin-cfg.xml file in each cell and associating the CloneID with a cell via a routing rule eliminates the need to share session between cells -- at least for this reason.
Of course, some would argue that it’s “too expensive” in terms of network switch overhead (and latency) to implement content-based routing. I won’t disagree that it’s expensive, but it you don’t deal with this at the front end for incoming requests, you end up “paying more” by adding network capacity at the back end to replicate HttpSession data, and you will create a cross data center dependency that could hurt reliability in other ways.
Using the affinity approach discussed here only leaves you vulnerable to a loss of session if you experience an unplanned outage. This is because with a planned outage you can "drain" one cell using the techniques discussed in this article on maintaining continuous availability so that existing users remain in the initial cell while new users are directed to the newly active cell. From a practical perspective, if you are experiencing a large number of unplanned outages, then you have some fundamental issues that need to be addressed; primarily, what's the root cause of the unplanned outages? At least that’s what I would investigate first. Because of the additional complexity of sharing session between cells, and as covered in reliability engineering, this situation will likely to lead to additional failures, which will lead to outages, which is what you were trying to avoid in the first place!
Thanks to Keys Botzum and Alex Polozoff for their comments and suggestions.
-
Hypertext Transfer Protocol
-
Improving HttpSession Performance with Smart Serialization
-
Everything you always wanted to know about WebSphere Application Server but were afraid to ask
-
Everything you always wanted to know about WebSphere Application Server but were afraid to ask, Part 5
-
Maintain continuous availability while updating WebSphere Application Server enterprise applications
Tom Alcott is consulting IT specialist for IBM in the United States. He has been a member of the Worldwide WebSphere Technical Sales Support team since its inception in 1998. In this role, he spends most of his time trying to stay one page ahead of customers in the manual. Before he started working with WebSphere, he was a systems engineer for IBM's Transarc Lab supporting TXSeries. His background includes over 20 years of application design and development on both mainframe-based and distributed systems. He has written and presented extensively on a number of WebSphere run time issues.




