Best Practice: Improving HttpSession Performance with Smart Serialization

Kyle Brown (brownkyl@us.ibm.com), Senior Technical Staff Member, IBM
Kyle Brown is Senior Technical Staff Member for the J2EE Architecture and Web Services team. He works with some of IBM's largest WebSphere customers to help them develop best practices to solve complex business problems.
Keys Botzum (keys@us.ibm.com), Senior Consulting I/T Specialist, IBM
Keys Botzum is a senior consultant with IBM Software Services for WebSphere. He has over 10 years of experience in large-scale distributed system design and additionally specializes in security. Mr. Botzum has worked with a variety of distributed technologies, including Sun RPC, DCE, CORBA, AFS, and DFS. Recently, he has been focusing on J2EE and related technologies. He holds a Master's degree in Computer Science from Stanford University and a B.S. in Applied Mathematics/Computer Science from Carnegie Mellon University.

Summary:  One of the most persistent problems in developing Web applications in Java is how to best handle session state. This best practice discusses using transient variables to enable WebSphere® to selectively serialize objects, thus improving performance.

Date:  24 Nov 2003
Level:  Intermediate


Introduction

One of the most persistent problems in developing Web applications in Java is how to best handle session state. You must strike a balance. Storing too much session state leads to performance problems, because large session objects are expensive to store. Storing too little session state adversely affects the functionality of an application, or leads to performance problems when developers must constantly re-create the same set of objects. This is not a new problem, and the performance problems produced by storing sessions that are too large have been well documented (for example, see WebSphere Application Server Development Best Practices for Performance and Scalability). Many applications have tried to work around this by using various techniques to make the session smaller. Unfortunately, these approaches lead to solutions that are complex and force each application developer to remember what is in the HttpSession "state" and what is in some other "session." This best practice discusses using transient variables to enable WebSphere® to selectively serialize objects, thus improving performance.

This best practice applies to the following product, versions, and platforms:

WebSphere Application Server Network Deployment versions 4.x and 5.x (AIX, Windows NT, Linux, Solaris)


Recommendation

Application servers go to great lengths to ensure that session management is efficient, but there are limits to what they can do without application domain knowledge. Recall that there are three basic approaches to keeping session state valid across multiple clustered application servers:

  • Store session objects in a shared database.
  • Synchronize in-memory sessions using a notification mechanism like JMS.
  • Use session affinity to pin users to a particular application server instance.

The problem with the first two approaches is that it takes a long time to serialize a large session object. Whether the target of the serialized object is a database or a network connection for transmission to the other cluster members, serialization is a time-stealer. As a result, most application servers, like WebSphere, combine one or both of the first two approaches with the last approach to reduce the amount of serialization that must occur. For example, the server serializes only updates, and avoids reading serialized objects back from a database whenever it can.

To understand how this works, you must consider how session affinity works. This feature allows for most requests from a particular user to return to the same application server. By doing this, the persistent HTTP session is kept in memory in a cache. Session affinity allows the number of accesses to the session persistence database to be reduced by redirecting HTTP requests back to the server that created the HttpSession, rather than allowing them to be sprayed across the entire cluster of available servers. This improves performance by limiting the database access to writes only, except in the case of failure. Whenever a session changes, the database is updated, but the database is only read when the "pinned" application server fails and another server then takes over the job of handling the sessions previously managed by that server. Session affinity was introduced in WebSphere Application Server V3.5, and was introduced at about the same time into most other commercial application servers too.

Even with mechanisms as efficient as those above, there is still the problem of writing the changing attributes to the database. For applications with large amounts of session state, this results in poor performance. Thus, we have always recommended that the session be kept to a small size, typically under 2K. Unfortunately, many applications wish to track additional information about a user. After performing detailed application analysis, what we typically find is that the additional user information is not state, but rather a cache of information fetched from some other datastore (for example, a mainframe database). This leads to the discussion of breaking up data into two types: session state and cached information. The cached information is kept in the application server's memory, in a structure such as a hash table. A key to the cached data is kept in the session.
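
The split between session state and cached information can be sketched roughly as follows. This is only an illustration under stated assumptions, not WebSphere API: the class CustomerProfileCache, the method loadFromBackend, and the "customerId" attribute name are hypothetical, a plain Map stands in for the HttpSession attributes, and the code assumes Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class SessionKeyCacheSketch {

    /** Hypothetical per-server cache; every entry can be rebuilt from the back end. */
    static final class CustomerProfileCache {
        private final Map<String, String> profiles = new ConcurrentHashMap<>();
        private final AtomicInteger backendCalls = new AtomicInteger();

        /** Returns the cached profile, loading it on a miss (for example, after failover). */
        String profileFor(String customerId) {
            return profiles.computeIfAbsent(customerId, this::loadFromBackend);
        }

        /** Stand-in for an expensive lookup against a mainframe or database. */
        private String loadFromBackend(String customerId) {
            backendCalls.incrementAndGet();
            return "profile-of-" + customerId;
        }

        int backendCalls() {
            return backendCalls.get();
        }
    }

    public static void main(String[] args) {
        // A plain map stands in for the HttpSession: only the small key is stored in it.
        Map<String, Object> session = new HashMap<>();
        session.put("customerId", "C-1001");

        CustomerProfileCache cache = new CustomerProfileCache();
        String key = (String) session.get("customerId");

        System.out.println(cache.profileFor(key)); // first access goes to the back end
        System.out.println(cache.profileFor(key)); // second access is served from memory
        System.out.println("back-end calls: " + cache.backendCalls());
    }
}
```

Only the key would ever be serialized with the session in this arrangement; the cached profile stays in the memory of whichever server handled the request.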

With this approach, since most requests return to the same application server, the in-memory cache is used and performance is excellent. But, by dividing the data into two types, we have created a situation where the application development team has to deal with significantly increased complexity. Teams that split the data this way commonly encounter the following problems:

  • Each piece of Web application code that needs to access user information has to know whether this information is in the HttpSession or in the application cache. Avoiding this means building a new abstraction to hide that fact.
  • The lifetime of this new cache can be hard to manage. The cached data's lifetime is normally tied to the lifetime of the HttpSession. Ideally, the cache should remove stale data when the HttpSession is destroyed. This requires the creation of session listener logic (for example, an HttpSessionListener). And, technically, there is no guarantee that the session termination event is delivered to the same application replica that contains the in-memory state.
  • Existing code that assumes an HttpSession has to be rewritten. Since performance problems are often found late in development, this can be difficult.

Many have struggled with this unfortunate tradeoff for some time. However, there is a third way. What we have found is that we can allow developers to "have their cake and eat it too" by allowing for large HttpSession objects, while avoiding the serialization overhead of these large objects. You can reduce the time to serialize a session by applying the following best practice: provide hints to the container as to what HttpSession data is important and must be preserved and what can be recreated. This is done by using the Java transient keyword.

First, ensure that all of the objects you store in the HttpSession implement the java.io.Serializable interface. This is a best practice for HttpSession development that you should follow in any case, because not implementing Serializable can lead to loss of data. It is also crucial to the next part of this best practice.

The next part is to classify the different fields within those objects that represent each of the pieces of data that you need to store across HTTP request/response pairs. Declare all of these fields as transient, except the few that correspond to key fields used to retrieve records from a database or through a CICS transaction. This means that when the Web container session manager persists the object, only those few non-transient fields are persisted. This reduces the time it takes to write the object out and also the space occupied in the database, if one is used. For this to work, all of the transient fields must be re-creatable by some other process, either by looking them up in an application database or by fetching them from a back-end system. An example is shown below:

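	import java.io.Serializable;

	// Address and EmployeeDAO are application classes that are not shown here.
	public class Employee implements Serializable {
		// Key fields: these are persisted with the session.
		private String employeeId;
		private String employeeName;

		// Not persisted with the session; re-fetched on demand after a failover.
		private transient Address employeeAddress;

		/**
		 * Return the cached employee address. If it hasn't been
		 * fetched, retrieve it from the DAO layer.
		 */
		public Address getEmployeeAddress() {
			if (employeeAddress == null) {
				EmployeeDAO dao = new EmployeeDAO();
				employeeAddress = dao.fetchAddressFor(getEmployeeId());
			}
			return employeeAddress;
		}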

		public String getEmployeeId() {
			return employeeId;
		}

		public String getEmployeeName() {
			return employeeName;
		}

		public void setEmployeeAddress(Address employeeAddress) {
			this.employeeAddress = employeeAddress;
		}

		public void setEmployeeId(String employeeId) {
			this.employeeId = employeeId;
		}

		public void setEmployeeName(String employeeName) {
			this.employeeName = employeeName;
		}	

	}
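
The effect of the transient keyword on the persisted form can be checked outside a container. The sketch below is an assumption-laden illustration rather than a measurement of WebSphere itself: the classes EagerRecord and LeanRecord and the 10,000-byte payload are hypothetical, and plain Java serialization to a byte array stands in for the session manager's write.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializedSizeSketch {

    /** Holds a key plus a bulky payload that is written out with the object. */
    static class EagerRecord implements Serializable {
        private static final long serialVersionUID = 1L;
        String id = "E-42";
        byte[] payload = new byte[10_000];
    }

    /** Same data, but the payload is transient and skipped by default serialization. */
    static class LeanRecord implements Serializable {
        private static final long serialVersionUID = 1L;
        String id = "E-42";
        transient byte[] payload = new byte[10_000];
    }

    /** Serializes the value in memory and returns the number of bytes produced. */
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        System.out.println("eager: " + serializedSize(new EagerRecord()) + " bytes");
        System.out.println("lean:  " + serializedSize(new LeanRecord()) + " bytes");
    }
}
```

The eager form carries the whole payload, while the lean form carries little more than the class description and the key.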

Under normal circumstances, each of these transient fields (for example, the Address) is retrieved once and then cached in memory in the Employee object contained in the HttpSession, so the Web application code has fast access to it. However, when that application server stops and its sessions "fail over" to another JVM, the Employee object that is read back in initially contains only the non-transient fields, which are exactly the fields needed to reconstruct the transient ones. This technique solves several problems elegantly:

  • All application code that uses the HttpSession is completely unaware of what is managed persistently and what is transient. No more abstraction concerns.
  • The HttpSession can contain as many objects as you like as long as most of the data is transient. That is, the amount of data persisted is small and this results in good performance. Of course, sessions cannot be so large that you run out of memory.
  • Since failover is rare, the data in transient fields are rarely reloaded. In the event of a failover, there is a slightly increased response time, but all succeeding accesses are fast because the values are fetched directly out of memory.
  • As a side benefit, by using lazy instantiation techniques, user data that is not read by the application is never fetched from the back end.
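
A failover can be approximated in a single JVM by serializing an object and reading it back, which is roughly what happens when another server loads the persisted session. The following is a minimal sketch under that assumption: the nested Employee class is a simplified stand-in for the class shown earlier, a string replaces the Address object, and the backendFetches counter and roundTrip helper are hypothetical names introduced only for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class FailoverRoundTripSketch {

    /** Counts simulated back-end lookups so the reload after "failover" is visible. */
    static int backendFetches = 0;

    static class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String employeeId;          // persisted key
        private transient String employeeAddress; // cached, re-creatable value

        Employee(String employeeId) {
            this.employeeId = employeeId;
        }

        /** Lazily loads the address; the string concatenation stands in for a DAO call. */
        String getEmployeeAddress() {
            if (employeeAddress == null) {
                backendFetches++;
                employeeAddress = "address-of-" + employeeId;
            }
            return employeeAddress;
        }

        boolean isAddressCached() {
            return employeeAddress != null;
        }
    }

    /** Serializes and deserializes the value, simulating a session read on another JVM. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Employee original = new Employee("E-42");
        original.getEmployeeAddress();                 // first fetch, then cached in memory

        Employee restored = roundTrip(original);       // simulated failover
        System.out.println("cached after failover: " + restored.isAddressCached());
        System.out.println(restored.getEmployeeAddress()); // reloaded from the key
        System.out.println("back-end fetches: " + backendFetches);
    }
}
```

The calling code is the same before and after the round trip; only the first access on the new server pays for the reload.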

Alternative

We have discussed why this solution is better than the primary alternatives: session affinity without persistence, which creates a single point of failure, or not allowing large HttpSessions. Another alternative is the use of a custom session state storage mechanism, such as keeping all session state in a database table specific to the application, and writing customized and tuned SQL code to store and retrieve the entire session state. While that alternative can provide very good performance, we find that the development overhead necessary to implement a full custom session storage mechanism is often prohibitive. The recommended best practice provides a good balance between performance and development cost.


