
Tip: Avoid unnecessary Ajax traffic with session state

Server-side caching and HTTP error codes

David Mertz (mertz@gnosis.cx), Efficient server, Gnosis Software, Inc.
David Mertz feels that bandwidth saved is bandwidth earned. Pore over David's life or buy his book Text Processing in Python.

Summary:  Where possible, creating Web applications — including Ajax-based applications — in a RESTful way avoids a large class of bugs. However, a pitfall of REST (REpresentational State Transfer) is sending duplicate data across similar XMLHttpRequests. This tip shows how the moderate use of session cookies can maintain just enough server-side state to significantly reduce client-server traffic, while still allowing fallback to cookie-free operation.

Date:  13 Nov 2007
Level:  Intermediate

Introduction

A simple fact about HTTP is both its greatest strength and its central weakness: HTTP is a stateless protocol. Each GET request to an HTTP server resource is meant to be idempotent, which is to say the same request should return the same result at each invocation. Idempotency is a central idea in REST: the same request (perhaps encoding client information) should return the same data whenever it is made.

In contrast to the REST philosophy, Ajax applications are often very stateful. Some field or region in a Web application reflects the current state of some server data, with client JavaScript polling used to query that current state periodically (there are ways to make this more push-oriented, but that is not essential to this tip). The Web application, however, more or less expects the server to keep track of what it needs to know on the next polling event: what data the client has seen or not seen, what interactions have already occurred, and so forth.

One common means of making Ajax applications technically RESTful is to arrange matters such that every query for the latest data has a globally unique URI. For example, a query might incorporate a UUID, either in a URL-encoded parameter or in a hidden form variable, so that an XMLHttpRequest object might GET the following resource:

http://myserver.example.com?uuid=4b879324-8ec0-4120-bba6-890eb0aa3fc0

On the very next polling event, even if it is merely one second later, a different URI would be opened.
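As a minimal sketch of the unique-URI approach (written for Python 3, and not taken from any particular application), a client could attach a freshly generated UUID to each poll. The helper name unique_poll_uri is hypothetical; the host is the example host used above.

```python
from urllib.parse import urlencode
from uuid import uuid4

BASE_URL = "http://myserver.example.com"   # example host from the text

def unique_poll_uri(base_url=BASE_URL):
    """Return base_url with a freshly generated UUID as a query parameter."""
    return base_url + "?" + urlencode({"uuid": str(uuid4())})

first = unique_poll_uri()
second = unique_poll_uri()
print(first)
print(second)   # differs from the first: every poll gets its own URI
```

Because no two polls share a URI, each request is trivially "the same request returning the same data," but nothing can ever be reused between polls either.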

Idempotency is tricky

Understanding the meaning of "same data" is more subtle than it might appear. Only in caricature must the same URI always return identical data. After all, even a static Web page might change when the content is corrected (say, the typos are fixed in a published article). The idea behind idempotency is merely that the change involved should not be a direct effect of the GET request itself. So having a constantly changing resource like this is a perfectly reasonable approach:

http://myserver.example.com/latest_data/

The point is that what makes up "latest_data" depends on something other than whether, when, and by whom this data has been retrieved. A server can be perfectly RESTful and still reflect "the state of the world."

Getting the latest

A colleague of mine, Miki Tebeka, and I faced exactly this situation of developing a Web application that frequently polled the latest data from a server, with a JavaScript XMLHttpRequest() object. The Python server example I present here is inspired by a nice in-house module Miki created, but simplified and improved.

There were two problems we wanted to solve here. One was avoiding sending any substantial message at all when nothing had changed since the prior request. The second problem was avoiding excessive use of database or computational resources in generating duplicate data.

The "Not Modified" problem is, in fact, addressed right in the HTTP protocol, though this correct solution is underused. What we may — and should — do is simply return an HTTP 304 status code. It is the responsibility of our Ajax code to check for 304 status, and if found, simply not to change client application state based on the (absence of) data sent from the poll.
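To make the 304 decision concrete, here is a small sketch (Python 3) of a function that builds the complete CGI output for one poll. The function name cgi_response and the comma-joined body are assumptions made for illustration; the "Status:" header followed by a blank line is the usual CGI way of setting the response status.

```python
def cgi_response(new_items, encode=", ".join):
    """Return the complete CGI output (headers plus body) for one poll."""
    if not new_items:
        # No body at all: a status header and the blank line ending the headers
        return "Status: 304 Not Modified\n\n"
    return ("Status: 200 OK\n"
            "Content-Type: text/plain\n"
            "\n" + encode(new_items))

print(repr(cgi_response([])))
print(repr(cgi_response(["alpha", "beta"])))
```

The 304 branch sends only a few dozen bytes, which is the whole point: the client learns that nothing changed without receiving any data.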

The server resource issue can be addressed by caching prior data and then aggregating the very newest additions. This solution is generally only relevant if "latest data" consists of relatively discrete items of data, not as much if the entire data set is interdependent. We can track the cached state of the client session by using a client cookie. Listing 1 puts it together:


Listing 1. Session-enabled server code: server.cgi
from datetime import datetime
# ClientSession is the class shown in Listing 2; prune_data(),
# get_new_stuff(), and encode_data() are application-specific helpers

session = ClientSession()
old_stuff = session.get("data", [])   # Retrieve cached data
last_query = session.get("last", None)
prune_data(old_stuff, last_query)     # Age out really-old data
new_stuff = get_new_stuff()           # Look for brand-new data

if not new_stuff:
    print "Status: 304 Not Modified"  # Nothing new since last poll
    print                             # Blank line ends the headers
else:
    print session.cookie              # New or existing cookie
    print "Content-Type: text/plain"
    print                             # Blank line ends the headers
    all_stuff = old_stuff + new_stuff
    session["data"] = all_stuff
    session["last"] = datetime.now().isoformat()
    print encode_data(all_stuff)      # XML, or JSON, or...
session.save()
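Listing 1 leaves prune_data() and encode_data() to the application. One possible shape for them is sketched below in Python 3 (the listing itself uses Python 2 syntax). Everything here is an assumption made for illustration: that each cached item is a (timestamp, payload) pair, that items expire after a fixed MAX_AGE, that a missing or stale last-query time empties the cache, and that JSON is the wire format.

```python
import json
from datetime import datetime, timedelta

MAX_AGE = timedelta(minutes=10)   # assumed retention window

def prune_data(items, last_query, now=None, max_age=MAX_AGE):
    """Drop, in place, cached items older than max_age.

    Assumes each item is an (iso_timestamp, payload) pair and that
    last_query is the ISO timestamp saved by the prior request, or None.
    Uses datetime.fromisoformat(), available in Python 3.7 and later.
    """
    now = now or datetime.now()
    if last_query is None or now - datetime.fromisoformat(last_query) > max_age:
        del items[:]              # No recent poll: the whole cache is stale
        return
    items[:] = [item for item in items
                if now - datetime.fromisoformat(item[0]) <= max_age]

def encode_data(items):
    """Serialize the items as JSON (one of several reasonable formats)."""
    return json.dumps(items)

now = datetime(2007, 11, 13, 12, 0, 0)
cache = [("2007-11-13T11:40:00", "old"), ("2007-11-13T11:55:00", "recent")]
prune_data(cache, "2007-11-13T11:58:00", now=now)
print(encode_data(cache))
```

Pruning in place matters here because Listing 1 ignores the return value of prune_data() and keeps using the same old_stuff list.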

A slight bit of cleverness goes into the ClientSession class, but not all that much. Basically, we just need to keep track of each client who might have a cookie corresponding to cached old_stuff:


Listing 2. Maintaining the session
from os import environ
from Cookie import SimpleCookie
from random import shuffle
from string import letters
from cPickle import load, dump

COOKIE_NAME = "my.server.process"

class ClientSession(dict):
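    def __init__(self):
        # Parse any cookies the browser sent with this request
        self.cookie = SimpleCookie()
        self.cookie.load(environ.get("HTTP_COOKIE",""))

        if COOKIE_NAME not in self.cookie:
            # No session yet, so make up an ID (real UUID would be better)
            lets = list(letters)
            shuffle(lets)
            self.cookie[COOKIE_NAME] = "".join(lets[:15])

        # The ID is client-supplied: validate it before trusting it
        # as part of a filename in production code
        self.id = self.cookie[COOKIE_NAME].value
        try:
            session = load(open("session."+self.id, "rb"))
            self.update(session)
        except Exception:  # If nothing cached, just do not update
            pass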

    def save(self):
        fh = open("session."+self.id, "wb")
        dump(self.copy(), fh, protocol=-1)  # Save the dictionary
        fh.close()
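The comment in Listing 2 notes that a real UUID would be better than shuffled letters. The following sketch (Python 3, where SimpleCookie lives in http.cookies) shows one way that might look, and also rejects any client-supplied ID that does not look like a UUID before it can end up in a filename. The function name session_id_from and the 32-hex-digit format check are assumptions for illustration, not part of the original module.

```python
import re
from http.cookies import SimpleCookie
from uuid import uuid4

COOKIE_NAME = "my.server.process"
VALID_ID = re.compile(r"[0-9a-f]{32}")    # the form produced by uuid4().hex

def session_id_from(cookie_header):
    """Return (session_id, cookie) for a raw Cookie header value."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if COOKIE_NAME in cookie:
        value = cookie[COOKIE_NAME].value
        if VALID_ID.fullmatch(value):
            return value, cookie          # Well-formed ID: reuse it
    # Missing or suspicious ID: start a fresh session
    cookie[COOKIE_NAME] = uuid4().hex
    return cookie[COOKIE_NAME].value, cookie

new_id, new_cookie = session_id_from("")
same_id, _ = session_id_from(COOKIE_NAME + "=" + new_id)
safe_id, _ = session_id_from(COOKIE_NAME + "=../../etc/passwd")
print(new_cookie.output())                # A "Set-Cookie: ..." header line
```

A malformed cookie is treated exactly like a missing one, so the worst a misbehaving client can do is lose its own cached state.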

Making the Ajax call

With the caching server in place, the JavaScript to poll its data is quite simple. All we need is something along the lines of Listing 3:


Listing 3. Polling the server for latest data
var r = new XMLHttpRequest();
r.onreadystatechange=function() {
    if (r.readyState==4) {
        if (r.status==200) {  // "OK status"
            displayData(r.responseText);
        }
        else if (r.status==304) {
            // "Not Modified": No change to display
        }
        else {
            alertProblem(r);
        }
    }
};
r.open("GET", 'http://myserver.example.com/latest_data/', true);
r.send(null);

The implementations of displayData() and alertProblem() are not specified in our simple example. Presumably, the former needs to parse or process the received response in some manner; the details depend on whether JSON, XML, or some other format is used to send the data, as well as on the actual application requirements.

Moreover, the quick example only shows how to poll one time. In a long-running application, you might repeatedly make this request in a setTimeout() or setInterval() callback. Or, depending on your application, polling might occur following some particular client application action or event.
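The repeated-polling logic is the same for a non-browser client. As a rough Python 3 analogue of a setInterval() loop, the sketch below calls a fetch function several times and updates the display only on a 200 response. The names poll_repeatedly, fetch, display, and wait are hypothetical, and the server replies are simulated rather than fetched over HTTP.

```python
def poll_repeatedly(fetch, display, rounds, wait=lambda: None):
    """Call fetch() `rounds` times; fetch returns a (status, text) pair."""
    for _ in range(rounds):
        status, text = fetch()
        if status == 200:
            display(text)                 # New data: update the client
        elif status != 304:               # 304 means "leave state alone"
            raise RuntimeError("unexpected status %d" % status)
        wait()                            # e.g. time.sleep(5) in a real client

# Simulated server replies, standing in for real HTTP requests
replies = iter([(200, "a"), (304, ""), (200, "a,b")])
shown = []
poll_repeatedly(lambda: next(replies), shown.append, rounds=3)
print(shown)   # ['a', 'a,b']
```

The middle poll returns 304, so the displayed state is simply left as it was.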


Summary

This tip presented some server code written in Python, but almost the same design would apply for nearly any language that might be used to program a CGI or other server process. The general idea is simple: use a client cookie (if available) to identify the cached data, and send a 304 status if no new data has arisen since the last polling event.

While I have not shown much error catching, the design involved is robust in falling back to correct behavior where cookies are not available. If a client does not have a relevant session cookie — either because it does not accept cookies or because this is a first poll in a new session — old_stuff is simply an empty list, and any data returned will be part of new_stuff. Another capability often worth adding is a special client message that will send any current session state: this is useful both for application debugging and as a way of clearing out inconsistent state should the client detect that something has gone wrong. All you lose in flushing the cache is a little server load and some bandwidth; it does not violate underlying idempotency.


