Comment lines: Erik Burckart: The allure of Comet

Temptation vs. preparation

Comet-style applications are becoming more and more popular in the Web 2.0 world. However, Comet poses many challenges, not least that the infrastructure on which your application will be deployed might not yet be ready for a Comet application. This content is part of the IBM WebSphere Developer Technical Journal.

Erik Burckart (ejburcka@us.ibm.com), WebSphere Application Server Lead Architect, IBM

Erik Burckart is a lead architect of the WebSphere Application Server product. He is a graduate of the University of Pittsburgh's School of Information Science, where he studied telecommunications, software development, and human-computer interaction. Through his work with SIP servlets in WebSphere Application Server, he has joined the SIP Servlet 1.1 (JSR 289) Expert Group and has made numerous contributions in combining the state-of-the-art Java EE platform with the latest SIP Servlet specification.



07 November 2007


It's a bird, it's a plane...

Comet is a term describing the interaction of a client and server that uses long-lived HTTP connections to enable event-driven communication between the two while the connection remains open. Ever since Alex Russell defined the term in Comet: Low Latency Data for the Browser (see Resources), Comet has been one of the Web 2.0 buzzwords. Comet-style applications hold connections open for as long as timeouts and the infrastructure allow, so they can deliver updates to the browser more quickly than other solutions and with less data sent -- which sounds great. But Comet-style connections have drawbacks you should know about before you consider using them.

Problems solved (and created)

The most common problem many developers in a Web 2.0 world have dealt with is streaming events generated on the server to the clients. There are three popular ways to solve this problem:

  1. Polling

    In this method, the JavaScript running in your browser sends a request, at a configured interval, to check whether any events have happened that it should receive. The responses from the server are immediate: either the events that have happened or an indication that there are no events. If the interval at which the client sends requests is too short, there can be a significant performance penalty. If the interval is too long, the event notifications may arrive later than the client desires.

    Figure 1. Polling
  2. Comet long-polling or hybrid polling

    Applications using hybrid polling take some of the advantages of a Comet-style application and some of the advantages of a polling-style application. The JavaScript in the client's browser makes an initial request for data from the back-end server. The server does not have to respond immediately; it keeps the socket open and writes the response if an event occurs during the interval that the connection is open. For example, the server might keep the socket open for 30 seconds, and if an event occurs in that timeframe, it immediately writes it back. But if there is no event in that timeframe, or if the client has more data to send to the server, the connection is closed and the client opens another connection after a certain amount of time. Some implementations open a read channel as a long HTTP GET, while a write channel opens and closes when needed as an HTTP POST. XMLHttpRequest in JavaScript is often used in this way.

    Figure 2. Hybrid polling
  3. Comet streaming or forever frame

    In a Comet streaming style application, the client opens a connection and sends its data using chunked transfer encoding, and the server sends chunked-encoded data in response. After the initial connection is created, there is very little overhead to transmit data in either direction. The connection stays open as long as possible, and each set of bytes carries only the chunked-encoding overhead, which is a hexadecimal number representing the size of the data being sent and a carriage return line feed, typically under 10 bytes.

    Figure 3. Comet streaming
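
The per-message overhead described for Comet streaming can be illustrated with a minimal sketch of HTTP/1.1 chunk framing. The function names and the sample payload are assumptions made for this illustration and do not come from any particular server. In the HTTP/1.1 wire format, a second carriage return line feed also follows the data, and the sketch counts it.

```python
def encode_chunk(payload: bytes) -> bytes:
    """Frame one payload as an HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    size_line = format(len(payload), "x").encode("ascii")
    return size_line + b"\r\n" + payload + b"\r\n"


def chunk_overhead(payload: bytes) -> int:
    """Number of framing bytes added around the payload."""
    return len(encode_chunk(payload)) - len(payload)


# Hypothetical event payload, used only for illustration.
event = b"score: 3-1"
framed = encode_chunk(event)
print(framed)
print(chunk_overhead(event))
```

For payloads up to 65,535 bytes the size field is at most four hexadecimal digits, so the framing stays at eight bytes or fewer per message, compared with a full set of request and response headers for every poll.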

Advantages of Comet streaming and hybrid polling

Both Comet streaming and hybrid polling applications permit two-way communication once the initial connection is set up. Comet streaming is the only technology pattern available today that offers immediate two-way communication without a browser plug-in. Hybrid polling allows for immediate notification as long as the events occur while the connection is up. Hybrid polling and other patterns using JavaScript in a browser require server-side logic to cache the events until the client connects to receive them. Here is a summary of some benefits of Comet streaming:

  • Browser support without a plug-in.
  • Immediate notification of clients; no need to cache events.
  • Reduced CPU load on the server.
  • Much less overhead with respect to number of bytes to send events and information.
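
The server-side caching that hybrid polling requires, and that Comet streaming avoids, might look roughly like the following sketch. The class name `EventCache`, its methods, and the per-client bound are hypothetical choices for illustration, not part of any product mentioned here.

```python
from collections import defaultdict, deque


class EventCache:
    """Holds events per client until that client's next poll collects them."""

    def __init__(self, max_per_client: int = 100):
        # Bounded queues: the oldest events are dropped if a client stays away.
        self._pending = defaultdict(lambda: deque(maxlen=max_per_client))

    def publish(self, client_id: str, event: str) -> None:
        """Record an event that occurred while the client was disconnected."""
        self._pending[client_id].append(event)

    def drain(self, client_id: str) -> list:
        """Return and forget everything queued for this client."""
        queue = self._pending.pop(client_id, None)
        return list(queue) if queue else []


cache = EventCache(max_per_client=2)
cache.publish("alice", "quote-1")
cache.publish("alice", "quote-2")
cache.publish("alice", "quote-3")
delivered = cache.drain("alice")
print(delivered)
```

The bound is one possible design choice: without it, a client that never reconnects would make the cache grow indefinitely.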

Disadvantages of Comet streaming

Sadly, most of the Internet infrastructure is not ready for prolific use of Comet streaming style applications today. Here are some of the reasons why I claim this:

  • Standards are behind

    HTTP was designed such that once a request begins to be sent, the connection carrying it is virtually locked until the response has been sent back in its entirety. This means that for the duration of a Comet streaming style communication, at least one connection is tied up on each system from the client device to the back-end server.

  • Synchronous vs asynchronous

    Many pieces of the Internet infrastructure are synchronous in nature. This includes servers (such as various firewalls) and programming models (such as HTTP servlets). Because most of these systems have a limited number of threads and each Comet connection ties up a thread, today's Internet infrastructure is generally not built for Comet streaming style applications to be pervasive.

  • Limited connections

    Proxy servers and firewalls will struggle with large-scale Comet streaming applications. Why? Today's operating systems are generally not prepared for pervasive use of Comet streaming. Because of the HTTP issue described above, each user of a Comet streaming application ties up two connections and file descriptors in each firewall and proxy that the HTTP request flows through. Many operating systems -- and the software built on them -- are limited to fewer than 65,000 connections across the entire machine at one time. These systems could therefore run out of available connections before they run out of CPU or memory, requiring more systems even when plenty of CPU and memory remains available on the existing ones.

  • Working as designed

    Some systems are not designed to handle this sort of application at all. These systems do things such as buffer HTTP chunks until they reach a certain size before sending them on to clients. There are firewalls that will not send the HTTP request through to the server until the complete request has been received and can be inspected in its entirety. Sometimes there are settings to change the behavior; other times it just will not work.

  • Browsers are not really prepared for streaming

    The APIs that exist today are not geared toward sending a simple chunk of data and receiving a simple chunk of data. This is sometimes achieved by implementing a new client-side applet or something similar, but it is very difficult to support today. Many people fall back to the hybrid polling mechanism and send the data in a separate connection.
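
The connection arithmetic in the "Limited connections" item can be made concrete with a small calculation. It uses the two figures given above (two connections per user, and 65,000 taken as an upper bound on connections per machine); the function name is an assumption for this sketch.

```python
def max_streaming_users(connection_limit: int, connections_per_user: int = 2) -> int:
    """Upper bound on concurrent Comet streaming users through one proxy or firewall."""
    return connection_limit // connections_per_user


print(max_streaming_users(65_000))  # 32500
```

Under these assumptions a single intermediary tops out at roughly 32,500 concurrent users, however much CPU and memory it has left.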

What needs to happen to better support Comet streaming?

I think there are many things we can do to better support Comet streaming style applications. First, we need to fix the problems above.

  • Standards

    HTTP pipelining is broken in many people's eyes since it does not permit out of order chunks across a single connection. With sequence numbers and request identification, a browser, firewall, proxy, or application server could stream more than one Comet streaming request over a single connection.

  • Synchronous vs. asynchronous

    Our applications and infrastructure need to be updated to handle everything asynchronously. This includes things such as the HTTP Servlet specification, which could learn from the Continuations support in Jetty, or the asynchronous nature of SIP servlets. Most of us expect that the Servlet 3.0 specification will be asynchronous.

  • Limited connections

    If there is more advanced HTTP pipelining support, then this may not be needed. But, in lieu of that change, systems need to expect a higher number of connections than they do today.

  • Better JavaScript APIs

    Better APIs are needed to send and receive chunked messages to a Comet streaming style server-side application.
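
The idea in the "Standards" item, carrying several streams over one connection by tagging each chunk with a request identifier and a sequence number, can be sketched as follows. The `Frame` layout is purely hypothetical and is not part of HTTP or of any proposal cited in this article; it only illustrates how a receiver could sort interleaved chunks back into their streams.

```python
from collections import defaultdict
from typing import NamedTuple


class Frame(NamedTuple):
    request_id: int  # which logical request this chunk belongs to
    sequence: int    # position of the chunk within that request
    payload: bytes


def demultiplex(frames):
    """Group frames by request and restore each request's chunk order."""
    by_request = defaultdict(list)
    for frame in frames:
        by_request[frame.request_id].append(frame)
    return {
        request_id: b"".join(
            f.payload for f in sorted(chunks, key=lambda f: f.sequence)
        )
        for request_id, chunks in by_request.items()
    }


# Two logical streams, interleaved and out of order on a single connection.
wire = [
    Frame(2, 0, b"score "),
    Frame(1, 1, b"world"),
    Frame(1, 0, b"hello "),
    Frame(2, 1, b"3-1"),
]
streams = demultiplex(wire)
print(streams)
```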

When these issues are addressed, then Comet streaming will be more widely accepted as it helps increase the interactivity of Web-based interfaces. Stock quotes, chat, sports scores, and many other applications will work without pain throughout the infrastructure. Until then, I am afraid that every piece of server software on the Internet will need to be reviewed for whether it can handle this new Comet-streaming based world.

Should we use hybrid polling?

Hybrid polling enables fast -- but not necessarily immediate -- notifications. This methodology is preferable with today's infrastructure for these reasons:

  • JavaScript APIs (specifically XMLHttpRequest) better support this methodology.
  • Hybrid polling is easier on the server infrastructure because there is a break between requests.

The two intervals -- the client polling interval and how long the server holds the connection without responding -- determine the resource usage that hybrid polling incurs across the infrastructure. Keep the server hold time to a minimum and the client poll time to a maximum, if at all possible.
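
A rough way to reason about this tuning is to model one hybrid polling cycle as a hold period followed by a gap before the client reconnects. The sketch below rests on simplifying assumptions: no events arrive (so every hold runs to its full length) and network round-trip time is ignored. All names and sample numbers are hypothetical.

```python
def open_fraction(hold_seconds: float, gap_seconds: float) -> float:
    """Fraction of time a client holds a connection when no events occur."""
    return hold_seconds / (hold_seconds + gap_seconds)


def average_open_connections(clients: int, hold_seconds: float, gap_seconds: float) -> float:
    """Expected number of connections open at any moment across all clients."""
    return clients * open_fraction(hold_seconds, gap_seconds)


def worst_case_delay(gap_seconds: float) -> float:
    """An event that fires just after a connection closes waits for the next poll."""
    return gap_seconds


# 10,000 clients: a long hold with a short gap versus a short hold with a long gap.
long_hold = average_open_connections(10_000, hold_seconds=30, gap_seconds=10)
short_hold = average_open_connections(10_000, hold_seconds=10, gap_seconds=30)
print(long_hold, short_hold, worst_case_delay(30))
```

In this model the second configuration keeps a third as many connections open, at the cost of a worst-case notification delay of 30 seconds instead of 10, which is the trade-off between infrastructure load and notification speed.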

What can be done now?

  • Limit your usage of Comet streaming to only those applications that require immediate event notifications.
  • Use methodologies, such as hybrid polling, that enable faster event notification without a large toll on the infrastructure. Tune your hybrid polling times to keep connections closed as much as possible.
  • Understand the limitations of the new infrastructure you are purchasing. For example, if you are buying a firewall, it helps to know the maximum number of connections it can handle, whether it supports Comet style applications, and whether the function you want (like intrusion detection) requires the buffering of messages.

Resources
