Comet is a term describing an interaction between a client and server that uses long-lived HTTP connections to enable event-driven communication while the connection remains open. Ever since Alex Russell defined the term in Comet: Low Latency Data for the Browser (see Resources), Comet has been one of the Web 2.0 buzzwords. Comet-style applications make the most of connection timeouts and infrastructure to deliver updates to the browser more quickly than other solutions, and with less data sent -- which sounds great. But Comet-style connections have drawbacks you should know about before you consider using them.
The most common problem many developers in a Web 2.0 world have dealt with is streaming events generated on the server to the clients. There are three popular ways to solve this problem:
-
Polling
In this method, the JavaScript™ running in your browser sends a request at a configured interval to check whether any events have happened that it should receive. The server responds immediately, either with the events that have happened or with an indication that there are none. If the interval between requests is too short, there can be a significant performance penalty. If the interval is too long, event notifications may arrive later than the client desires.
Figure 1. Polling
-
Comet long-polling or hybrid polling
Applications using hybrid polling combine some of the advantages of a Comet-style application with some of the advantages of a polling-style application. The JavaScript in the client's browser makes an initial request for data from the back-end server. The server might respond almost immediately if an event is already waiting; otherwise it keeps the socket open and writes the response as soon as an event occurs during the interval that the connection is held. For example, the server might keep the socket open for 30 seconds, and if an event occurs in that timeframe, it immediately writes it back. But if there is no event in that timeframe, or if the client has more data to send to the server, the connection is closed and the client opens another connection after a certain amount of time. Some implementations open a read channel as a long HTTP GET, while a write channel opens and closes when needed as an HTTP POST. XMLHttpRequest in JavaScript is often used in this way.
Figure 2. Hybrid polling
-
Comet streaming or forever frame
In a Comet-streaming-style application, the client opens a connection and sends data using chunked transfer encoding, and the server responds with chunked-encoded data. After the initial connection is created, there is very little overhead to transmit data in either direction. The connection stays open as long as possible, and each set of bytes carries only the overhead of the chunked encoding: a hexadecimal number giving the size of the data being sent plus carriage return/line feed delimiters, typically under 10 bytes.
Figure 3. Comet streaming
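To make the per-message overhead of chunked encoding concrete, here is a minimal sketch of how one chunk is framed under HTTP/1.1 chunked transfer coding: the payload size in hexadecimal, a CRLF, the payload, and a closing CRLF. The function names and the sample event body are illustrative assumptions, not part of any particular server's API.

```python
def frame_chunk(payload: bytes) -> bytes:
    """Wrap a payload in one HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    if not payload:
        # A zero-length chunk is reserved for marking the end of the stream.
        raise ValueError("an empty chunk marks the end of the stream")
    size_line = format(len(payload), "x").encode("ascii")
    return size_line + b"\r\n" + payload + b"\r\n"


def chunk_overhead(payload: bytes) -> int:
    """Number of bytes the chunk framing adds on top of the payload."""
    return len(frame_chunk(payload)) - len(payload)


# Hypothetical event body, used only for illustration.
event = b'{"symbol": "IBM", "price": 101.5}'
print(frame_chunk(event))
print("framing overhead in bytes:", chunk_overhead(event))
```

Under this framing, a 100-byte event costs 6 bytes of overhead and a 4096-byte event costs 8, which is consistent with the "typically under 10 bytes" figure above.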
Advantages of Comet streaming and hybrid polling
Both Comet streaming and hybrid polling applications permit two-way communication once the initial connection is set up. Comet streaming is the only technology pattern available today that offers immediate two-way communication without a browser plug-in. Hybrid polling allows for immediate notification as long as the events occur while the connection is up. Hybrid polling and other patterns using JavaScript in a browser require server-side logic to cache the events until the client connects to receive them. Here is a summary of some benefits of Comet streaming:
- Browser support without a plug-in.
- Immediate notification of clients; no need to cache events.
- Reduced CPU load on the server.
- Much less overhead with respect to number of bytes to send events and information.
Disadvantages of Comet streaming
Sadly, most of the Internet infrastructure is not ready for prolific use of Comet streaming style applications today. Here are some of the reasons why I claim this:
-
Standards are behind
HTTP was designed such that once a request begins to be sent, the connection carrying that request is virtually locked until the response has been sent back in its entirety. This means that for the duration of a Comet-streaming-style communication, at least one connection is tied up on each system from the client device to the back-end server.
-
Synchronous vs asynchronous
Many pieces of the Internet infrastructure are synchronous in nature. This includes servers (such as various firewalls) and programming models (such as HTTP servlets). Accordingly, the Internet infrastructure of today is generally not built for Comet-streaming-style applications to be pervasive, since most of these systems have a limited number of threads and each Comet connection ties up a thread.
-
Limited connections
Proxy servers and firewalls will struggle with large-scale Comet streaming applications. Why? Today's operating systems are generally not prepared for pervasive use of Comet streaming. Because of the HTTP issue mentioned above, each user of a Comet streaming application ties up two connections and file descriptors in each firewall and proxy that the HTTP request flows through. Many operating systems -- and the software built on those operating systems -- are limited to fewer than 65,000 connections across the entire machine at one time. This means that these systems could run out of available connections before they run out of CPU or memory, requiring more systems even when there is plenty of CPU and memory available on the existing ones.
-
Working as designed
Some systems are not designed to handle this sort of application at all. These systems do things such as buffer HTTP chunks until a certain size is reached before sending them on to clients. There are firewalls that will not send the HTTP request through to the server until the complete request has been received and can be inspected in its entirety. Sometimes there are settings to change the behavior; other times it just will not work.
-
Browsers are not really prepared for streaming
The APIs that exist today are not geared toward sending a simple chunk of data and receiving a simple chunk of data. This is sometimes achieved by implementing a new client-side applet or something similar, but that is very difficult to support today. Many people fall back to the hybrid polling mechanism and send the data in a separate connection.
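The connection arithmetic behind the "Limited connections" item can be sketched as follows. This is a back-of-the-envelope model, not a measurement: it assumes the figures quoted above (a machine-wide limit of roughly 65,000 connections and two connections per streaming user on each intermediary), and the function names and the `reserved` parameter are hypothetical.

```python
import math


def max_streaming_users(connection_limit: int,
                        connections_per_user: int = 2,
                        reserved: int = 0) -> int:
    """Upper bound on concurrent streaming users one intermediary can carry.

    connection_limit: machine-wide connection ceiling (assumed, e.g. 65,000).
    connections_per_user: connections each user ties up on this machine.
    reserved: connections set aside for other traffic (hypothetical knob).
    """
    if connections_per_user <= 0:
        raise ValueError("connections_per_user must be positive")
    usable = connection_limit - reserved
    if usable < 0:
        raise ValueError("reserved exceeds connection_limit")
    return usable // connections_per_user


def intermediaries_needed(users: int, connection_limit: int,
                          connections_per_user: int = 2) -> int:
    """Machines required when connections, not CPU or memory, are the limit."""
    per_machine = max_streaming_users(connection_limit, connections_per_user)
    if per_machine == 0:
        raise ValueError("connection_limit too small to carry any user")
    return math.ceil(users / per_machine)


print(max_streaming_users(65_000))
print(intermediaries_needed(100_000, 65_000))
```

With these assumed numbers, a single proxy tops out at 32,500 concurrent streaming users, so 100,000 users would need four such machines regardless of how much CPU and memory each has to spare.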
What needs to happen to better support Comet streaming?
I think there are many things we can do to better support Comet streaming style applications. First, we need to fix the problems above.
-
Standards
HTTP pipelining is broken in many people's eyes, since it does not permit out-of-order chunks across a single connection. With sequence numbers and request identification, a browser, firewall, proxy, or application server could stream more than one Comet streaming request over a single connection.
-
Synchronous vs. asynchronous
Our applications and infrastructure need to be updated to handle everything asynchronously. This includes things such as the HTTP Servlet specification, which could learn from the Continuations support in Jetty or from the asynchronous nature of SIP servlets. Most of us expect that the Servlet 3.0 specification will support asynchronous processing.
-
Limited connections
With more advanced HTTP pipelining support, higher connection limits may not be needed. But, in lieu of that change, systems need to expect a higher number of connections than they do today.
-
Better JavaScript APIs
Better APIs are needed to send and receive chunked messages to a Comet streaming style server-side application.
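The idea in the "Standards" item, tagging data with a request identifier and a sequence number so that several streams can share one connection, can be sketched as below. This is purely illustrative: the `Frame` layout and the `Demultiplexer` class are hypothetical and do not correspond to any existing HTTP feature or library.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Frame:
    """Hypothetical unit of data on a shared connection."""
    request_id: int   # which logical request this data belongs to
    seq: int          # position of this frame within that request
    payload: bytes


class Demultiplexer:
    """Reassembles interleaved, possibly out-of-order frames per request."""

    def __init__(self) -> None:
        self._next_seq: Dict[int, int] = {}
        self._pending: Dict[int, Dict[int, bytes]] = {}

    def receive(self, frame: Frame) -> List[bytes]:
        """Accept one frame; return payloads now deliverable in order."""
        expected = self._next_seq.get(frame.request_id, 0)
        if frame.seq < expected:
            return []  # duplicate of something already delivered
        pending = self._pending.setdefault(frame.request_id, {})
        pending[frame.seq] = frame.payload
        ready: List[bytes] = []
        while expected in pending:
            ready.append(pending.pop(expected))
            expected += 1
        self._next_seq[frame.request_id] = expected
        return ready


demux = Demultiplexer()
# Two logical requests interleaved on one connection; request 2 arrives out of order.
for frame in [Frame(1, 0, b"a"), Frame(2, 1, b"y"), Frame(1, 1, b"b"), Frame(2, 0, b"x")]:
    print(frame.request_id, demux.receive(frame))
```

The sequence number is what lets the receiver hold back a frame that arrives early, and the request identifier is what lets unrelated streams interleave without each needing its own connection.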
When these issues are addressed, Comet streaming will be more widely accepted, as it helps increase the interactivity of Web-based interfaces. Stock quotes, chat, sports scores, and many other applications will work without pain throughout the infrastructure. Until then, I am afraid that every piece of server software on the Internet will need to be reviewed for whether it can handle this new Comet-streaming-based world.
Should we use hybrid polling?
Hybrid polling enables fast -- but not necessarily immediate -- notifications. This methodology is preferable with today's infrastructure for these reasons:
- JavaScript APIs (specifically XMLHttpRequest) better support this methodology.
- Hybrid polling is easier on the server infrastructure because there is a break between requests.
Two intervals -- the client polling interval and how long the server holds the connection without responding -- determine the resource usage that hybrid polling incurs across the infrastructure. Keep the server hold time to a minimum and the client poll interval to a maximum, if at all possible.
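The trade-off between those two intervals can be sketched with a simple idle-cycle model. It assumes each client repeats a fixed cycle: the server holds the connection for `hold_seconds`, then the client waits `gap_seconds` before reconnecting, and any event that occurs during the gap is cached and delivered on reconnect. The function names and this model are illustrative assumptions, not a description of any specific implementation.

```python
def open_fraction(hold_seconds: float, gap_seconds: float) -> float:
    """Share of time an idle client keeps a connection open."""
    if hold_seconds <= 0 or gap_seconds < 0:
        raise ValueError("hold must be positive and gap non-negative")
    return hold_seconds / (hold_seconds + gap_seconds)


def average_open_connections(clients: int, hold_seconds: float,
                             gap_seconds: float) -> float:
    """Expected number of connections held open across all idle clients."""
    return clients * open_fraction(hold_seconds, gap_seconds)


def notification_delay(event_time: float, hold_seconds: float,
                       gap_seconds: float) -> float:
    """Delay before an event at event_time reaches a client whose cycle starts at 0."""
    cycle = hold_seconds + gap_seconds
    phase = event_time % cycle
    if phase < hold_seconds:
        return 0.0            # connection is open: delivered immediately
    return cycle - phase      # connection is closed: wait for the reconnect


# Assumed tuning: 30-second hold, 10-second gap, 10,000 idle clients.
print(open_fraction(30, 10))
print(average_open_connections(10_000, 30, 10))
print(notification_delay(35, 30, 10))
```

In this model, a shorter hold or a longer gap lowers the number of connections held open at any moment, while the worst-case notification delay grows with the gap, which is why the tuning has to be weighed against how fresh the events need to be.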
- Limit your usage of Comet streaming to only those applications that require immediate event notifications.
- Use methodologies, such as hybrid polling, that enable faster event notification without a large toll on the infrastructure. Tune your hybrid polling times to keep connections closed as much as possible.
- Understand the limitations of the new infrastructure you are purchasing. For example, if you are buying a firewall, it helps to know the maximum number of connections it can handle, whether it supports Comet-style applications, and whether the function you want (like intrusion detection) requires the buffering of messages.
Resources
-
Comet: Low Latency Data for the Browser
-
Ajax for Java developers: Write scalable Comet applications with Jetty and Direct Web Remoting
-
IBM WebSphere Application Server Feature Pack for Web 2.0

Erik Burckart is a lead architect of the WebSphere Application Server product. He is a graduate of the University of Pittsburgh's School of Information Science, where he studied telecommunications, software development, and human-computer interaction. Through his work with SIP servlets in WebSphere Application Server, he has joined the SIP Servlet 1.1 (JSR 289) Expert Group and has made numerous contributions in combining the state-of-the-art Java EE platform with the latest SIP Servlet specification.