Need help figuring out how to reduce the time it takes to download your Web pages? Find out how to cut download times and improve resource utilization by following the design advice here, gleaned from optimization efforts at high-volume sites.
You only get one chance to make a first impression. In the case of your Web site, a big part of that first impression is how long the users must wait while your home page downloads. If it takes too long, your visitors may decide not to stay, not to return, and not to tell their friends anything good about you. This matters regardless of what kind of site you run, but download time as a deterrent to repeat visits really hurts an e-commerce site.
Our High-Volume Web Site Team has been working with customers to analyze many of the world's largest Internet and intranet sites (for an article on our work on scalability, see Resources). We evaluated the Web page design factors that contribute to the kind of performance that leads to repeat business. This paper reviews some actual case studies from this analysis and suggests design practices that can improve page performance and perhaps increase your site's capacity for more concurrent visitors.
There is no absolutely correct way to set up a Web site. Design and operational tradeoffs exist in setting up even trivial sites. To quantify the effect of tradeoffs, IBM uses an internal tool that provides performance data about the site, and we run the analysis before and after changes. The tool analyzes a page and displays details about the timing, size, identity, and source of each page item; page owners and site operators use these details to target areas where they can improve performance.
Of the five major elements that influence the end user experience, we measure the performance of three:
- The efficiency of presentation (size)
- The organization of content (packaging)
- The administration of the Web site (delivery)
We do not measure the content of the page (information) or the way that the content is presented (the "look and feel").
Although the page-analysis tool cannot measure content or look and feel, it can provide information useful in improving efficiency, organization, and delivery.
The tool also helps confirm Web page design standards and practices for optimizing performance. We can prototype design standards and practices and analyze them to document performance characteristics to consider along with other factors (for example, marketing issues about presentation appeal). When applied, the tool has revealed areas where significant improvement is possible (and has already been achieved) and suggests page design practices that can help avoid common pitfalls.
Stepping through Web communications
A page's download time results from many factors, and the interaction among them. Page designers who understand these factors can best optimize page download time.
A significant factor is how basic and secured Web site communications work, so we'll walk through the process first and then identify some places where your design decisions can affect performance. The following figure shows a simplified view of Web communications, illustrating the path of a request from a Web browser to the Web server. Note that, depending on the configuration of the Web site, there could be multiple servers of more than one type (data, image, proxy) at more than one location.
Figure 1. A view of Web communications

The communication progresses as follows:
- A client user enters a request (for example, a Web site name like www.ibm.com).
- The browser application accepts the request, then typically uses the Domain Name System (DNS), which normally runs over the User Datagram Protocol (UDP), to resolve fully qualified names (FQNs) such as www.ibm.com into Internet Protocol (IP) addresses such as 192.0.2.34.
- The DNS resolver queries a DNS server to obtain the IP address.
- When the browser receives the IP address, it initiates a Hypertext Transfer Protocol (HTTP) request. HTTP runs on the Transmission Control Protocol (TCP), which runs on IP, the network-layer protocol of the Internet. What happens next depends on whether communications are secured.
- If communications are not secured, the browser passes an HTTP request directly through TCP/IP, which creates a socket, a virtual mechanism to manage the addressing needed for sending the request and establishing the connection to the Web server.
- If communications are secured, the browser passes a secure HTTP (HTTPS) request to SOCKS, a security package that negotiates for transmission through the firewall. Such security negotiations occur both before sending the request and before receiving the response. The server also refers to the socket to accept the request and return the response through the firewall.
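The unsecured path through these steps can be sketched in a few lines of Python. This is a minimal illustration, not a description of how any particular browser works: the function names `resolve`, `build_request`, and `fetch` are our own, and the request uses HTTP/1.0 so that the server closes the connection when the response is complete.

```python
import socket


def resolve(hostname: str, port: int = 80) -> str:
    """Ask the system resolver (DNS) for the first IP address of hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr starts with the IP address as a string.
    return infos[0][4][0]


def build_request(hostname: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.0 GET request for path on hostname."""
    return f"GET {path} HTTP/1.0\r\nHost: {hostname}\r\n\r\n".encode("ascii")


def fetch(hostname: str, path: str = "/", port: int = 80,
          timeout: float = 10.0) -> bytes:
    """Resolve the name, open a TCP socket, send the request, read the reply."""
    address = resolve(hostname, port)
    with socket.create_connection((address, port), timeout=timeout) as sock:
        sock.sendall(build_request(hostname, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection: response done
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch("www.ibm.com")` would perform the name lookup, the socket connection, and the request in that order; each of those steps is a place where the visitor waits.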
Table 1 compares factors that influence download times for two related pages and shows why secured pages generally take longer to download.
Table 1. Download measurements compared for secured and unsecured pages
| Factor measured | Unsecured home page | Secured log-in page |
| Load time (seconds) | 6.112 | 13.59 |
| Size (bytes) | 29853 | 35150 |
| Number of items | 8 | 12 |
| Number of servers accessed | 1 | Unknown |
| Number of connections | 5 | 6 |
| Failed connections | 0 | 0 |
| Total connection time | 5.058 sec/31% | 3.271 sec/9.95% |
| Average connection time | 0.561 sec | 0.27 sec |
| Total SSL connection time | Not applicable | 4.433 sec/13% |
| Total server response time | 7.936 sec/48% (a little high) | 10.999 sec (SSL)/33% |
| Average server response time | 0.881 sec | 0.916 sec (SSL) |
| Total delivery time | 3.206 sec/19% | 14.174 sec (SSL)/43% |
| Average delivery time | 0.356 sec | 1.181 sec (SSL) |
| Address resolution time | 0.579 sec/1% | 0 |
When the complete end-to-end connection is established, the server fulfills the request by obtaining and serving the items that make up the page. A page includes one or more text files (usually HTML), graphic image files (GIFs), and possibly audio clips, video clips, and applets. The HTML specification determines the format and content of the page. The operation of sending a file to a client is referred to as a hit on the server. The time from the browser's request through receipt of the initial reply is called server response time.
The design of the Web communication protocols creates times when either the Web browser or the Web server must wait for responses from other components. The more time spent in these protocol waits, the longer the delay site visitors experience while waiting for page content. The further the browser is from the server, the greater the likelihood of delays due to intermediate links or devices in the path between the browser and the server. A delay could occur at any hardware or software component, including components or subsystems of the browser or the server itself. Even in the best cases, the links and devices in the path act as variable time amplifiers. Each link or device in the path adds a fixed amount of time to perform its function and also has the potential to add significant delay due to queuing related to component saturation. (For an article on the best practices for scaling Web site capacity, see Resources.)
Setting acceptable download times
Many designers of first-generation Web sites optimized their design for graphic appeal, relying on the newness of the Web environment to attract visitors and on the variety of "eye candy" they could offer to keep visitors there and bring them back. Even today, designers who redesign pages sometimes produce pages that are, from the visitor's viewpoint, prettier but worse (as in slower) than their predecessors. This is no longer acceptable. Web sites that permit customers to transact business must offer their information and services in a way that meets the customers' needs and brings them back for more. That means performance, usually measured in response time to a customer's request. The goal is to achieve a perfect balance of content and performance.
The major factors that contribute to download time are page size (in kilobytes), number and complexity of items, number of servers accessed, and whether SSL is used, as shown in Table 1. Table 2 points out the respectable numbers in each of those crucial measurements -- for both the secured and unsecured pages.
Table 2. Respectable results for three key factors: load time, size, and number of items
| Factor measured | Unsecured home page | Secured log-in page |
| Load time (seconds) | 6.112 | 13.59 |
| Size (bytes) | 29853 | 35150 |
| Number of items | 8 | 12 |
We've concluded the measurements shown in Table 3 are acceptable for a dial-up modem connection:
Table 3. Acceptable download times for dial-up modem connections
| Factor measured | "Acceptable" standard |
| Average server response time | less than 0.5 seconds |
| Number of items per page | fewer than 20 items |
| Page load time | less than 30 seconds |
| Page size in bytes | less than 64K |
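The thresholds in Table 3 lend themselves to an automated check. The sketch below is one hypothetical way to apply them: the `PageMeasurement` type and the `unmet_standards` function are our own names, and we assume that "64K" means 64 x 1024 bytes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageMeasurement:
    """Measurements for one page (field names are illustrative)."""
    avg_server_response_sec: float
    item_count: int
    load_time_sec: float
    size_bytes: int


def unmet_standards(page: PageMeasurement) -> List[str]:
    """Return the Table 3 standards that the page fails to meet."""
    problems = []
    if page.avg_server_response_sec >= 0.5:
        problems.append("average server response time")
    if page.item_count >= 20:
        problems.append("number of items per page")
    if page.load_time_sec >= 30:
        problems.append("page load time")
    if page.size_bytes >= 64 * 1024:  # assumption: 64K = 65,536 bytes
        problems.append("page size")
    return problems


# The unsecured home page from Table 1.
home_page = PageMeasurement(avg_server_response_sec=0.881, item_count=8,
                            load_time_sec=6.112, size_bytes=29853)
print(unmet_standards(home_page))
```

Applied to the unsecured home page in Table 1, the check flags only the average server response time, which matches the "a little high" note in that table.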
While acceptable, the baseline measurements in Table 2 do not qualify as world class. As an example, acceptable page load time for a dial-up connection is less than 30 seconds. You'd have to cut that down to less than 20 seconds to rank as world class. Rather than settling for acceptable, compare your results to the ranking in Table 4 to find out how your load times stack up.
Table 4. Ranking dial-up modem page download times from world class to unacceptable
| Seconds to load | Ranking |
| Less than 10 | Excellent |
| 10-15 | Very good |
| 15-20 | Good |
| 20-25 | Adequate |
| 25-30 | Slow |
| More than 30 | Unacceptable |
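A ranking like the one in Table 4 is easy to encode so that measured load times can be labeled automatically. The following sketch uses a function name of our own, `rank_load_time`. Because the bands in the table share their boundary values (for example, 15 appears in both "10-15" and "15-20"), we assume here that a boundary value belongs to the slower band.

```python
def rank_load_time(seconds: float) -> str:
    """Map a dial-up page load time to its Table 4 ranking."""
    if seconds < 0:
        raise ValueError("load time cannot be negative")
    # (exclusive upper bound in seconds, ranking)
    bands = [
        (10, "Excellent"),
        (15, "Very good"),
        (20, "Good"),
        (25, "Adequate"),
        (30, "Slow"),
    ]
    for upper_bound, ranking in bands:
        if seconds < upper_bound:
            return ranking
    return "Unacceptable"
```

With this assumption, the two pages in Table 2 (6.112 and 13.59 seconds) rank as "Excellent" and "Very good" respectively.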
IBM's page-analysis tool is designed to explain some of the mysteries about how Web pages are delivered to Web browsers, and to help designers and site operators improve performance and user satisfaction. It does this by revealing details about the timing, size, identity, and source of each item that makes up a page. The details revealed can be used to identify areas where performance improvement could enhance the end user experience.
A view of the download measurements for each element in a page, as shown in Figure 2, can help you zero in on opportunities for improving performance. Here, the purple bar represents the total download time for the page. The other bars show the timing, size, identity, and source for each page item. The colors in the page element bars represent different measurements.
Figure 2. Specific download times for page elements pinpoint trouble spots

Characteristics of fast-loading pages
Generally speaking, the pages that load the fastest:
- Present a few simple, small items, selected for their business value
- Retrieve items from a single server
- Combine requests for multiple items from the same server
- Use persistent connections
- Request items early
- Store and retrieve items used more than once from the browser cache
- Assign private information to private pages and secure only private data
- Use preproduction utilities that remove extra white space from the source HTML
Pages with features that enable visitors to keep moving appear to load fast, which, from a visitor's standpoint, can be nearly as valuable as actually loading fast. Pages with these features:
- Present the links to the site's major sections near the top of the page
- Use direct links
- Label visual components that are hot
As we analyzed pages, we formulated some guidelines for designing pages that load swiftly. Let's look at how the features of fast-loading pages translate into page design practices. Then we'll discuss the related constraints and tradeoffs.
A common theme among recommended practices is moderation. For example, any page is likely to have multiple items and require multiple connections. What's important is that page designers consider every item first for its business value and second for its size and complexity. Page designers can control for size and complexity and must ensure that the item's business value, size, and complexity justify the time each contributes to the overall download time. Page designers can also influence the number and types of connections and must understand how the choices they make affect download time.
Also consider that the people preparing the pages rarely see them as the end users do. For efficiency, Web designers and developers tend to locate themselves in proximity to the Web server they are working with. Most try to be on the same LAN. By contrast, Web site visitors tend to be farther away and may be using dial-up connections with considerably slower speeds. From their viewpoint, the Web designers may not see much difference in response time, but the site visitor will see the benefits of thoughtful packaging. Consider making it policy that developers regularly view pages in progress using connections that are typical for the target users.
Web pages have common components and characteristics that you can and should manage to minimize download time. Doing the "right" thing will not always be possible, and some components or characteristics may be outside of the control of the page designer. Still, everyone with an interest in the site's performance should understand these factors and their related tradeoffs:
- Number, size, and complexity of items
- Number of connections
- Number of servers accessed
- Use of white space
- Load sequences
- Data security
Manage the number, size, and complexity of items
The number, size, and complexity of page items are, taken together, the single most significant contributor to page size, page complexity, and the time it takes to download the page. Quite simply, pages with a few simple items -- selected for their business value -- load the fastest and yield the most satisfied visitors.
Number of items: It's impossible to generalize about the correct number of items. After selecting the required items (remembering to exclude items that lack business value), use techniques that help minimize download time:
- Send a menu as a browser-side (client-side) image map instead of a table with individual graphic elements. Tables are inherently slower, especially those with graphic elements.
- Combine items so the Web server requires fewer machine cycles to retrieve and deliver content.
- Avoid rollover GIFs. Using mouse rollovers that dynamically change the displayed GIF looks interesting, but it requires additional GIFs to download for the effect to operate. Eliminating the rollover GIF can reduce the total number of items for the page.
Additional techniques exist, although most trade off some amount of interface function for a reduction in the number of items.
Size: Balance each item's size in relation to its function or information content. Larger items always take longer to load, but larger items don't necessarily deliver more information or better function.
Complexity: The complexity of a page affects how quickly it can be presented. Consider the delays involved when you choose items with features that add complexity. Factors that contribute to page complexity include large tables, table cells whose sizes are dynamically calculated, JavaScript, and Java applets. Animated GIFs, image color management, and image dithering can also contribute delays. The delays vary from browser to browser, and from level to level within a browser; newer browser levels tend to be faster, but not always.
Items that are poorly formed, or described poorly or incompletely, can suspend browser-to-socket communication. Some tables may be so complex that the browser is fully occupied processing them and cannot service its socket connections. Time passes, but nothing in progress completes. The offending item is probably already at the browser and is being acted on when the hang occurs. Servers and networks can also hang, but a browser hang is almost always reproducible. Such hangs can lead to lost connections, which require resources to reconnect and add to overall load time and visitor dissatisfaction.
The number and size of HTML files are indicators of page complexity. While HTML coding is outside the scope of the team's analysis, we do know that utility programs, such as gzip, can compress HTML files. We have seen gzip reduce the size of an HTML file by 80% to 90%. A smaller HTML file reduces download time and permits the browser to start presenting the page sooner.
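To get a feel for the savings, you can measure them directly. The sketch below uses Python's standard `gzip` module; the `gzip_savings` function and the sample markup are our own inventions. The sample is far more repetitive than a typical page, so treat the figure it prints as an upper bound and measure your own HTML.

```python
import gzip


def gzip_savings(html: bytes) -> float:
    """Return the fraction of bytes saved by gzip-compressing html."""
    compressed = gzip.compress(html)
    return 1 - len(compressed) / len(html)


# Hypothetical, highly repetitive markup; real pages vary more.
sample = (
    b"<table>\n"
    + b'  <tr><td class="item">Product</td><td class="price">Price</td></tr>\n' * 200
    + b"</table>\n"
)

print(f"Original size: {len(sample)} bytes")
print(f"Savings from gzip: {gzip_savings(sample):.0%}")
```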
Manage the number of connections
Information from a Web server reaches the Web browser by way of TCP/IP socket connections. The connection must be opened on both ends before page information can flow. Each connection takes time to set up and take down, and some connections inherently take more time than others. Consider the following for each required connection:
- Persistent connections can reduce connection setup overhead if multiple items must be transmitted
- Secured connections take more time to set up
A Web site can have some control over whether it leaves a socket connection open or closes it after delivering an item. If the Web site closes connections, the browser must establish a new connection for each item. This type of connection overhead can significantly extend the delay visitors experience when loading pages. Most browsers attempt to keep the connection on their end open, but both ends have to agree that the connection can be held open. The choice is usually made at the Web server, through a configuration option that determines whether the server supports persistent connections when the browser is capable of them.
If you run a high-volume Web site, you may prefer not to maintain persistent connections because such connections can lead to consumption of all available ports or other constrained server resources, like threads. You may have to allocate additional resources at the server to support persistent connections.
Aim for no more than four connections per page. As servers and HTTP server software evolve, they expand their limits where resources become constrained. Site visitors may benefit when the connections at the server end are kept open.
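The effect of persistent connections on the number of connection setups can be demonstrated on a single machine. The sketch below is an illustration under stated assumptions, not a benchmark: it starts a throwaway HTTP/1.1 server on the loopback address using Python's standard library, and the names `CountingHandler` and `connections_used` are our own. It counts how many TCP connections the server accepts while a client fetches several items.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Serves a tiny body and counts accepted TCP connections."""

    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections open by default
    connections_accepted = 0

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        super().setup()
        CountingHandler.connections_accepted += 1

    def do_GET(self):
        body = b"item"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


def connections_used(item_count: int, persistent: bool) -> int:
    """Fetch item_count items and return how many connections were opened."""
    CountingHandler.connections_accepted = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    conn = None
    try:
        for i in range(item_count):
            if conn is None:
                conn = http.client.HTTPConnection(host, port, timeout=5)
            conn.request("GET", f"/item{i}.gif")
            conn.getresponse().read()  # drain the body before reusing
            if not persistent:
                conn.close()  # force a new connection for the next item
                conn = None
    finally:
        if conn is not None:
            conn.close()
        server.shutdown()
        server.server_close()
    return CountingHandler.connections_accepted
```

For a page of eight items, `connections_used(8, persistent=True)` opens one connection, while `connections_used(8, persistent=False)` opens eight. On a real network, each of those extra connections carries the setup cost reported as average connection time in Table 1.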
Manage the number of servers accessed
In the best of all possible worlds, the few, simple items would reside on the same server, yielding the fastest possible download time. In the real world, however, page items often reside on more than one server. These items may be from servers at one site or servers across multiple sites. Each time the browser accesses another server, it must connect a socket to the new server. If all of the browser's connections are being held open, an existing connection must be broken in order to connect to the new server. Often additional items are required from the first server, and then the connection must be reestablished. When possible, organize items to come from a single server to avoid the time wasted breaking and reopening connections. When you must use multiple servers, combine requests from the same server to take advantage of open connections.
Banner ads, for example, usually reside on a different server from the base page. When including a banner, specify the banner's dimensions in the base HTML so the browser does not have to calculate the size. Some browsers do not display any content beyond the banner until they have retrieved the banner and calculated its size. Other browsers may start to display the page and then flicker as the size of arriving ad images is calculated and the page is reformatted.
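A preproduction check can catch images whose dimensions are missing from the HTML. The sketch below uses Python's standard `html.parser` module; the class `ImageSizeChecker`, the function `images_without_dimensions`, and the sample URLs are our own illustrative names. It only looks for `width` and `height` attributes on `img` tags and does not consider dimensions set elsewhere, such as in style sheets.

```python
from html.parser import HTMLParser
from typing import List


class ImageSizeChecker(HTMLParser):
    """Collects the src of every img tag that lacks width or height."""

    def __init__(self):
        super().__init__()
        self.unsized: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)  # attribute names arrive in lowercase
        if "width" not in attributes or "height" not in attributes:
            self.unsized.append(attributes.get("src") or "(no src)")


def images_without_dimensions(html: str) -> List[str]:
    """Return the sources of images the browser would have to measure."""
    checker = ImageSizeChecker()
    checker.feed(html)
    checker.close()
    return checker.unsized


page = (
    '<img src="http://ads.example.com/banner.gif">'
    '<img src="logo.gif" width="120" height="60">'
)
print(images_without_dimensions(page))
```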
Web sprayers may be in use in front of a Web server's address. A Web sprayer appears as one address to the browser but actually hides multiple servers providing content. Items with long composite times may be coming from an address different from the base page's HTML. Depending on browser design, such items may hold up displaying the page.
Some sites use a technique that allows multiple names for the same server address and can make the site easier and faster to find. For example, http://www.xyznews.com and http://xyznews.com refer to the same site. If you use this technique, try to avoid switching to the other server name during the page load. Some sites handle the situation by returning a page stub that refers the Web browser to the other name. This forces the Web browser to spend more time looking up the other name for the site and establishing a new connection before it can retrieve the page content.
Use direct links whenever possible to avoid the cost of an intermediate page. Redirection is best reserved for pointing browsers to a new set of pages when loaded from an old bookmark.
Using the browser's memory cache can reduce download time for a page with multiple requests for the same item. For example, designers use spacer GIFs to position page items. Rather than consecutive requests for spacer GIFs, it's preferable to request a spacer GIF, then request other items. This allows the server time to serve the GIF into the browser's cache so that the GIF is available for subsequent requests. Retrieving the GIF from the cache is faster than returning to the server for each request.
Manage the use of white space
Judicious management of white space can help achieve acceptable download times and may even extend the time before a server needs to be added.
Page designers often use white space to help them visualize the page presentation. The browser doesn't need the extra white space to operate properly. Consider using available utilities to remove the extra white space in the source HTML before placing the page(s) on the production Web server.
Avoid the use of extra white space on pages that require encryption. While extra white space in clear text can be compressed well across a dial-up line, encrypted white space does not compress well because it is no longer a string of repeated character symbols. After encryption, each block of repeating spaces is usually represented by a unique byte string, making it less likely to be compressed by the modems. Each extra byte costs something to deliver yet provides no improvement for site visitors.
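A white-space utility of the kind described here can be very small. The sketch below is a deliberately simple illustration, and the function name `strip_extra_whitespace` is our own. It assumes the page contains no white-space-sensitive content such as `pre` or `textarea` elements or inline scripts, which a production utility would have to leave untouched.

```python
import re


def strip_extra_whitespace(html: str) -> str:
    """Remove indentation, blank lines, and repeated spaces from html.

    Assumption: the markup has no <pre>, <textarea>, or inline script
    content whose white space is significant.
    """
    # Drop leading and trailing white space on every line, and blank lines.
    lines = (line.strip() for line in html.splitlines())
    joined = "\n".join(line for line in lines if line)
    # Collapse remaining runs of spaces and tabs into a single space.
    return re.sub(r"[ \t]+", " ", joined)


source = "<html>\n  <body>\n\n      <p>Hello,    world</p>\n  </body>\n</html>\n"
cleaned = strip_extra_whitespace(source)
print(f"{len(source)} bytes before, {len(cleaned)} bytes after")
```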
Manage load sequences
Designers can sequence requests for items in such a way that download time is optimized. The objective: specify the sequence in a way that concurrent operations allow the page to load smoothly. Request items early -- especially large items and those required for navigation -- to avoid delay at the end of the load sequence. Ideally, the browser should be able to identify items in time to keep its connections to the server busy.
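One way to reason about a load sequence is to sort the page's items before deciding where to reference them in the HTML. The sketch below shows a hypothetical heuristic, not a rule from our analysis: the `PageItem` type, the `request_order` function, and the sample file names are our own, and the ordering simply puts navigation items first and then larger items before smaller ones.

```python
from typing import List, NamedTuple


class PageItem(NamedTuple):
    """One item on a page (fields are illustrative)."""
    name: str
    size_bytes: int
    needed_for_navigation: bool = False


def request_order(items: List[PageItem]) -> List[PageItem]:
    """Order items so navigation items, then large items, come first."""
    return sorted(
        items,
        # False sorts before True, so navigation items lead;
        # negating the size puts larger items ahead of smaller ones.
        key=lambda item: (not item.needed_for_navigation, -item.size_bytes),
    )


items = [
    PageItem("footer-logo.gif", 900),
    PageItem("product-photo.gif", 12000),
    PageItem("menu-map.gif", 3000, needed_for_navigation=True),
]
print([item.name for item in request_order(items)])
```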
Understand the impact of data security
SSL (privacy) handshakes and encryption can consume time on both ends. Because privacy creates a drag on each item, it is essential to design encrypted pages concisely. The impact of all the recommended design practices multiplies when you're designing encrypted pages.
HTML items on an encrypted page do not compress well because the HTML is converted to long sequences of numbers that do not work well with dial-up modem compression schemes. This makes avoiding unnecessary white space even more significant on an encrypted page.
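The difference in compressibility is easy to observe. The sketch below is an approximation under two assumptions of our own: `zlib` stands in for the modem's compression scheme, and random bytes from `os.urandom` stand in for encrypted output, which likewise has no repeated patterns for a compressor to exploit. The function name `compression_ratio` is also ours.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


clear_white_space = b" " * 4096        # a run of spaces sent in the clear
encrypted_stand_in = os.urandom(4096)  # assumption: models ciphertext

print(f"Clear white space: {compression_ratio(clear_white_space):.1%} of original size")
print(f"Encrypted stand-in: {compression_ratio(encrypted_stand_in):.1%} of original size")
```

The run of spaces shrinks to a small fraction of its size, while the random data does not shrink at all, so every byte of white space on an encrypted page is likely to travel the line at full cost.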
Clearly, information that is private must be kept private. The balance here is to assign private information to private pages and public information to public pages: avoid mixing private and public data. Do not squander the overhead required for the private information on any public information.
Satisfying customers... and more
Of course, your business success depends on satisfied customers. Achieving healthy rates of repeat business on your Web site is your objective. The design practices suggested herein can help you improve the performance of your Web site, which can surely contribute to customer satisfaction. At a minimum, your page designers should adopt these practices as part of their design task:
- Manage the number, size, and complexity of items
- Manage the number of connections
- Manage the number of servers accessed
- Manage the use of white space
- Manage load sequences
- Understand the impact of data security
Ideally, enhance your design team with specific performance expertise. Thus enhanced, your team will be equipped to develop sites that satisfy customers, reduce consumption of valuable IT resources, and simplify your operations. When implemented, the recommended design practices may also help you increase the capacity of your site as you serve more concurrent visitors with the same hardware.
Resources
- Check out the IBM High-Volume Web Site Team's suggestions for the best practices for scaling Web site capacity, in Design for Scalability (December 1999).
- Read the High-Volume Web Site Team's recommendations on Web Site Personalization (February 2000).
- Killelea, Patrick. Web Performance Tuning. O'Reilly & Associates, 1998.
The IBM High-Volume Web Site team is grateful to the major contributors to this article: Mike Amerson, Gerry Fisher, Larry Hsiung, LeRoy Krueger, and Nat Mills. For more information, contact Willy Chiu at wchiu@us.ibm.com.