Pure speed with mod_pagespeed

Let Google accelerate your website

mod_pagespeed is a module from Google for Apache HTTP Servers that can improve the page load times of your website. It programmatically and automatically incorporates all the best practices for a speedy website into your site, and requires only minimal configuration. With mod_pagespeed, Apache web hosters can improve website speed quickly and easily.

Michael Abernethy, Programmer, Freelance

Michael Abernethy has worked with a wide variety of technologies and clients. He focuses on building better and more sophisticated web applications, testing the limits of the browsers they run on, while at the same time trying to figure out how to make them easier to create and easier to maintain. When he's not working at his computer, he can be found hanging out with his kids and a good book.



31 January 2012

Introduction

Prerequisites

To use mod_pagespeed, you need Apache version 2.2 running on Linux, either 32-bit or 64-bit, with the mod_deflate module installed.

Currently, mod_pagespeed doesn't support Windows or older versions of Apache (and if I had to guess based on their documentation and FAQ, I wouldn't look for those versions to be supported any time soon).

The mod_pagespeed Apache module from Google is designed to be a "drop-in" module: Apache web hosters add it to their server installation and should see a marked increase in page-serving speed. If you aren't sure how to speed up your website or haven't optimized your configuration for performance, this module could be the solution you're looking for. For a WordPress blog owner using an out-of-the-box configuration and installation, for example, this module can make a huge difference.

The mod_pagespeed module combines the best practices for optimizing page load times in one module that automatically performs all the steps needed.

Speed, defined

No matter the speed of your Internet connection, you always want your web pages to load faster. The companies whose web servers send you that content also want it to be faster, for two reasons: Companies with slow servers lose customers. And companies that make an individual page load faster for you free up their own servers to serve more pages for everyone else. While you may visit their page once a day, they may be serving that same page to other customers 100,000 times a day. For those companies, even a small increase in speed pays off enormously.

For the purposes of this article, we'll define "speed" from the user's perspective: page load time. A web page that loads quickly, like www.google.com, is considered fast; its page load time is very low. Conversely, a hypothetical site with only one server running an old Pentium II 600 MHz and containing 40 huge image files, is considered slow; its page load time is very high. Additionally, a web page that is fast when only a few people are visiting it a day can become incredibly slow when a few thousand people visit the site in a 10-minute span. "Speed," therefore, is defined as the time it takes you to get from A to B, where A = when you press "Return" after typing something into the address bar, and B = when the page completely finishes loading in the browser.

From a web server's perspective, the best way to increase the speed (without changing hardware) is to send a smaller amount of data and to send it faster. In more detailed terms, you want to decrease the latency of the machine (latency = response time) and decrease the size of the data being transferred (the number of kilobytes sent over the wire). So, to "speed up" a website, from both the web server's perspective and the user's perspective, you need to decrease the latency of the server AND decrease the size of the data being transferred.
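To see why both factors matter, here is a rough back-of-the-envelope sketch. The formula is a deliberately simplified assumption for illustration (requests handled one after another, a fixed latency per request, a fixed transfer rate), and the page sizes and request counts are hypothetical numbers, not measurements.

```python
def estimated_load_time(num_requests, latency_s, total_bytes, bytes_per_s):
    """Toy model: one round trip per request, plus time to move the bytes."""
    return num_requests * latency_s + total_bytes / bytes_per_s

# Hypothetical page: 40 requests, 50 ms latency, 2 MB of data, 1 MB/s link.
baseline = estimated_load_time(40, 0.05, 2_000_000, 1_000_000)

# Same page after cutting requests to 10 and data to 500 KB.
optimized = estimated_load_time(10, 0.05, 500_000, 1_000_000)

print(f"baseline:  {baseline:.2f} s")
print(f"optimized: {optimized:.2f} s")
```

Real browsers fetch files in parallel and reuse connections, so actual numbers will differ. The point is only that the total shrinks when either term shrinks.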

How to increase speed

All the possible tactics to increase the speed of a website fall into five categories: maximize caching, minimize lookups, minimize per request overhead, minimize data size, and optimize browser rendering. Notice how each category describes an extreme action like "maximize" or "minimize"; there's no simple solution of "turning on" or "turning off" a feature or setting. Let's break down each category and describe what it does.

Maximize caching

Browsers have the ability to cache files locally, so that instead of loading these files from the server for each page load, they'll use the local copy instead. While this doesn't make much sense for a dynamic web page itself (like a blog page), what about all the images, CSS files, and JavaScript files that are also loaded with the page? These files rarely, if ever, change. It's not necessary to go back to the server for the same CSS file over and over, especially if you use one CSS file for the entire site. Every page the user loads uses that same CSS file, so you might as well use the locally cached copy instead of the server's copy—they're identical.

Browsers actually take cache commands from the server. The server can tell the browser whether, and for how long, the items are valid, and it can attach these commands to each file. Thus, a server can tell a browser to cache a JS file, a CSS file, and a JPG file, but NOT an HTML file or a TXT file. Further, it can tell the browser, "the CSS and JPG files don't expire for one year." The browser interprets these commands as the server telling it "here's the JPG file; it won't change for a year, and between now and then, you should use your local copy every time there's a reference to it on any page on my server."
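The per-file-type decision described above can be sketched in a few lines. This is only an illustration of the idea, not how Apache or mod_pagespeed implements it: the function name, the list of extensions, and the one-year lifetime are assumptions chosen to match the example in the text. The header itself, Cache-Control with a max-age value in seconds, is the standard HTTP mechanism.

```python
import os

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60

# Hypothetical policy: these static file types may be cached for a year.
CACHEABLE_EXTENSIONS = {".js", ".css", ".jpg", ".png", ".gif"}

def cache_control_for(path):
    """Return a Cache-Control header value based on the file extension."""
    extension = os.path.splitext(path)[1].lower()
    if extension in CACHEABLE_EXTENSIONS:
        # The browser may reuse its local copy for a year without asking.
        return f"public, max-age={ONE_YEAR_SECONDS}"
    # "no-cache" tells the browser to revalidate with the server before reuse.
    return "no-cache"

for name in ("site.css", "logo.jpg", "index.html"):
    print(name, "->", cache_control_for(name))
```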

Minimize lookups

When a file of any type is sent from the server to the browser, there is associated overhead with establishing the HTTP connection on each file. In other words, a communication channel needs to be established between the server and the browser for each file. On most common websites, dozens of files need to be transferred, and each time this overhead is required, the server takes a certain amount of time to create it. Reducing the number of lookups therefore reduces the amount of time a server has to spend creating them.

The primary way to accomplish this goal should be obvious: reduce the number of files that need to be transferred on a webpage. For example, if you have a page with 10 images, you might need to have 10 HTTP connections created. But you can reduce the number of connections needed by combining resources into a minimal number of files, such as putting all the jQuery plugins into one file, or combining all the small images on a page into one big image file and then using CSS to show only the part of that large image you need to show.
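Combining files by hand is simple enough to sketch. The helper below is a minimal illustration that assumes plain text files that are safe to concatenate in order; the file names and contents are made up for the demonstration, and it is not the mechanism mod_pagespeed itself uses.

```python
import tempfile
from pathlib import Path

def combine_files(sources, destination):
    """Concatenate text files in order into one file; return how many were combined."""
    parts = [Path(source).read_text(encoding="utf-8") for source in sources]
    # Separate the parts with newlines so the last line of one file
    # does not run into the first line of the next.
    Path(destination).write_text("\n".join(parts) + "\n", encoding="utf-8")
    return len(parts)

# Demonstration with two hypothetical plugin files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp, "plugin-a.js")
    second = Path(tmp, "plugin-b.js")
    first.write_text("var a = 1;", encoding="utf-8")
    second.write_text("var b = 2;", encoding="utf-8")

    bundle = Path(tmp, "bundle.js")
    count = combine_files([first, second], bundle)
    combined = bundle.read_text(encoding="utf-8")

print(f"{count} files combined into one request")
```

The page then references the single bundle file instead of each plugin, so the browser makes one request where it used to make two.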

Minimize per request overhead

Each time a web page is loaded, the browser relays certain information to the server along with the request, including any cookies, whether the request is a GET or a POST. This takes time, because the cookie information has to be uploaded from the client to the server on every request. Some websites store tons of information in cookies, and the more information that needs to be uploaded, the longer it takes.

A preferable design is to store as little information as possible in the cookie (such as just a user ID) and use that as a key to look up the rest of the required information in a database on the server. This can drastically reduce the upload time.
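A quick way to get a feel for the difference is to measure the Cookie header each design would send. The cookie names and values below are hypothetical, and the dictionary stands in for the server-side database; treat this as a sketch of the idea rather than a recommended session implementation.

```python
def cookie_header_bytes(cookies):
    """Return the size in bytes of the Cookie request header for these cookies."""
    pairs = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return len(("Cookie: " + pairs).encode("utf-8"))

# Design 1 (hypothetical): everything lives in the cookie.
fat_cookies = {
    "user_id": "42",
    "theme": "dark",
    "cart": "sku1,sku2,sku3,sku4",
    "recent": "page1|page2|page3",
}

# Design 2 (hypothetical): only the key travels; the rest stays on the server.
lean_cookies = {"user_id": "42"}
server_side_store = {
    "42": {"theme": "dark", "cart": ["sku1", "sku2", "sku3", "sku4"]},
}

fat_size = cookie_header_bytes(fat_cookies)
lean_size = cookie_header_bytes(lean_cookies)
profile = server_side_store[lean_cookies["user_id"]]

print(f"fat cookie header:  {fat_size} bytes per request")
print(f"lean cookie header: {lean_size} bytes per request")
```

Because the header is sent with every request to the site, the saving is multiplied by the number of files on the page.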

Minimize data size

It's faster to send an image file that's 20KB than an image file that's 200KB. This category includes all those related practices: use a GIF instead of a JPG when that yields a smaller file, minify your JavaScript files, and send thumbnail images instead of the entire image when you can, for example.
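Compression is another way to shrink the bytes on the wire, which is why mod_deflate appears in the prerequisites. The sketch below uses Python's standard gzip module on a made-up, highly repetitive stylesheet to show the kind of saving involved; real files compress less dramatically, and the sample content is purely an assumption for the demonstration.

```python
import gzip

# Hypothetical stylesheet: the same rule repeated, so it compresses very well.
css = ("body { margin: 0; padding: 0; }\n" * 200).encode("utf-8")

compressed = gzip.compress(css)
ratio = len(compressed) / len(css)

print(f"original:   {len(css)} bytes")
print(f"compressed: {len(compressed)} bytes")
print(f"compressed size is {ratio:.1%} of the original")

# Decompressing gives back exactly the original bytes.
restored = gzip.decompress(compressed)
```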

Optimize browser rendering

This category is challenging. It includes practices like using optimized CSS selectors, but often your site is already designed, and the person working with the CSS files is not the person optimizing server performance. Additionally, the factors in this category tend to offer the least bang-for-your-buck.

Enough with the theoretical stuff; let's download this product and get it running! After all, the promise of this plugin is to drop it in your own Apache install so you don't have to worry about all the details of increasing your speed.

Download, install, and configure mod_pagespeed

Download the package (see Resources below for a link) and run the appropriate Linux command to install it on your system.

Once mod_pagespeed is installed on your Linux system, it's not yet tied into your Apache install. Let's do that. To make things easier, I recommend you copy the file "pagespeed.conf" from where it was installed into Apache's "conf" directory. Also, copy the "mod_pagespeed.so" file into Apache's "modules" directory. Finally, create two directories, one for the cache and one for the files that the module will generate, which I'll call "cache" and "files".

Next, we need to tell Apache to use the mod_pagespeed module, and to do that we need to edit the httpd.conf file. Add the following line to the bottom of the file:

Listing 1. Modify Apache's httpd.conf file
Include "{your-path-to-this-file}/pagespeed.conf"

Next, we need to modify the pagespeed.conf slightly to point to the correct path for our files and directories:

Listing 2. Modify pagespeed.conf file
# at line 1 of the file
LoadModule pagespeed_module {your-path-to-this-file}/mod_pagespeed.so

# down at line 25-26 in the file
ModPagespeedFileCachePath "{your-'cache'-file-path-here}"
ModPagespeedGeneratedFilePrefix "{your-'files'-file-path-here}"

Finally, start up your Apache server.


Test your installation of mod_pagespeed

Make sure your installation of mod_pagespeed is working by checking a few things. First, make sure Apache started up properly (if it didn't, mod_pagespeed is probably not installed or configured correctly). Second, check the "cache" folder to see that mod_pagespeed is writing there correctly.

Now visit a web page of the site you're working on (the page should contain things like JS and CSS files). After you visit your page, check the "cache" folder. You should see files there, which will be GZipped/Compressed versions of the files that can be shrunk (the JS and CSS files). That's your first indication that you've installed and configured mod_pagespeed correctly.

The final check is the response header you get back from the server. You can check this using a tool like Firebug or Google's own Page Speed, or even by writing PHP code to look for it. When you check the response headers, look for the "X-Mod-Pagespeed" header, which the module adds to tag the responses it has processed.
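If you prefer a script to a browser tool, the header check can be sketched as follows. The matching is deliberately loose (case-insensitive, ignoring hyphens) because the exact header spelling can vary. The function names are made up for this example, the sample header value is hypothetical, and the URL is a placeholder for your own site.

```python
from urllib.request import urlopen

def pagespeed_header(headers):
    """Return the value of the first header whose name mentions mod_pagespeed, else None."""
    for name, value in headers.items():
        normalized = name.lower().replace("-", "")
        if "modpagespeed" in normalized:
            return value
    return None

def check_url(url):
    """Fetch a page and report the mod_pagespeed tag on the response, if any."""
    with urlopen(url) as response:
        return pagespeed_header(response.headers)

# Offline demonstration with hypothetical response headers.
sample_headers = {"Content-Type": "text/html", "X-Mod-Pagespeed": "example-version"}
tag = pagespeed_header(sample_headers)
print("mod_pagespeed tag:", tag)

# Against a real server (placeholder URL, not run here):
# print(check_url("http://localhost/"))
```

A result of None means the response was not tagged, which usually points back to the configuration steps above.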

Congratulations! A successful check of the "cache" folder and the response header indicates that you've installed and configured mod_pagespeed correctly.


Experiment with mod_pagespeed

If you want to play with mod_pagespeed before you commit to it, you can follow the steps above and turn off the module in the pagespeed.conf file until you're ready to activate it. (You don't need to undo the configuration settings in Apache's httpd.conf file.) You can simply turn mod_pagespeed off and on in the pagespeed.conf file itself; look for the directive "ModPagespeed on" at line 10 and change it to "ModPagespeed off". You need to bounce, or restart, Apache for this change to take effect.
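If you toggle the module often, the edit can be scripted. The helper below is a minimal sketch that works on the text of the configuration file. The function name and the sample configuration (including its cache path) are assumptions for illustration, and you would still need to write the result back to pagespeed.conf and restart Apache yourself.

```python
import re

# Match "ModPagespeed on" or "ModPagespeed off" at the start of a line.
# Requiring whitespace after the name avoids touching longer directives
# such as ModPagespeedFileCachePath.
_DIRECTIVE = re.compile(r"^([ \t]*ModPagespeed[ \t]+)(on|off)\b",
                        re.IGNORECASE | re.MULTILINE)

def set_pagespeed(conf_text, enabled):
    """Return conf_text with the ModPagespeed directive switched on or off."""
    state = "on" if enabled else "off"
    return _DIRECTIVE.sub(lambda match: match.group(1) + state, conf_text)

# Hypothetical excerpt of a pagespeed.conf file.
sample = 'ModPagespeed on\nModPagespeedFileCachePath "/path/to/cache"\n'

disabled = set_pagespeed(sample, False)
reenabled = set_pagespeed(disabled, True)
print(disabled)
```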


Conclusion

Most webmasters understand the importance of speed and page load times, yet few are dedicated Linux geeks who are monitoring statistics and max load times. For most webmasters, a simple tool like mod_pagespeed is ideal—easy installation, almost no configuration necessary, and greatly improved page load times for your users.

Resources

Get products and technologies

  • Download the mod_pagespeed module from Google.
