
Real world Rails: Caching in Rails

Different caching strategies for production Rails applications

Bruce Tate (bruce@rapidred.com), CTO, WellGood LLC
Bruce Tate is a father, mountain biker, and kayaker in Austin, Texas. The CTO of WellGood, LLC and the chief architect behind ChangingThePresent.org, he's also the author of nine books, including Beyond Java, From Java to Ruby, and Ruby on Rails: Up and Running. He spent 13 years at IBM and later formed the RapidRed consultancy, where he specialized in lightweight development strategies and architectures based on Ruby, and in the Ruby on Rails framework. He now works with a team of Rails developers to build and maintain the charity portal, ChangingThePresent.org.

Summary:  Ruby on Rails is increasingly showing up as the base framework for sophisticated and scalable applications of medium and large size. Because Ruby is an interpreted language, to bend Rails to your will, you will need to employ many different caching strategies. This article explores the caching strategies that are available to you, including the ones we use for ChangingThePresent.org.


Date:  15 May 2007
Level:  Intermediate

About this series

Rails has an unmistakable reputation among developers. Rails represents buzz and sizzle, extreme productivity, and controversy. Depending on whom you ask, Rails is hyperproductive or a toy, well marketed or overhyped. Like many newer technologies, Rails also has a reputation as unproven, with limited scalability. Unlike the C and Java™ languages, Ruby is interpreted, with all of the inherent performance handicaps.

In reality, many of the largest sites on the Internet use interpreted languages. They use the same strategies that Ruby uses: clustered, shared-nothing architectures. And they cache. To get the best possible performance, most sites will need an effective caching strategy. Rails developers are beginning to follow suit.

In the Real world Rails series, international author and speaker Bruce Tate takes you through real-world Rails development, from the inside. As the CTO for WellGood, LLC, he is responsible for designing, building, and maintaining ChangingThePresent.org, a charity donations portal where you can donate an hour of a cancer researcher's time, preserve an acre of the rain forest, or sponsor the cataract surgery to make a blind person see. Hundreds of thousands of users have found thousands of nonprofits on ChangingThePresent to date, and the site continues to grow in popularity and scale.

You can find dozens of articles that will help you build simple Rails applications. This series will take you beyond the basics of building a simple blog, and into the issues that every Rails site must solve. You'll learn how to optimize Rails, and how to make your site more stable. You'll also learn how to work around base limitations of Rails through adding plug-ins. After reading each article in the series, you'll know a little more about how to make Rails sites work in the real world.

A few scenarios

First, allow me to walk through a few pages of ChangingThePresent.org. I'll show you a few places on the site where I'm likely to need a cache. Then, I'll point out the choice we made for each, and the code, or the strategy, that we use to implement these pages. In particular, I'll talk about what we do for all of the following:

  • Full static pages
  • Dynamic full pages that rarely change
  • Dynamic page fragments
  • Application data

Consider static pages. Just about every site has static pages, such as the one in Figure 1, which has our terms and conditions. You'd get to the page by clicking register, and then user agreement. For ChangingThePresent, we removed all dynamic content from the page, so we could let Apache cache it. Because of the rules in our Apache configuration, that content is never served by the Rails servers. I don't consider this Rails caching at all.


Figure 1. User agreement

Next, consider full dynamic pages. Theoretically, ChangingThePresent could have pages that are dynamically built, but rarely change. Since nearly all pages show whether a user is logged in, this type of caching is not interesting to us.

Next, consider page fragment caching. Our home page, shown in Figure 2, used to be completely static. Now, a few elements are dynamic. Each day, the page shows a series of gifts, chosen both randomly and by our administrators. Notice the gifts in the section entitled "A Few of our Special Gifts for Mother's Day." Also, notice the link on the far right that says "login." This link depends on whether the user is logged in, so we can't cache the full page. The gift section, however, changes only once a day, which makes it a good candidate for fragment caching.


Figure 2. Home page

Finally, consider the application. Unless you did all your surfing fifteen years ago, the most interesting sites are dynamic. Modern applications have layers, and you can normally make them more efficient by adding caches between layers. ChangingThePresent does some caching with the database layer. Next, I'll drill into each of these types of caching, and discuss what we do for ChangingThePresent.

Caching static content

Mongrel is a Web server written by Zed Shaw in 2500 lines of Ruby and C. The little server with a tiny footprint is custom tailored for Ruby Web applications such as Rails, Nitro, Iowa, and so on. Mongrel runs on UNIX and Linux, but also on Win32. Mongrel is often run behind another Web server (like Apache or Litespeed) acting as a proxy, but this is not necessary -- since Mongrel is an HTTP server in its own right, you can use it with all of your favorite HTTP tools.

There's not too much to say about caching static data, aside from images. We are a donations portal, and that means we need to appeal to our users' emotional side. That means images, and later, video. But Mongrel, our Web server, does not serve static data particularly well, so we use Apache to serve image content.

We are moving to an image accelerator, Panther Express, to cache the most-used images, and move them closer to our customers. Using this strategy, we will have a subdomain, images.changingThePresent.org. Panther Express directly serves any images already in its local caches, and sends requests for any other images on to us. Since the Panther service does not know when we might change images, we expire them through HTTP headers, such as the following:


Listing 1. HTTP cache expiration header
                
HTTP/1.1 200 OK
Cache-Control: max-age=86400, must-revalidate
Expires: Tue, 17 Apr 2007 11:43:51 GMT
Last-Modified: Mon, 16 Apr 2007 11:43:51 GMT

Note that these are not HTML headers. They are built independently of your Web page content. Your Web server will build these HTTP headers for you. Since Web server configuration is not too interesting for a Rails article series, I'll move on to cache content that is controlled with the Rails framework (see Resources for a link to more information on Web server configuration).
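To make the relationship between the header values shown above concrete, here is a minimal sketch in plain Ruby. It is not part of our server configuration, and the cache_headers helper is a hypothetical name used only for illustration. It assumes that the Expires value is simply the time the response was served plus the max-age value, which is how the sample headers line up.

```ruby
require 'time'

# Hypothetical helper (not a Rails or Apache API): build caching headers
# for a response that caches may reuse for max_age seconds.
# `now` and `last_modified` are passed in so the result is reproducible.
def cache_headers(now, last_modified, max_age)
  {
    'Cache-Control' => "max-age=#{max_age}, must-revalidate",
    'Expires'       => (now + max_age).httpdate,
    'Last-Modified' => last_modified.httpdate
  }
end

served_at = Time.utc(2007, 4, 16, 11, 43, 51)
cache_headers(served_at, served_at, 86_400).each do |name, value|
  puts "#{name}: #{value}"
end
```

Time#httpdate comes from Ruby's standard time library and produces the three-letter day and month abbreviations that HTTP dates require.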

Page caching

If you have dynamic pages that only occasionally change, you'll want to use page-level caching. Blogs and public bulletin boards are a couple of examples of this kind of application. Using page caching, you allow Rails to build a dynamic HTML page, and store it in a public directory, so your application server can serve it just as it would any other static page.

Since Rails never enters the picture if the page is cached, page caching is the fastest kind of caching in Rails. At its most basic level, page caching is actually very easy to do in Rails. You declare both page and action caching at the controller level. You need to tell Rails:

  • What pages would you like to cache?
  • How do you expire pages from the cache when the page's content changes?

You'll enable page caching using the caches_page directive within a controller class. For example, to cache the privacy_policy and user_agreement pages on your AboutController, you'd enter the following code:


Listing 2. Enabling page caching
                
class AboutController < ApplicationController
  caches_page :privacy_policy, :user_agreement 
end

You can expire pages with the expire_page directive. To expire the above pages when Rails invokes the new_pages action, I'd use the following code:


Listing 3. Expiring pages
                
class AboutController < ApplicationController
  caches_page :privacy_policy, :user_agreement 
  
  def new_pages
    expire_page :action => :privacy_policy
    expire_page :action => :user_agreement
  end
  
end
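If it helps to picture what caching and expiring a page amount to, the following is a minimal sketch in plain Ruby, not the Rails implementation. The PageCacheSketch class and its method names are hypothetical. It assumes only what the text above says: a cached page is an HTML file in a public directory, and expiring the page removes that file so the next request reaches the application again.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical stand-in for the page cache: one HTML file per URL path.
class PageCacheSketch
  def initialize(public_dir)
    @public_dir = public_dir
  end

  # Map a URL path such as "/about/privacy_policy" to a file on disk.
  def file_for(url_path)
    File.join(@public_dir, "#{url_path.sub(%r{\A/}, '')}.html")
  end

  # Store rendered HTML where the Web server can find it.
  def cache_page(url_path, html)
    file = file_for(url_path)
    FileUtils.mkdir_p(File.dirname(file))
    File.write(file, html)
  end

  # Expiring a page deletes the file; the next request is rendered afresh.
  def expire_page(url_path)
    FileUtils.rm_f(file_for(url_path))
  end

  def cached?(url_path)
    File.exist?(file_for(url_path))
  end
end

Dir.mktmpdir do |dir|
  cache = PageCacheSketch.new(dir)
  cache.cache_page('/about/privacy_policy', '<h1>Privacy</h1>')
  puts cache.cached?('/about/privacy_policy')
  cache.expire_page('/about/privacy_policy')
  puts cache.cached?('/about/privacy_policy')
end
```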

You'll need to watch for a few minor issues, such as URLs. Your URLs cannot depend on URL parameters. For example, instead of gifts/water?page=1, you need to use gifts/water/1. You can easily use such URLs in routes.rb. For example, our pages often have a tab parameter which shows which tab is selected. To make the tab part of the URL, we have the routing rule:


Listing 4. Routing rules for tabs
                
map.connect 'member/:id/:tab', :controller => 'profiles', :action => 'show'

You need to do the same for listings with page parameters, and other pages that depend on URL parameters. You also need to consider security.

Since the Rails framework is not involved if a page is in the cache, the server can't manage security for you. The Web server will happily render any page in the cache, whether the user has permission to see it or not. So don't use page caching when you care who can see a page.
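Both restrictions, on URL parameters and on security, come from the same fact: a cached page is looked up by its path alone. The sketch below illustrates this in plain Ruby. The cached_file_name helper is hypothetical, and it assumes a Web server that ignores the query string when it matches a request to a static file.

```ruby
require 'uri'

# Hypothetical helper: derive the cache file name from the URL path only.
# The query string and the identity of the user play no part in the lookup.
def cached_file_name(url)
  path = URI.parse(url).path
  "public#{path}.html"
end

# Two different result pages collapse onto one cached file...
puts cached_file_name('/gifts/water?page=1')
puts cached_file_name('/gifts/water?page=2')

# ...while path-based URLs each get a file of their own.
puts cached_file_name('/gifts/water/1')
puts cached_file_name('/gifts/water/2')
```

Because nothing in the lookup refers to the current user, the same file goes to everyone who asks for that path.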

That's just about all of the story when you're just caching simple static pages. As long as the content is reasonably simple, it's easy.

The trade-offs come when you try to cache more complicated content. As you cache pages that are substantially more dynamic, the expiration logic will get more complicated. To deal with complicated expiration policies, you'll write and configure custom sweepers. These classes delete selected elements from your cache when certain controller actions fire.

Most custom sweepers observe some model object, and based on changes, fire logic to expire one or more cache pages. Listing 5 shows a typical cache sweeper. In the sweeper, a developer can define an active record event, such as after_save. When that event fires, the sweeper will fire, and can invalidate selected pages in the cache. This example shows invalidation based on the expire_page method. Many serious applications directly use Ruby's excellent file system utilities to explicitly delete cached pages instead.


Listing 5. A typical observer
                
class CauseController < ApplicationController
  # Let CauseSweeper watch the actions in this controller.
  cache_sweeper :cause_sweeper
...

class CauseSweeper < ActionController::Caching::Sweeper
  observe Cause

  # After a cause is saved, expire its page and the page of every
  # nonprofit associated with it.
  def after_save(record)
    expire_page(:controller => 'causes', :action => 'show',
                :id => record.id)
    record.nonprofits.each do |nonprofit|
      expire_page(:controller => 'nonprofits', :action => 'show',
                  :id => nonprofit.id)
    end
  end
end
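The observer idea behind a sweeper can be tried outside Rails. The following is a minimal sketch in plain Ruby, with a Hash standing in for the page cache. SweeperSketch and CauseSketch are hypothetical classes, not Rails classes, and the URL layout is an assumption made for illustration.

```ruby
# Hypothetical sweeper: removes cached pages when a cause changes.
class SweeperSketch
  def initialize(page_cache)
    @page_cache = page_cache
  end

  # Drop the cause page and the page of every nonprofit that shows the cause.
  def after_save(cause)
    @page_cache.delete("/causes/show/#{cause.id}")
    cause.nonprofit_ids.each do |nonprofit_id|
      @page_cache.delete("/nonprofits/show/#{nonprofit_id}")
    end
  end
end

# Hypothetical model: saving notifies each observer, as a callback would.
class CauseSketch
  attr_reader :id, :nonprofit_ids

  def initialize(id, nonprofit_ids, observers)
    @id = id
    @nonprofit_ids = nonprofit_ids
    @observers = observers
  end

  def save
    @observers.each { |observer| observer.after_save(self) }
    true
  end
end

pages = {
  '/causes/show/1'     => '<html>cause 1</html>',
  '/nonprofits/show/7' => '<html>nonprofit 7</html>',
  '/nonprofits/show/8' => '<html>nonprofit 8</html>'
}
cause = CauseSketch.new(1, [7], [SweeperSketch.new(pages)])
cause.save
puts pages.keys.inspect
```

Only the pages that depend on the saved cause are removed; the page for the unrelated nonprofit stays cached.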

You're probably beginning to appreciate the downside of page caching: complexity. You can do page-level caching well, but the inherent complications will make your application harder to test, and increase the probability of bugs in the system. Also, if your pages will be different for each user, or you want to cache authenticated pages, you'll need to look beyond page caching. For ChangingThePresent, we have to deal with both circumstances, because we change links on our basic layout based on whether a user is logged in. We can't even consider page-level caching for most pages. I've linked to a couple of excellent articles on page-level caching in Resources so you can learn more. Next, I'll drill into action caching, which is another form of whole-page caching.

Action caching

You've learned both the primary strength, and the primary weakness, of page caching: For most page retrievals, Rails never enters the picture. The advantage is speed. The disadvantage is flexibility. If you need to do whole page caching based on conditions in the application — authentication, for example — you can use action caching instead.

Action caching works like page caching, but the flow is a little different. Rails will actually invoke the controller before rendering the action. If the page rendered by the action is already in the cache, Rails renders the page in the cache rather than rendering it again. Since Rails is now in the picture, it's slightly slower than page caching, but there's an upside. Almost all Rails authentication schemes use before filters on the controller. Action caching lets you take advantage of authentication and any filters on the controller.

Syntactically, action caching works exactly like page caching, but with a different directive. Listing 6 shows you how to use the caches_action directive.


Listing 6. Enabling action caching
                
class AboutController < ApplicationController
  caches_action :secret_page, :secret_list 
end

The cache expiration, and also the sweepers, work in an identical way. We don't use action caching for many of the same reasons we don't use page caching, but fragment caching is much more important for us.
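The difference in flow between page caching and action caching can be sketched in a few lines of plain Ruby. ActionCacheSketch is a hypothetical class, not how Rails implements caches_action. It assumes only the ordering described above: the filter runs on every request, and the cache is consulted afterward.

```ruby
# Hypothetical stand-in for a controller with one cached action.
class ActionCacheSketch
  attr_reader :render_count

  def initialize
    @cache = {}
    @render_count = 0
  end

  # Plays the role of a before filter: it runs on every request.
  def authorized?(user)
    !user.nil?
  end

  def secret_page(user)
    return 'redirect to login' unless authorized?(user)

    # Only authorized requests reach the cache; a miss renders and stores.
    @cache[:secret_page] ||= render_secret_page
  end

  private

  def render_secret_page
    @render_count += 1
    '<html>secret</html>'
  end
end

controller = ActionCacheSketch.new
puts controller.secret_page(nil)
puts controller.secret_page('bruce')
puts controller.secret_page('bruce')
puts controller.render_count
```

The anonymous request never sees the cached content, and the two authorized requests share a single rendering.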

Caching page fragments

Using fragment caching, you can cache a portion of a page, often the content for a layout. A developer identifies a fragment to cache by surrounding a block with RHTML directives placed directly on the Web page, as in Listing 7. On ChangingThePresent.org, we cache the front page and several other pages using fragment caching. All of these pages have database-intensive accesses and are among our most popular pages.


Listing 7. Identifying cache fragments
                
<% cache 'gifts_index' do %>
    <h3>
      Here, you can make the world a better place with a single gift. Donation gifts 
      are also a wonderful way to honor friends and family. Just imagine what we
      can achieve together.
    </h3>
    <h2 class="lightBlue"><%= @event_title %></h2>
    <div id="homefeatureitems">
        <% for gift in @event_gifts %>
          <%= render :partial => 'gifts/listable', :locals => { :gift => gift } %>
        <% end %>
    </div>
    ...
<% end %>

In Listing 7, the cache helper identifies the fragment to cache. The first parameter is a unique name identifying a cache fragment. The second parameter is a code block (the code between the first do and the last end) that identifies exactly which RHTML fragment to cache.
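The name-plus-block shape of the cache helper is easy to mimic in plain Ruby. The sketch below is not the Rails helper. FragmentCacheSketch is a hypothetical class with a Hash as its store, and it assumes the behavior described above: the block runs only when no fragment is stored under the given name.

```ruby
# Hypothetical fragment store keyed by a unique name.
class FragmentCacheSketch
  def initialize
    @store = {}
  end

  # Return the stored fragment, or run the block once and keep its result.
  def cache(name)
    return @store[name] if @store.key?(name)

    @store[name] = yield
  end

  # Forget a fragment so that the next call rebuilds it.
  def expire_fragment(name)
    @store.delete(name)
  end
end

fragments = FragmentCacheSketch.new
builds = 0
2.times do
  fragments.cache('gifts_index') do
    builds += 1 # stands in for the database-intensive work
    '<div id="homefeatureitems">...</div>'
  end
end
puts builds
```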

Our site only has one home page, so naming the page is easy. In other places, we use a method that determines the URL for the page to uniquely identify the cache fragment. For example, when we cache the code for a cause such as world peace or poverty alleviation, we use the code in Listing 8. That code finds the permanent URL, also called a permalink, for the cause.


Listing 8. Identifying a cache fragment by URL
                
<% cache @cause.permalink(params[:id]) do %>

Normally, when you cache individual pages, you need to expire them with sweepers. Sometimes, it's easier and cleaner to use simple time-based expiration of objects. By default, Rails does not provide such a mechanism, but a plug-in called timed_fragment_cache does the trick. Using that plug-in, I can specify a timeout, either in the cached content or in the controller code that provides the dynamic data for the page. For example, Listing 9 shows the code that builds the dynamic data for the page having a list of causes. The when_fragment_expired method will execute only when the associated cache fragment expires. The method takes a parameter, specifying the length of the timeout, and a code block, specifying which content to rebuild when the content expires. I could have also chosen to specify the timeout within the RHTML page along with the cache method, but we prefer the controller-based method.


Listing 9. Time-based cache expiration
                
def index
  when_fragment_expired 'causes_list', 15.minutes.from_now do 
    @causes = Cause.find_all_ordered
  end
end

Using a timed expiration technique, you can dramatically simplify your caching strategy, if you can afford to have data that is slightly stale. For each cached element, you need only specify the content you want to cache, any controller action that produces your dynamic content, and a timeout. Just as you would with page caching, if you need to, you can also explicitly expire content, using the method expire_fragment :controller => controller, :action => action, :id => id. This method works just like the expiration of cached actions and pages. Next, I'll show you how to configure the back end.
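For readers who want to see the idea of timed expiration in isolation, here is a minimal sketch in plain Ruby. It is not the timed_fragment_cache plug-in. TimedFragmentCacheSketch and its fetch method are hypothetical, and the clock is passed in as an assumption of the sketch so that the example does not have to wait 15 minutes.

```ruby
# Hypothetical cache whose entries carry an expiration time.
class TimedFragmentCacheSketch
  Entry = Struct.new(:value, :expires_at)

  # `clock` is any object that responds to `call` and returns the current Time.
  def initialize(clock = -> { Time.now })
    @clock = clock
    @entries = {}
  end

  # Return the stored value while it is fresh; otherwise rebuild and restamp it.
  def fetch(name, ttl_seconds)
    entry = @entries[name]
    return entry.value if entry && @clock.call < entry.expires_at

    value = yield
    @entries[name] = Entry.new(value, @clock.call + ttl_seconds)
    value
  end
end

now = Time.utc(2007, 5, 15, 12, 0, 0)
cache = TimedFragmentCacheSketch.new(-> { now })
builds = 0

3.times { cache.fetch('causes_list', 15 * 60) { builds += 1 } }
now += 16 * 60 # pretend that sixteen minutes have passed
cache.fetch('causes_list', 15 * 60) { builds += 1 }
puts builds
```

No sweeper is involved: stale content simply ages out, at the cost of serving data that may be up to one timeout old.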

Memcached

So far, I've discussed page and fragment caching models for Ruby on Rails. Now that you've seen the API, it's time to define where the cached data will go. By default, Rails places cached pages in the file system, in the public directory. Cached actions are kept in the same store as cached fragments, and you can configure the location of that store. You can use a memory store, the file system (at a directory you specify), the database, or a service called memcached. For ChangingThePresent.org, we use memcached.

Think of memcached as a huge hash map that you can reach over the network. Memory-based caching is fast, and network-based caches are scalable. With plug-in support, Rails can use memcached to cache fragments and ActiveRecord models. To use it, you install memcached (see Resources for details), and configure it in environment.rb (or one of the environment configuration files such as production.rb).


Listing 10. Configuring caching
                
config.action_controller.perform_caching = true

memcache_options = {
  :c_threshold => 10_000,
  :compression => false,
  :debug => false,
  :readonly => false,
  :urlencode => false,
  :ttl => 300,
  :namespace => 'igprod',
  :disabled => false
}

CACHE = MemCache.new memcache_options


Listing 10 shows a typical configuration. The first line, config.action_controller.perform_caching = true, turns caching on. The next statement prepares the caching options. Notice that a variety of options allow you to get more debugging data, disable the cache, and define the namespace of the cache. You can find more about the configuration options at the memcached site in the Resources section.
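The "huge hash map" picture, and the role of the namespace option, can be sketched without a running memcached server. The following plain Ruby is an illustration only. DistributedCacheSketch is a hypothetical class, each server is modeled as a Hash, and the choice of a CRC32 checksum to pick a server is an assumption of the sketch; real clients use their own schemes for selecting a server.

```ruby
require 'zlib'

# Hypothetical client-side view of a cache spread over several servers.
class DistributedCacheSketch
  def initialize(server_count, namespace)
    @servers = Array.new(server_count) { {} }
    @namespace = namespace
  end

  def set(key, value)
    server_for(key)[namespaced(key)] = value
  end

  def get(key)
    server_for(key)[namespaced(key)]
  end

  # Number of entries held by each server.
  def distribution
    @servers.map(&:size)
  end

  private

  # Prefix every key so that applications sharing the servers cannot collide.
  def namespaced(key)
    "#{@namespace}:#{key}"
  end

  # The same key always lands on the same server.
  def server_for(key)
    @servers[Zlib.crc32(namespaced(key)) % @servers.size]
  end
end

cache = DistributedCacheSketch.new(3, 'igprod')
20.times { |i| cache.set("cause_#{i}", "fragment #{i}") }
puts cache.get('cause_7')
puts cache.distribution.inspect
```

Adding servers adds memory to the pool, which is where the scalability of a network-based cache comes from.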

Caching models

The final form of caching we use is model-based caching. We use a customized version of the caching plug-in called CachedModel. Model caching is a limited form of database cache. The cache is easy to enable, on a per-model basis.

To make a model use the caching solution, you simply extend the CachedModel class instead of extending ActiveRecord::Base, as in Listing 11. CachedModel itself extends ActiveRecord::Base. ActiveRecord is not a full object relational mapping layer. The framework relies heavily on SQL to perform complex features, and the user can easily drop down into SQL as desired. Directly using SQL makes caching problematic, since the caching layer must deal with full result sets rather than single database rows. Handling full result sets is problematic at best, and near impossible without deep supporting application logic. For this reason, CachedModel's focus is strictly on caching single model objects, and accelerates only queries that return a single row.


Listing 11. Using CachedModel
                
class Cause < CachedModel
end

Most Rails applications repetitively access several items, such as user objects. Model caching can really speed things up in those circumstances. For ChangingThePresent, we're only now ramping up our model-based caching.
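The single-row focus described above can be illustrated with a small read-through cache in plain Ruby. This is a sketch of the general technique, not the CachedModel plug-in. CachedFinderSketch is a hypothetical class, and a Hash keyed by primary key stands in for the database table.

```ruby
# Hypothetical read-through cache for lookups by primary key.
class CachedFinderSketch
  attr_reader :database_reads

  # `rows` stands in for a database table, keyed by primary key.
  def initialize(rows)
    @rows = rows
    @cache = {}
    @database_reads = 0
  end

  # Single-row lookups go through the cache.
  def find(id)
    @cache[id] ||= load_row(id)
  end

  # Queries that return many rows bypass the cache entirely.
  def find_all
    @database_reads += 1
    @rows.values
  end

  # Writing a row updates the table and drops the stale cache entry.
  def save(id, attributes)
    @rows[id] = attributes
    @cache.delete(id)
  end

  private

  def load_row(id)
    @database_reads += 1
    @rows[id]
  end
end

users = CachedFinderSketch.new(1 => { :name => 'bruce' })
3.times { users.find(1) }
puts users.database_reads
```

Repeated lookups of the same user cost a single read, which is the access pattern where model caching pays off.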

Wrapping up

While Ruby is a wonderfully productive language, the interpreted nature of the language is not ideal from a performance perspective. Most major Rails applications will mitigate some of the damage through effective use of caching. For ChangingThePresent.org, we primarily use fragment caching, and we invalidate caching fragments from the controller, primarily using a time-based method. This approach works well for us, even though we have pages that change based on the logged in user.

We're also studying the impact of using memcached-backed CachedModel classes. We've only begun to study the impact of such caching on our database performance, but early results are promising. In the next article, I'll write about some tricks that we use to do database optimization for another example of real world Rails.


Resources

Learn

  • From Java to Ruby: Things Every Manager Should Know (Pragmatic Bookshelf, 2006): The author's book about when and where it makes sense to make a switch from Java programming to Ruby on Rails, and how to make it.

  • Changing The Present: The nonprofit marketplace where you can give a donation gift consisting of an acre of a rain forest, sight for a blind man, or an hour of a cancer researcher's time. This site serves as a foundation for this article series.

  • "Rolling with Ruby on Rails" and Learn all about Ruby on Rails: Learn more about Ruby and Rails, including installation procedures.

  • Active Record: Active Record is the persistence framework for the Ruby on Rails framework.

  • Rails Caching: The Robert Evans overview of caching models in Rails covering page caching, action caching, and fragment caching.

  • Ruby on Rails Caching Tutorial: This excellent caching tutorial by Gregg Pollack walks through page caching, and includes a good overview of cache sweeping strategies.

Get products and technologies

  • Ruby on Rails: Download the open source Ruby on Rails Web framework.

  • Mongrel: The production-strength application server that runs many of the top Rails sites. We use Mongrel at ChangingThePresent.org.

  • Apache Web server: The Web server that many Rails sites use to serve static content, including cached content.

  • Panther Express: The image accelerator that ChangingThePresent.org will use to cache image content.

  • Timed-cache expiration plug-in: Richard Livsey's plug-in to handle timed expiration of cache fragments. ChangingThePresent.org uses this plug-in to cache our home page and other major cache fragments.

  • Memcached: A networked service that provides a distributed objects cache. Memcached serves as a back end for ChangingThePresent's caching services.

  • CachedModel: A memcached-backed caching service that serves as a backing to ActiveRecord objects. CachedModel accelerates only database queries with single-row results.
