Ruby on Rails programming will spoil you. The growing framework will spare you from the tedium that's rampant in other frameworks. You'll express your ideas in a fraction of the lines of code that you're used to writing. And you'll get to use ActiveRecord.
As a long-time Java™ programmer, ActiveRecord was somewhat alien to me. With Java frameworks, I would normally build a map between independent models and schemas. Frameworks like this are mapping frameworks. With ActiveRecord, I define only the database schema, either in SQL or in a Ruby class called a migration. Frameworks that base the object model design on the structure of the database are called wrapping frameworks. But unlike most wrapping frameworks, Rails can discover the features of the object model by querying the database table. Instead of building complex queries, I can use the model to traverse relationships in Ruby instead of SQL. I get the simplicity of a wrapping framework with much of the power of a mapping framework. ActiveRecord is easy to use and easy to extend. Sometimes, it's too easy.
Like any database framework, ActiveRecord lets me do many things that get me into trouble. I can fetch too many columns, or leave off important structural database features, like indexes or null constraints. I'm not saying that ActiveRecord is a bad framework. You just need to know how to harden your application if you need it to scale. In this article, I'll walk you through some of the important optimizations that you might need with the Rails unorthodox persistence framework.
Generating a schema-backed model is as easy as generating a little code, with script/generate model model_name. As you know, that command generates your model, migration, unit test, and even a default fixture. It's tempting to fill in a few data columns in the migration, and enter a little test data, write a few tests, add a few validations, and call it done. But be careful. You should also consider your overall database design. Keep these things in mind:
- Rails will not insulate you from basic database performance issues. Your database needs information, often in the form of indexes, to perform well.
- Rails will not insulate you from data integrity problems. Though most Rails developers do not like to keep constraints in the database, you should consider things like nullable columns.
- Rails has convenient defaults for many elements. Sometimes, default attributes like the length of text fields are too large for most practical applications.
- Rails will not force you to create an efficient database design.
Before you trudge ahead and dive deep into ActiveRecord, you should make sure that you
have a strong foundation. Make sure that your index structure will work for you. If a
given table will be large, if you'll be searching on a column other than id, and if an index will help you (see your database manager
documentation for details -- different databases use indexes in different ways), make
sure you create your index. You need not drop down into SQL to create an index -- you
can simply use a migration. You can easily create an index with your create_table migration, or create an additional migration that
creates your index. Here's an example of a migration that creates an index that we use
for ChangingThePresent.org (see Resources):
Listing 1. Creating an index in a migration
class AddIndexesToUsers < ActiveRecord::Migration
def self.up
add_index :members, :login
add_index :members, :email
add_index :members, :first_name
add_index :members, :last_name
end
def self.down
remove_index :members, :login
remove_index :members, :email
remove_index :members, :first_name
remove_index :members, :last_name
end
end
|
ActiveRecord will take care of the index on id, so I explicitly add indexes that I use in various searches, because the table is large, infrequently updated, and frequently searched. Often, we will wait until we measure a problem in a given query before we take action. This strategy keeps us from second-guessing the database engine. But in the case of users, we know that the table will quickly grow into the millions of users, and will be ineffective without an index on frequently searched columns.
Two more common problems also relate to migrations. If you have strings and columns
that should not be null, make sure you code your migration appropriately. Most DBAs
(database administrators) might think that Rails has the wrong default for null
columns: By default, columns can be null. If you want to create a column that cannot be null, you must explicitly add the parameter :null => false. And if you have a string column, make sure that you code an appropriate limit. By default, Rails migrations will encode a string column as a varchar(255). Normally, that's too large. You should do your best to maintain a database structure that reflects your application. Rather than have an unrestricted login, if your application limits logins to 10 characters, you should code your database appropriately, as in Listing 2:
Listing 2. Coding migrations with limits and non-nullable columns
t.column :login, :string, :limit => 10, :null => false
|
You should also consider default values, and any other information that you can safely provide. With a little up-front work, you can save yourself a whole lot of time chasing data integrity problems later. While you are considering your database basics, also think about which pages are static, and thus easy to cache. Given a choice between optimizing a query and caching a page, caching the page will give you a far greater return, if you can stomach the complexity. Sometimes, pages or fragments are purely static, such as a list of states, or frequently asked questions. In those cases, caching is a slam dunk. Other times, you may decide to limit your complexity, and attack database performance instead. For ChangingThePresent, we've done both, depending on the problem and the circumstance. If you should decide to attack query performance, read on.
By default, ActiveRecord relationships are lazy. That means the framework will wait to access a relationship until you actually access it. Take, for example, a member that has an address. You can open the console and type the command: member = Member.find 1. You'll see the following appended to your log, as in Listing 3:
Listing 3. Log from Member.find(1)
^[[4;35;1mMember Columns (0.006198)^[[0m ^[[0mSHOW FIELDS FROM members^[[0m
^[[4;36;1mMember Load (0.002835)^[[0m ^[[0;1mSELECT * FROM members WHERE
(members.`id` = 1) ^[[0m
|
Member has a relationship to an address, that was defined with the macro
has_one :address, :as => :addressable, :dependent =>
:destroy. Notice that you don't see an address field in the log
when ActiveRecord loaded Member. But if you
type member.address in the console, you'll see
the contents of Listing 4 in development.log:
Listing 4. Accessing a relation forces a database access
^[[36;2m./vendor/plugins/paginating_find/lib/paginating_find.rb:98:in `find'^[[0m
^[[4;35;1mAddress Load (0.252084)^[[0m ^[[0mSELECT * FROM addresses WHERE
(addresses.addressable_id = 1 AND addresses.addressable_type = 'Member') LIMIT 1^[[0m
^[[35;2m./vendor/plugins/paginating_find/lib/paginating_find.rb:98:in `find'^[[0m
|
So ActiveRecord does not execute the query for the address relationship until you actually access member.address. Normally, this lazy design works well, because the persistence framework does not need to move as much data to load a member. But assume you wanted to access a list of members, and all of their addresses, as in Listing 5:
Listing 5. Retrieving multiple members with addresses
Member.find([1,2,3]).each {|member| puts member.address.city}
|
Since you should see a query for each of the addresses, the results will not be pretty, in terms of performance. Listing 6 tells the story:
Listing 6. Queries for the N+1 problem
^[[4;36;1mMember Load (0.004063)^[[0m ^[[0;1mSELECT * FROM members WHERE
(members.`id` IN (1,2,3)) ^[[0m
^[[36;2m./vendor/plugins/paginating_find/lib/paginating_find.rb:98:in `find'^[[0m
^[[4;35;1mAddress Load (0.000989)^[[0m ^[[0mSELECT * FROM addresses WHERE
(addresses.addressable_id = 1 AND addresses.addressable_type = 'Member') LIMIT 1^[[0m
^[[35;2m./vendor/plugins/paginating_find/lib/paginating_find.rb:98:in `find'^[[0m
^[[4;36;1mAddress Columns (0.073840)^[[0m ^[[0;1mSHOW FIELDS FROM addresses^[[0m
^[[4;35;1mAddress Load (0.002012)^[[0m ^[[0mSELECT * FROM addresses WHERE
(addresses.addressable_id = 2 AND addresses.addressable_type = 'Member') LIMIT 1^[[0m
^[[35;2m./vendor/plugins/paginating_find/lib/paginating_find.rb:98:in `find'^[[0m
^[[4;36;1mAddress Load (0.000792)^[[0m ^[[0;1mSELECT * FROM addresses WHERE
(addresses.addressable_id = 3 AND addresses.addressable_type = 'Member') LIMIT 1^[[0m
^[[36;2m./vendor/plugins/paginating_find/lib/paginating_find.rb:98:in `find'^[[0m
|
The results are as ugly as I promised they would be. You get one query for all of the
members, and another for each of the addresses. We retrieved three
members, and you got four queries. N members; N+1 queries. This problem is
the dreaded N+1 problem. Most persistence frameworks solve this problem
with eager associations. Rails is no exception. If you know that you will
need to access a relationship, you can opt to include it with your initial
query. ActiveRecord uses the :includeoption for
this purpose. If you changed the query to Member.find([1,2,3], :include => :address).each {|member| puts member.address.city}, you'll get a much better result:
Listing 7. Solving the N+1 problem
^[[4;35;1mMember Load Including Associations (0.004458)^[[0m ^[
[0mSELECT members.`id` AS t0_r0, members.`type` AS t0_r1,
members.`about_me` AS t0_r2, members.`about_philanthropy`
...
addresses.`id` AS t1_r0, addresses.`address1` AS t1_r1,
addresses.`address2` AS t1_r2, addresses.`city` AS t1_r3,
...
addresses.`addressable_id` AS t1_r8 FROM members
LEFT OUTER JOIN addresses ON addresses.addressable_id
= members.id AND addresses.addressable_type =
'Member' WHERE (members.`id` IN (1,2,3)) ^[
[0m
^[[35;2m./vendor/plugins/paginating_find/lib/paginating_find.rb:
98:in `find'^[[0m
|
That query will be much faster. You see one query that retrieves all of the members and addresses. That's how eager associations work.
With ActiveRecord, you can also nest the :include
option, but only one level deep. For example, consider a Member that has many contacts, and a Contact that has one address. If you wanted to show all of the cities for a member's contacts, you could use the code in Listing 8:
Listing 8: Getting the cities for a member's contacts
member = Member.find(1)
member.contacts.each {|contact| puts contact.address.city}
|
That code would work, but you'd have to query for the member, each contact, and each contact's address. You can improve the performance a little by eagerly including :contacts with :include => :contacts. You can do better by including both, as in Listing 9:
Listing 9: Getting the cities for a member's contacts
member = Member.find(1)
member.contacts.each {|contact| puts contact.address.city}
|
You can do much better by using a nested include option:
member = Member.find(1, :include => {:contacts => :address})
member.contacts.each {|contact| puts contact.address.city}
|
That nested include tells Rails to eagerly include both the contacts and address relationships. You
can use the eager loading technique whenever you know that you will use relationships in
a given query. That technique is the performance optimization technique that we use for
ChangingThePresent.org most frequently, but it does have limitations. When you have to
join across more than two tables, you're better off dropping into SQL. If you need to do
reporting, you're almost always better off simply grabbing the database connection and
bypassing ActiveRecord altogether with ActiveRecord::Base.execute("SELECT * FROM...").
Generally, eager associations will be more than enough. Now I'll shift gears and look at another bugaboo for Rails developers: inheritance.
When most Rails developers encounter Rails for the first time, they are captivated. It's just so easy. You simply create a type column on your database table and inherit any subclass from the parent. Rails will take care of the rest. For example, you might have a table called Customer that inherits from a class called Person. A customer has all of the columns of Person, plus a loyalty number and an order history. Listing 10 shows the beauty in the simplicity of the solution. The master table has all of the columns of the parent and all subclasses.
Listing 10. Implementing inheritance
create_table "people" do |t|
t.column "type", :string
t.column "first_name", :string
t.column "last_name", :string
t.column "loyalty_number", :string
end
class Person < ActiveRecord::Base
end
class Customer < Person
has_many :orders
end
|
Such a solution works well in most ways. The code is simple, and not repetitive. The queries are simple with good performance because you need not do any joins to access multiple subclasses, and ActiveRecord can use the type column to determine which records to return.
In some ways, though, ActiveRecord inheritance is fairly limited. If you have an inheritance hierarchy that is too broad, inheritance will break down. For example, at ChangingThePresent, we have several types of content that each have names, short and long descriptions, some common presentation attributes, and several custom attributes. We would like causes, nonprofits, gifts, members, drives, registries, and many other types of objects to inherit from a common base class so we could treat all types of content in the same way. We can't, because the Rails model would have the meat of our object model in a single table, and that's not a realistic solution.
We experimented with three solutions to this problem. One, we would have each proper class in its own table, and use views to build a common table for content. We threw this solution out early, because Rails does not deal with database views very well.
Our second solution was to use simple polymorphism. With this strategy, each proper
subclass has its own table. We push common columns down into each table. For example, say I need a superclass called Content that has only a name property with Gift, Cause, and Nonprofit subclasses. Gift, Nonprofit, and Cause would all have a name property. Since Ruby is dynamically typed, they need not inherit from a common base class. They need only respond to the same set of methods. ChangingThePresent uses polymorphism in several places to provide common behavior, particularly when we are dealing with images.
The third alternative is to provide a common capability, but using associations
instead of inheritance. ActiveRecord has a feature called polymorphic associations that is ideal for attaching common behavior to a class without inheritance. You saw an example of a polymorphic association earlier in Address. I can use the same technique to attach my common attributes for content management instead of inheritance. Consider a class called ContentBase. Normally, to associate that class to another class, you would use a has_one relationship and a simple foreign key. But you would probably want your ContentBase to work with more than one class. You need a foreign key, and also a column that defines the type of the target class. That's exactly how ActiveRecord polymorphic associations work. Look at the classes in Listing 11.
Listing 11. Both sides of a relationship for site content
class Cause < ActiveRecord::Base
has_one :content_base, :as => :displayable, :dependent => :destroy
...
end
class Nonprofit < ActiveRecord::Base
has_one :content_base, :as => :displayable, :dependent => :destroy
...
end
class ContentBase < ActiveRecord::Base
belongs_to :displayable, :polymorphic => true
end
|
Normally, a belongs_to relationship is with only one class, but the relationship in ContentBase is polymorphic. The foreign key has not only an identifier to identify a record, but also a type to identify the table. Using this technique, I get most of the benefits of inheritance. The common functionality is all in a single class. But I also get a few side benefits. I don't have to have all of the columns in Cause and Nonprofit all in a single table.
Some database administrators do not like polymorphic associations because they do not use true foreign keys, but for ChangingThePresent, we freely use them. In truth, the data model is not as beautiful in theory as it might be. You can't use database features such as referential integrity, and you can't rely on tools to discover the relationships based on the names of your columns. The advantages of a clean and simple object model, to us, outweigh the problems of the approach.
create_table "content_bases", :force => true do |t|
t.column "short_description", :string
...
t.column "displayable_type", :string
t.column "displayable_id", :integer
end
|
ActiveRecord is a perfectly capable persistence framework. You can build scalable, reliable systems with it, but as with any database framework, you have to pay attention to the SQL that your framework generates. When you have the occasional problem, you have to adjust your approach. Keeping up with your indexes, using eager loading with include, and using polymorphic associations in places instead of inheritance are just three ways you can use to improve your code base. Next month, I'll walk you through another example of how we write real world Rails.
Learn
- See
all the articles in the Real World Rails series.
-
Java To Ruby: Things Your Manager Should Know
(Pragmatic Bookshelf, 2006): The author's book about when and where it makes sense to make a switch from Java programming to Ruby on Rails, and how to make it.
-
Changing The Present: The nonprofit marketplace where you can give a donation gift consisting of an acre of a rain forest, sight for a blind man, or an hour of a cancer researcher's time. This site serves as a foundation for this article series.
-
"Rolling with Ruby on Rails" and Learn all about Ruby on Rails: Learn more about Ruby and Rails, including installation procedures.
-
ActiveRecord: ActiveRecord is the persistence framework for the Ruby on Rails framework.
-
Understanding Polymorphic Associations: This page in the Rails wiki shows how to use polymorphic associations.
-
Common performance problems with
Ruby on Rails shows common performance problems with Ruby on Rails, and explores one technique that I addressed: eager associations. It also addresses a technique I do not use called piggy backed attributes.
Get products and technologies
-
Ruby on Rails: Download the open source Ruby on Rails Web framework.

Bruce Tate is a father, mountain biker, and kayaker in Austin, Texas. The CTO of WellGood, LLC and the chief architect behind ChangingThePresent.org, he's also the author of nine books, including Beyond Java, From Java to Ruby, and Ruby on Rails: Up and Running. He spent 13 years at IBM and later formed the RapidRed consultancy, where he specialized in lightweight development strategies and architectures based on Ruby, and in the Ruby on Rails framework. He now works with a team of Rails developers to build and maintain the charity portal, ChangingThePresent.org.





