The following blog post contains material either currently found or soon to be incorporated into my new book, "Easy Active Record for Rails Developers". Among many other topics, you'll learn about model generation, migrations, validations, associations, scopes, joins, includes, forms integration, nested forms, and model testing with RSpec and FactoryGirl. The book is now available, head over to this website's home page to learn more.


There’s a matter known as the “N + 1 Queries” problem that has long confused beginning Rails developers to the detriment of their application’s performance. To understand the nature of the issue, let’s consider an example. Suppose your application aggregates information about local arcades, and you’d like to create a listing that outputs the name and state (state in terms of location, as in United States state) of each arcade. In conforming to the standard norms of database normalization, the list of valid states are stored in a table named states, and the locations table identifies each arcade’s state using the foreign key state_id. With this in mind, consider the following seemingly innocent query in which we retrieve ten arcade locations:

@locations = Location.limit 10

In the corresponding application view you then iterate over the retrieved locations like so:

<ul>
  <% @locations.each do |location| %>
    <li><%= location.name %> (<%= location.state.name %>)</li>
  <% end %>
</ul>

Pretty innocent bit of code, right? It certainly seems so until you realize these two snippets result in the execution of 11 separate queries! Thus the name “N + 1”, because we’re executing one query to retrieve the ten locations, and then 10 queries to retrieve the name of each location’s state name! To really drive this point home, here’s the eleven queries that executed when I ran the above two snippets:

SELECT  `locations`.* FROM `locations`  LIMIT 10
State Load (0.2ms)  SELECT  `states`.* FROM `states`
  WHERE `states`.`id` = 36 LIMIT 1
State Load (0.2ms)  SELECT  `states`.* FROM `states`
  WHERE `states`.`id` = 36 LIMIT 1
State Load (0.2ms)  SELECT  `states`.* FROM `states`
  WHERE `states`.`id` = 36 LIMIT 1
State Load (0.2ms)  SELECT  `states`.* FROM `states`
  WHERE `states`.`id` = 36 LIMIT 1
State Load (0.3ms)  SELECT  `states`.* FROM `states`
  WHERE `states`.`id` = 36 LIMIT 1
State Load (0.2ms)  SELECT  `states`.* FROM `states`
  WHERE `states`.`id` = 36 LIMIT 1
State Load (0.1ms)  SELECT  `states`.* FROM `states`
  WHERE `states`.`id` = 36 LIMIT 1
State Load (0.1ms)  SELECT  `states`.* FROM `states`
  WHERE `states`.`id` = 36 LIMIT 1
State Load (0.1ms)  SELECT  `states`.* FROM `states`
  WHERE `states`.`id` = 36 LIMIT 1
State Load (0.4ms)  SELECT  `states`.* FROM `states`
  WHERE `states`.`id` = 9 LIMIT 1

Rails newcomers often believe explicitly joining the tables will solve the problem:

locations = Location.joins(:state).limit 10
=> [#<Location id: 1, name: "Ethyl & Tank", description: ...
 #<Location id: 10, name: "Atlas Arcade", ...
 ]

It however does not, because joins will also forego eager loading:

>> locations.first.state
State Load (0.2ms)  SELECT  `states`.* FROM `states`  WHERE `states`.`id` = 36 LIMIT 1
=> #<State id: 36, name: "Ohio", ...

Instead, if you know you’re going to want to retrieve associated data, you can eager load the data using the includes method:

@locations = Location.includes(:state).limit 10

This produces just two queries, even after iterating over all ten locations just as we did before:

SELECT  `locations`.* FROM `locations`  LIMIT 10
State Load (4.1ms)  SELECT `states`.* FROM `states`  WHERE `states`.`id` IN (36, 9)

This time, only two queries are executed. First, the ten locations are retrieved. Next, each location’s state record is retrieved by passing the list of state IDs into the second query. The state information is then attached to each location record, eliminating the need to perform the additional queries!

Comments