The following blog post contains material either currently found or soon to be incorporated into my new book, "Easy Active Record for Rails Developers". Among many other topics, you'll learn about model generation, migrations, validations, associations, scopes, joins, includes, forms integration, nested forms, and model testing with RSpec and FactoryGirl. The book is now available, head over to this website's home page to learn more.

Suppose you wanted to highlight an “Arcade of the Day” section on the ArcadeNomad home page. You could do so by randomly retrieving a record from the games table. While Active Record does not offer a convenience method for retrieving a random record, there are a number of easy approaches one could employ to get the job done. One way involves using MySQL’s rand() function in conjunction with order, as described in the MySQL documentation:

>>'id, name').order("RAND()").first
  Location Load (0.3ms)  SELECT id, name FROM `locations` ORDER BY RAND() LIMIT 1
=> #<Location id: 6, name: "Pizza Works">
>>'id, name').order("RAND()").first
  Location Load (0.3ms)  SELECT id, name FROM `locations` ORDER BY RAND() LIMIT 1
=> #<Location id: 5, name: "BW3s Dublin">

This works as desired (although not really for large tables, as I’ll cover later in this post), so what’s the problem? As it happens, rand() is a MySQL-specific function, meaning if you attempted to use this example while running ArcadeNomad atop PostgreSQL or SQLite, the query would fail! Of course, such approaches don’t present a problem if you don’t plan on ever migrating to another database solution (although it would fail if you for instance happened to use SQLite for your test database but MySQL for your development database), but even so you can avoid any issues that might arise from such unforeseen eventualities by taking advantage of some of the other database-agnostic methods already introduced:

>> rand_id = rand(Location.count)
  (0.1ms)  SELECT COUNT(*) FROM `locations` 
=> 6493708
>> Location.first(:conditions => [ 'id >= ?', rand_id])
  (3.5ms)  SELECT COUNT(*) FROM `locations`
 Location Load (0.6ms)  SELECT `locations`.* FROM `locations` ORDER BY `locations`.`id` ASC LIMIT 1 OFFSET 2

Notice how in this example not one but two queries are actually executed in order to achieve the desired outcome. The first query determines how many records are found in the locations table. This value is passed into Ruby’s rand() function, which will use the value as the upper boundary for choosing a random number between zero and this boundary. Therefore if the locations table currently contains 16 rows, then rand() would return a number between 0 and 16 (both numbers inclusive).

But your table probably doesn’t consist of only 16 rows; more likely, it consists of thousands, hundreds of thousands, or millions of rows. So if you know you’re not going to switch databases, perhaps it makes more sense to use MySQL’s native rand() function after all? Let’s benchmark both approaches using Ruby’s Benchmark module (running on my new MacBook Pro). This benchmark was performed on a table containing just over 9.6 million rows: do |x|"native:") { 10.times {|i| native_rand } }"AR:") { 10.times {|i| ar_rand } }

def native_rand
  Location.first(:order => "RAND()")

def ar_rand
  rand_id = rand(Location.count)
  rand_record = Location.first(:conditions => [ 'id >= ?', rand_id])
            user            system       total         real
native:  0.170000   0.250000   0.420000 (155.814105)
AR:       0.090000   0.040000   0.130000 (  0.159229)

The real time is the one we care about, because it identifies the total amount of time in seconds required to execute each approach. That’s right, perhaps surprisingly, the Active Record approach is far more efficient! As it turns out, ordering results randomly in MySQL is a pretty bad idea, because of the particular way in which MySQL determines which record should be “randomly” selected.

Like what you read? There’s plenty more where this came from in my new book, “Easy Active Record for Rails Developers”!