#239 ActiveRecord::Relation Walkthrough
One of Rails 3’s best new features is the new Active Record query syntax. Episode 202 covered this syntax in some detail, so if you’re not yet familiar with it, it’s well worth taking a look at that episode before reading this one. When you first use the new syntax it might appear that some magic is going on behind the scenes, but here we’ll take a tour of the relevant parts of the Rails source code to show exactly how it works.
Getting The Source
If you don’t have a copy of the Rails source code to hand then it’s worth getting one so that you can refer to it as you read this episode. All you need to do is clone the git repository from GitHub with the following command.
$ git clone git://github.com/rails/rails.git
Once the repository has finished downloading, move into the new rails directory and switch to the version we’re using here by checking out the appropriate tag.
$ git checkout v3.0.1
We’re mainly interested in the ActiveRecord code so we’ll move into the relevant directory.
$ cd activerecord/lib/active_record
The code for ActiveRecord is pretty large and is spread across a number of files. We’ll only be looking at a few of these in this episode.
$ ls -F
aggregations.rb           nested_attributes.rb
association_preload.rb    observer.rb
associations/             persistence.rb
associations.rb           query_cache.rb
attribute_methods/        railtie.rb
attribute_methods.rb      railties/
autosave_association.rb   reflection.rb
base.rb                   relation/
callbacks.rb              relation.rb
connection_adapters/      schema.rb
counter_cache.rb          schema_dumper.rb
dynamic_finder_match.rb   serialization.rb
dynamic_scope_match.rb    serializers/
errors.rb                 session_store.rb
fixtures.rb               test_case.rb
locale/                   timestamp.rb
locking/                  transactions.rb
log_subscriber.rb         validations/
migration.rb              validations.rb
named_scope.rb            version.rb
Experimenting In the Console
Before we dive into the code, let’s get a better idea of what we’re searching for by experimenting in the console of a Rails 3 application. The application we’re using here is a simple todo list app with several Task records. We can get all of the tasks by using Task.all:
> Task.all
=> [#<Task id: 1, project_id: 1, name: "paint fence", completed_at: nil, created_at: "2010-11-08 21:25:05", updated_at: "2010-11-08 21:32:21", priority: 2>,
 #<Task id: 2, project_id: 1, name: "weed garden", completed_at: nil, created_at: "2010-11-08 21:25:29", updated_at: "2010-11-08 21:27:04", priority: 3>,
 #<Task id: 3, project_id: 1, name: "mow lawn", completed_at: nil, created_at: "2010-11-08 21:25:37", updated_at: "2010-11-08 21:26:42", priority: 3>]
The new Active Record query syntax makes it simple to, say, get all of the tasks with a priority of 3.
> Task.where(:priority => 3)
=> [#<Task id: 2, project_id: 1, name: "weed garden", completed_at: nil, created_at: "2010-11-08 21:25:29", updated_at: "2010-11-08 21:27:04", priority: 3>,
 #<Task id: 3, project_id: 1, name: "mow lawn", completed_at: nil, created_at: "2010-11-08 21:25:37", updated_at: "2010-11-08 21:26:42", priority: 3>]
What’s returned by this query looks like an array of records, but if we call class on it we’ll see that it’s actually an instance of ActiveRecord::Relation.
> Task.where(:priority => 3).class
=> ActiveRecord::Relation
If we add another option to the query and call class on the result we’ll see that an object of the same type is returned.
> Task.where(:priority => 3).limit(2).class
=> ActiveRecord::Relation
The Relation Class
Having queries return an ActiveRecord::Relation object allows us to chain queries together, and this Relation class is at the heart of the new query syntax. Let’s take a look at this class by searching through the ActiveRecord source code for a file called relation.rb.
At the top of the class a number of constants are defined, one of which is a Struct. If you’re not familiar with structs, they are a way of quickly defining a class dynamically by passing a list of attribute names to Struct.new.
require 'active_support/core_ext/object/blank'

module ActiveRecord
  # = Active Record Relation
  class Relation
    JoinOperation = Struct.new(:relation, :join_class, :on)
    ASSOCIATION_METHODS = [:includes, :eager_load, :preload]
    MULTI_VALUE_METHODS = [:select, :group, :order, :joins, :where, :having]
    SINGLE_VALUE_METHODS = [:limit, :offset, :lock, :readonly, :create_with, :from]

    include FinderMethods, Calculations, SpawnMethods, QueryMethods, Batches
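For readers who haven’t used Struct before, here is a minimal sketch in plain Ruby of what a definition like JoinOperation gives us. The attribute names are taken from the listing above, but the values passed in are placeholders invented for illustration and are not what Rails itself stores there.

```ruby
# Struct.new builds a new class with a reader and a writer for each attribute.
JoinOperation = Struct.new(:relation, :join_class, :on)

# Placeholder values, purely for illustration.
join = JoinOperation.new("projects", "InnerJoin", "tasks.project_id = projects.id")

puts join.relation      # reads an attribute
join.on = "tasks.project_id = projects.id AND projects.active = 1"
puts join.on            # the writer replaced the value
puts join.to_a.inspect  # the attributes in declaration order
```

No class body is needed: the constructor, accessors, equality and to_a all come from Struct.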
Next the class includes a number of modules, and these modules contain most of the class’s features. The modules’ files are contained in a relation directory within the active_record directory. We’ll take a look at one of these now: query_methods.rb.
This module contains the methods that we use in the new query syntax: includes, select, group, order, joins and so on. All of these methods behave very similarly. Each calls clone, which copies the Relation object so that a new Relation is returned rather than the existing one being altered. It then calls tap on the clone, which runs the block against the cloned object and returns that object. In each block the arguments passed into the method are added to the appropriate set of values in the Relation object.
def group(*args)
  clone.tap {|r| r.group_values += args.flatten if args.present? }
end

def order(*args)
  clone.tap {|r| r.order_values += args if args.present? }
end

def reorder(*args)
  clone.tap {|r| r.order_values = args if args.present? }
end
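To see why this pattern makes chaining safe, here is a small sketch in plain Ruby. MiniRelation is a hypothetical stand-in written for this example, not the Rails class, and it uses empty? where Rails uses Active Support’s present?.

```ruby
# MiniRelation is a hypothetical stand-in for ActiveRecord::Relation.
class MiniRelation
  attr_accessor :order_values

  def initialize
    @order_values = []
  end

  # Same shape as the Rails method: copy first, then change only the copy.
  def order(*args)
    clone.tap { |r| r.order_values += args unless args.empty? }
  end
end

base        = MiniRelation.new
by_name     = base.order(:name)
by_priority = by_name.order(:priority)

p base.order_values        # => []
p by_name.order_values     # => [:name]
p by_priority.order_values # => [:name, :priority]
```

Note that += builds a new array instead of appending in place. That matters because clone makes a shallow copy: if the method used << instead, the original and the copy would share one array and every relation in the chain would change together.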
So earlier, when we called Task.where(:priority => 3) in the console, it returned an instance of Relation, and when we called limit(2) on that Relation the limit method in the QueryMethods module was called and returned a cloned Relation object. But what about the initial call to where? This is called directly on the Task model, and therefore on ActiveRecord::Base rather than on a Relation, so where is the initial Relation object created?
To answer this we’ll search through the ActiveRecord source code. If we search for “def where” we’ll find a match, but only in the QueryMethods module we were just looking at. A search for “def self.where” returns nothing at all. Methods can also be defined with the delegate method, and if we search the code with the regular expression “delegate.+ :where” we’ll get some interesting results.
The second match delegates a lot of query methods and it looks like this is what we’re after.
delegate :select, :group, :order, :reorder, :limit, :joins, :where, :preload, :eager_load, :includes, :from, :lock, :readonly, :having, :create_with, :to => :scoped
This line lists all of the query methods and delegates them all to scoped. So, what does scoped do? If we search across the project again we’ll find this method in the named_scope.rb file.
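The delegation step can be sketched in plain Ruby. Rails uses Active Support’s delegate for this; the example below swaps in the standard library’s Forwardable to get a similar effect without Rails. MiniRelation and MiniModel are hypothetical names made up for the sketch.

```ruby
require 'forwardable'

# MiniRelation and MiniModel are hypothetical stand-ins, not Rails classes.
class MiniRelation
  attr_reader :klass
  attr_accessor :where_values

  def initialize(klass)
    @klass = klass
    @where_values = []
  end

  def where(conditions)
    clone.tap { |r| r.where_values += [conditions] }
  end
end

class MiniModel
  class << self
    extend Forwardable

    # Every call starts from a fresh relation for the receiving class.
    def scoped
      MiniRelation.new(self)
    end

    # Forward the class-level query method to whatever scoped returns.
    def_delegators :scoped, :where
  end
end

class Task < MiniModel; end

relation = Task.where(:priority => 3)
p relation.class         # the relation class, not Task
p relation.klass         # the model the relation was built for
p relation.where_values  # the stored conditions
```

The model class itself never implements where. It only knows how to produce a relation, and the relation does the rest.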
The NamedScope module is included in ActiveRecord::Base so we have access to all of its methods in there. The scoped method is fairly simple, calling relation and then merging any options or currently scoped methods into it.
def scoped(options = nil)
  if options
    scoped.apply_finder_options(options)
  else
    current_scoped_methods ? relation.merge(current_scoped_methods) : relation.clone
  end
end
Let’s look next at the relation method, which is defined in ActiveRecord::Base.
private
  def relation #:nodoc:
    @relation ||= Relation.new(self, arel_table)
    finder_needs_type_condition? ? @relation.where(type_condition) : @relation
  end
Here is where the Relation object is instantiated. We pass it self, which is an ActiveRecord model class, and arel_table, which is an Arel::Table object. The method then returns that Relation. (The condition that adds some where conditions first is related to single-table inheritance.) The arel_table method is defined in the same class and just creates a new Arel::Table object.
def arel_table
  @arel_table ||= Arel::Table.new(table_name, arel_engine)
end
Arel
The question now is “What is Arel?” Arel is an external dependency so we won’t find it in the Rails source code, but it’s worth taking a look at its source, which can be found on GitHub. Arel is a framework that simplifies the generation of complex SQL queries, and ActiveRecord uses it to do just that, like this:
users.where(users[:name].eq('amy'))
# => SELECT * FROM users WHERE users.name = 'amy'
Now that we know what an Arel::Table is we can go back to the relation method. This returns a Relation object so let’s take a look at the Relation class. The initializer for this class just takes in the class and table that are passed to it and stores them in instance variables.
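The initializer’s listing isn’t reproduced here, so the following is only a sketch of how an initializer of this kind could store the class and table and also prepare the value containers that the query methods rely on. It is an assumption written for illustration, not a copy of the Rails source. MiniRelation is a hypothetical name, and the constants are the ones from the top of the Relation class.

```ruby
# MiniRelation is a hypothetical stand-in; the constants mirror the Relation class.
class MiniRelation
  ASSOCIATION_METHODS  = [:includes, :eager_load, :preload]
  MULTI_VALUE_METHODS  = [:select, :group, :order, :joins, :where, :having]
  SINGLE_VALUE_METHODS = [:limit, :offset, :lock, :readonly, :create_with, :from]

  attr_reader :klass, :table

  def initialize(klass, table)
    @klass, @table = klass, table
    # Single-value options start out unset...
    SINGLE_VALUE_METHODS.each { |m| instance_variable_set(:"@#{m}_value", nil) }
    # ...while multi-value options start out as empty lists.
    (ASSOCIATION_METHODS + MULTI_VALUE_METHODS).each do |m|
      instance_variable_set(:"@#{m}_values", [])
    end
  end
end

# Plain strings stand in for the model class and the Arel::Table.
relation = MiniRelation.new("Task", "tasks")
p relation.klass
p relation.instance_variable_get(:@where_values)
p relation.instance_variable_get(:@limit_value)
```

Driving the setup from the constant lists means that each query method has a matching variable waiting for it, such as @order_values for order.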
Back in the Rails console we now know what happens when we call
Task.where(:priority => 3).limit(2).class
A new Relation object is created when we call where, and when we call limit on that the relation is cloned and the additional arguments are stored in the cloned object. When we call class on this the query isn’t performed, but if we remove .class from the end of the command the query will be run and we’ll see a list of objects returned.
> Task.where(:priority => 3).limit(2)
=> [#<Task id: 2, project_id: 1, name: "weed garden", completed_at: nil, created_at: "2010-11-08 21:25:29", updated_at: "2010-11-08 21:27:04", priority: 3>,
 #<Task id: 3, project_id: 1, name: "mow lawn", completed_at: nil, created_at: "2010-11-08 21:25:37", updated_at: "2010-11-08 21:26:42", priority: 3>]
The query must be performed somewhere, and what’s happening behind the scenes is that the console calls inspect on the result of the command that is run. Relation overrides the default inspect method. Let’s take a look at what the overridden method does.
def inspect
  to_a.inspect
end
All that inspect does here is call to_a.inspect on the relation. Following the code in Relation, the to_a method looks like this:
def to_a
  return @records if loaded?

  @records = eager_loading? ? find_with_associations : @klass.find_by_sql(arel.to_sql)

  preload = @preload_values
  preload += @includes_values unless eager_loading?
  preload.each {|associations| @klass.send(:preload_associations, @records, associations) }

  # @readonly_value is true only if set explicitly. @implicit_readonly is true if there
  # are JOINS and no explicit SELECT.
  readonly = @readonly_value.nil? ? @implicit_readonly : @readonly_value
  @records.each { |record| record.readonly! } if readonly

  @loaded = true
  @records
end
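The load-once behaviour of to_a, together with inspect triggering it, can be sketched in plain Ruby. LazyList and its fetch_count counter are hypothetical names made up for this example, and the block stands in for the database query.

```ruby
# LazyList is a hypothetical stand-in that defers its "query" until needed.
class LazyList
  attr_reader :fetch_count

  def initialize(&fetcher)
    @fetcher = fetcher
    @loaded = false
    @fetch_count = 0
  end

  def loaded?
    @loaded
  end

  # Fetch the records on first use, then serve the cached copy.
  def to_a
    return @records if loaded?

    @fetch_count += 1
    @records = @fetcher.call
    @loaded = true
    @records
  end

  # The console calls inspect on a result, which is what triggers the fetch.
  def inspect
    to_a.inspect
  end
end

tasks = LazyList.new { ["weed garden", "mow lawn"] }
p tasks.loaded?      # => false
puts tasks.inspect   # runs the block
puts tasks.inspect   # served from @records
p tasks.fetch_count  # => 1
```

Nothing is fetched when the object is built, which is why calling class on a relation in the console never touches the database.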
This method returns the records if they have already been loaded; otherwise it fetches them and then returns them. The interesting part of this method is the part that fetches the records, specifically this: @klass.find_by_sql(arel.to_sql). This code calls find_by_sql on a model, in this case our Task model, and passes in arel.to_sql. The arel method that is used here is defined in the QueryMethods module that we saw earlier. All this method does is call another method called build_arel and cache the result in an instance variable, and it’s in the build_arel method where all of the work takes place.
def build_arel
  arel = table

  arel = build_joins(arel, @joins_values) unless @joins_values.empty?

  (@where_values - ['']).uniq.each do |where|
    case where
    when Arel::SqlLiteral
      arel = arel.where(where)
    else
      sql = where.is_a?(String) ? where : where.to_sql
      arel = arel.where(Arel::SqlLiteral.new("(#{sql})"))
    end
  end

  arel = arel.having(*@having_values.uniq.select{|h| h.present?}) unless @having_values.empty?

  arel = arel.take(@limit_value) if @limit_value
  arel = arel.skip(@offset_value) if @offset_value

  arel = arel.group(*@group_values.uniq.select{|g| g.present?}) unless @group_values.empty?

  arel = arel.order(*@order_values.uniq.select{|o| o.present?}) unless @order_values.empty?

  arel = build_select(arel, @select_values.uniq)

  arel = arel.from(@from_value) if @from_value
  arel = arel.lock(@lock_value) if @lock_value

  arel
end
This method fetches the Arel::Table that we saw earlier and then builds up a query, converting all of the data that we’ve been storing inside the Relation object into an Arel query which it then returns. Back in the Relation class the to_a method calls to_sql on this Arel query to convert it to SQL and then calls find_by_sql on the model so that an array of the appropriate records is returned.
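The idea of collecting values first and turning them into a query only at the end can be sketched without Arel. The MiniRelation class below is hypothetical and joins plain strings where Rails builds an Arel query. It accepts only raw SQL fragments and does no escaping, so it’s an illustration of the flow rather than something to use on real input.

```ruby
# MiniRelation is a hypothetical stand-in that builds SQL text directly.
class MiniRelation
  attr_accessor :where_values, :order_values, :limit_value

  def initialize(table_name)
    @table_name = table_name
    @where_values = []
    @order_values = []
    @limit_value = nil
  end

  def where(sql)
    clone.tap { |r| r.where_values += [sql] }
  end

  def order(*args)
    clone.tap { |r| r.order_values += args }
  end

  def limit(value)
    clone.tap { |r| r.limit_value = value }
  end

  # Nothing is assembled until this point; the methods above only store values.
  def to_sql
    sql = "SELECT * FROM #{@table_name}"
    conditions = (where_values - [""]).uniq
    sql += " WHERE " + conditions.map { |c| "(#{c})" }.join(" AND ") unless conditions.empty?
    sql += " ORDER BY " + order_values.uniq.join(", ") unless order_values.empty?
    sql += " LIMIT #{limit_value}" if limit_value
    sql
  end
end

relation = MiniRelation.new("tasks").where("priority = 3").order("name").limit(2)
puts relation.to_sql
# prints: SELECT * FROM tasks WHERE (priority = 3) ORDER BY name LIMIT 2
```

As in build_arel, blank and duplicate conditions are dropped before the query is put together, and each clause is added only when a value has been stored for it.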
Now that we have a basic understanding of how this class works, there are a lot of other methods that we can explore by browsing the code in Relation. For example, the create method calls another method called scoping and, inside its block, calls create on the @klass. This creates a new instance of the model, and the scoping method adds the relation itself to @klass’s scoped methods for the duration of the block. What this means is that anything executed inside a scoping block will be scoped as if it were called directly on that relation object. The modules are worth exploring too, especially QueryMethods. There are a number of methods in there that you may not be aware of, for example reorder, which resets the order arguments rather than appending to them as order does.
def order(*args)
  clone.tap {|r| r.order_values += args if args.present? }
end

def reorder(*args)
  clone.tap {|r| r.order_values = args if args.present? }
end
There is also a reverse_order method that will reverse the direction of the order clause.
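Returning to create and scoping for a moment, the idea of a relation registering itself with its model while a block runs can be sketched in plain Ruby. Everything here is hypothetical: MiniTask, MiniRelation, scope_stack and defaults are names invented for the example, and the real Rails bookkeeping is more involved.

```ruby
# MiniTask and MiniRelation are hypothetical stand-ins, not Rails classes.
class MiniTask
  attr_reader :attributes

  def initialize(attributes)
    @attributes = attributes
  end

  class << self
    # The relations currently in effect for this class, innermost last.
    def scope_stack
      @scope_stack ||= []
    end

    # New records pick up defaults from the innermost active relation.
    def create(attributes = {})
      current = scope_stack.last
      defaults = current ? current.defaults : {}
      new(defaults.merge(attributes))
    end
  end
end

class MiniRelation
  attr_reader :klass, :defaults

  def initialize(klass, defaults)
    @klass = klass
    @defaults = defaults
  end

  # Register this relation while the block runs, then always remove it.
  def scoping
    klass.scope_stack.push(self)
    yield
  ensure
    klass.scope_stack.pop
  end

  def create(attributes = {})
    scoping { klass.create(attributes) }
  end
end

urgent = MiniRelation.new(MiniTask, :priority => 3)
task = urgent.create(:name => "mow lawn")
p task.attributes[:priority]  # => 3
p task.attributes[:name]      # => "mow lawn"
p MiniTask.scope_stack.empty? # => true
```

The ensure clause is the important design choice: the relation is removed from the stack even if the block raises, so a failed create can’t leave the model scoped by accident.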
The Calculations module contains methods for performing calculations on fields such as average, minimum and maximum. The SpawnMethods module is interesting because it allows you to interact with separate Relation objects, for example merging two relations. There are also except and only methods which we’ve not had time to experiment with yet. The best way to determine what these methods do is to open up the console of a Rails 3 application and try these methods out to see what they do. You can learn a lot of interesting techniques by browsing the code and experimenting with the methods you find in it.
That’s it for this episode on the internals of ActiveRecord::Relation. We encourage you to browse the Rails source code and experiment with any methods that you find that look interesting.