Method chaining is a convenient way to build up complex queries, which are then lazily executed when needed. Within the chain, a single object is updated and passed from one method to the next, until it’s finally transformed into its output value.
Object-relational mappers like Rails’ ActiveRecord use method chaining in their query interfaces to build up a database query by accepting multiple finder methods like #where
, #order
and #limit
.
To illustrate the method chaining concept, the Collection
class is a query interface that mimics an ORM by querying data from an in-memory array of hashes instead of connecting to a database.
The data set is a list of animals, with their common names, classes and types:
data = [ { name: "Blue whale", class: "mammalia", type: "aquatic" }, { name: "European lobster", class: "malacostraca", type: "aquatic" }, { name: "South-American tapir", class: "mammalia", type: "terrestrial" } ]
The query interface exposes the #where
method to query the dataset, which allows multiple calls to combine multiple clauses.
For example, to get all aquatic mammals from the database, pass the #where
clause twice:
require "./collection" Collection .new(data) .where(class: "mammalia") .where(type: "aquatic") .to_a
[{:name=>"Blue whale", :class=>"mammalia", :type=>"aquatic"}]
A note on Enumerator::Lazy
This article explains how method chaining and lazy evaluation work internally and uses an in-memory array as a minimal example.
Since arrays are enumerable, filtering methods can be chained on them through Enumerator::Lazy
instead of writing a custom implementation:
data .lazy .select { |item| item[:class] == "mammalia" } .select { |item| item[:type] == "terrestrial" } .to_a
[{:name=>"South-American tapir", :class=>"mammalia", :type=>"terrestrial"}]
For most real world use cases, using Ruby’s built in lazy enumeration is the best solution.
However, implementing a custom method chain is useful to write query interfaces on objects that do not implement Enumerable
, or when the query needs to be executed in a single pass (unlike the example above, which filters the array twice; once for each call to Enumerable#select
).
The Collection
class
For a custom implementation of method chaining, first implement the #where
method on a new class named Collection
.
A test to verify the behavior of selecting an item from a collection looks like this:
require 'minitest/autorun' require './collection' class TestCollection < MiniTest::Test def test_where whale = { name: "Blue whale", class: "mammalia", type: "aquatic" } lobster = { name: "European lobster", class: "malacostraca", type: "aquatic" } tapir = { name: "South-American tapir", class: "mammalia", type: "terrestrial" } collection = Collection.new([whale, lobster, tapir]) assert_equal([whale], collection.where(name: "Blue whale")) end end
The test asserts that given a collection with two items, one result is selected which matches the #where
clause.
Running the test produces errors stating that the called methods don’t exist until we’ve implemented part of the Collection
class:
class Collection def initialize(data) end def where(where) end end
After adding a Collection
class that has a stub for both of the called methods, the test errors turn into an assertion failure because the #where
method returns nil
instead of the asserted result:
ruby collection_test.rb
Run options: --seed 45181 # Running: F Finished in 0.010833s, 92.3105 runs/s, 92.3105 assertions/s. 1) Failure: TestCollection#test_where [collection_test.rb:26]: --- expected +++ actual @@ -1 +1 @@ -[{:name=>"Blue whale", :class=>"mammalia", :type=>"aquatic"}] +nil 1 runs, 1 assertions, 1 failures, 0 errors, 0 skips
An implementation for #where
An implementation that satisfies the test looks like this:
class Collection def initialize(data) @data = data end def where(where) @data.select { |item| where <= item } end end
The #initialize
method takes the data from its passed argument, and stores it in an instance variable named @data
to be used later.
Because @data
is an array, the #where
method calls Array#select
on it to retrieve a selection of its contents.
It passes a block that checks each item with Hash#<=
, which checks if the where
variable is a subset of the item
hash.
ruby collection_test.rb
Run options: --seed 13425 # Running: . Finished in 0.000539s, 1855.2876 runs/s, 1855.2876 assertions/s. 1 runs, 1 assertions, 0 failures, 0 errors, 0 skips
Chaining #where
calls
The Collection
class needs to chain methods to join multiple where clauses.
The updated test suite adds a test that asserts that the #where
method can be chained.
It also moves the collection to a setup
block, as it’s now used by both tests:
require 'minitest/autorun' require './collection' class TestCollection < MiniTest::Test def setup @whale = { name: "Blue whale", class: "mammalia", type: "aquatic" } @lobster = { name: "European lobster", class: "malacostraca", type: "aquatic" } @tapir = { name: "South-American tapir", class: "mammalia", type: "terrestrial" } @collection = Collection.new([@whale, @lobster, @tapir]) end def test_where assert_equal([@whale], @collection.where(name: "Blue whale")) end def test_where_multiple assert_equal( [@tapir], @collection.where(class: "mammalia").where(type: "terrestrial") ) end end
The new test fails because the second call to #where
is executed on the result of the first:
ruby collection_test.rb
Run options: --seed 33677 # Running: .E Finished in 0.000555s, 3603.6036 runs/s, 1801.8018 assertions/s. 1) Error: TestCollection#test_where_multiple: NoMethodError: undefined method `where' for [{:name=>"Blue whale", :class=>"mammalia", :type=>"aquatic"}, {:name=>"South-American tapir", :class=>"mammalia", :type=>"terrestrial"}]:Array collection_test.rb:34:in `test_where_multiple' 2 runs, 1 assertions, 0 failures, 1 errors, 0 skips
In the current implementation, the selection is immediately executed when the first #where
is called on the collection.
The implementation needs to combine the received #where
clauses, and introduce laziness to execute the selection at the last possible moment.
This example combines the received where clauses by merging them into an internal @where
instance variable:
class Collection def initialize(data) @data = data @where = {} end def where(where) @where.merge!(where) self end end
After merging a received where clause into the internal variable, the where
method returns self
.
This allows for chaining more calls to #where
at the end, but removes the filtering, meaning the function does not return filtered data anymore.
ruby collection_test.rb
Run options: --seed 35630 # Running: FF Finished in 0.015616s, 128.0738 runs/s, 128.0738 assertions/s. 1) Failure: TestCollection#test_where [collection_test.rb:28]: --- expected +++ actual @@ -1 +1 @@ -[{:name=>"Blue whale", :class=>"mammalia", :type=>"aquatic"}] +#<Collection:0xXXXXXX @data=[{:name=>"Blue whale", :class=>"mammalia", :type=>"aquatic"}, {:name=>"European lobster", :class=>"malacostraca", :type=>"aquatic"}, {:name=>"South-American tapir", :class=>"mammalia", :type=>"terrestrial"}], @where={:name=>"Blue whale"}> 2) Failure: TestCollection#test_where_multiple [collection_test.rb:32]: --- expected +++ actual @@ -1 +1 @@ -[{:name=>"South-American tapir", :class=>"mammalia", :type=>"terrestrial"}] +#<Collection:0xXXXXXX @data=[{:name=>"Blue whale", :class=>"mammalia", :type=>"aquatic"}, {:name=>"European lobster", :class=>"malacostraca", :type=>"aquatic"}, {:name=>"South-American tapir", :class=>"mammalia", :type=>"terrestrial"}], @where={:class=>"mammalia", :type=>"terrestrial"}> 2 runs, 2 assertions, 2 failures, 0 errors, 0 skips
The tests fail, because the #where
method no longer produces the asserted result.
Resolving the selection
Being lazy, the method chain needs to eventually be resolved to produce results instead of returning self
.
This version of the tests forces the results into an array by calling the #to_a
method on the end of each #where
-chain:
require 'minitest/autorun' require './collection' class TestCollection < MiniTest::Test def setup @whale = { name: "Blue whale", class: "mammalia", type: "aquatic" } @lobster = { name: "European lobster", class: "malacostraca", type: "aquatic" } @tapir = { name: "South-American tapir", class: "mammalia", type: "terrestrial" } @collection = Collection.new([@whale, @lobster, @tapir]) end def test_where assert_equal([@whale], @collection.where(name: "Blue whale").to_a) end def test_where_multiple assert_equal( [@tapir], @collection .where(class: "mammalia") .where(type: "terrestrial") .to_a ) end end
Running the tests now produces more errors because Collection
does not implement the #to_a
method.
By reusing the code from a previous version of the #where
method, this implementation of #to_a
uses the @where
instance variable to execute the query:
class Collection def initialize(data) @data = data @where = {} end def where(where) @where.merge!(where) self end def to_a @data.select { |item| @where <= item } end end
Running the test again shows both tests passing:
ruby collection_test.rb
Run options: --seed 16140 # Running: .. Finished in 0.000800s, 2500.0000 runs/s, 2500.0000 assertions/s. 2 runs, 2 assertions, 0 failures, 0 errors, 0 skips