Understanding multiple joins in ActiveRecord
Ana Nunes da Silva
Posted on March 29, 2021
In part I and part II of this series, we used mostly single joins
and left_joins
to filter an ActiveRecord collection based on a direct association.
How can we filter a collection based on multiple associations?
Let's start by looking at our domain models: Host
, Accommodation
, Booking
, Payment
, Rating
, and Tag
.
class Host < ApplicationRecord
has_many :accommodations
end
class Accommodation < ApplicationRecord
belongs_to :host
has_many :bookings
has_many :tags
end
class Booking < ApplicationRecord
belongs_to :accommodation
has_many :ratings
end
class Payment < ApplicationRecord
belongs_to :booking
end
class Rating < ApplicationRecord
belongs_to :booking
end
class Tag < ApplicationRecord
belongs_to :accommodation
end
Host.all
Id | Name |
---|---|
100 | "Bennie Lubowitz" |
101 | "Soon Goyette" |
102 | "Aimee Douglas" |
Accommodation.all
Id | Name | Host Id |
---|---|---|
121 | "Charming House" | 100 |
122 | "Lisbon Flat" | 101 |
123 | "Le Petit Chalet" | 102 |
124 | "The Farm House" | 102 |
125 | "Beach House" | 100 |
Booking.all
Id | Accommodation Id | Check In | Check Out |
---|---|---|---|
62 | 121 | 20 Jun 2020 | 23 Jun 2020 |
63 | 122 | 07 Aug 2021 | 08 Aug 2021 |
64 | 123 | 22 May 2021 | 28 May 2021 |
65 | 124 | 01 Apr 2021 | 02 Apr 2021 |
Payment.all
Id | Amount | Booking Id |
---|---|---|
50 | 450 | 62 |
51 | 100 | 63 |
52 | 770 | 64 |
52 | 150 | 65 |
Rating.all
Id | Rate | Booking Id |
---|---|---|
25 | 4 | 62 |
26 | 3 | 64 |
27 | 5 | 63 |
Tag.all
Id | Name | Accommodations Id |
---|---|---|
10 | "Pool | 121 |
11 | "Countryside" | 121 |
12 | "Riverview" | 122 |
Multiple associations
Get all the bookings that have at least one payment and one rating
Looking at our models, we see that both payments and ratings are directly associated with bookings. The ActiveRecord method joins
accepts multiple direct associations, separated by a comma. So in our case, that will be:
Booking.joins(:payments, :ratings)
This will produce the following SQL:
SELECT "bookings".* FROM "bookings"
INNER JOIN "payments" ON "payments"."booking_id" = "bookings"."id"
INNER JOIN "ratings" ON "ratings"."booking_id" = "bookings"."id"
And return the following bookings:
Id | Accommodation Id | Check In | Check Out |
---|---|---|---|
62 | 121 | 20 Jun 2020 | 23 Jun 2020 |
63 | 122 | 07 Aug 2021 | 08 Aug 2021 |
64 | 123 | 22 May 2021 | 28 May 2021 |
Note that although there is a payment for booking id
65, there is no rating for this same booking, so it will not be included in the returned collection. If we'd have no ratings at all, no bookings would be returned. This means that combining inner joins is cumulative. Using our example, we will only return bookings that have payments AND ratings.
It could be the case though, that you need to return bookings with payments that might or might not have an associated rating. The 'might or might not have' condition calls for a left_joins
:
Booking.joins(:payments).left_joins(:ratings)
This will produce the following SQL:
SELECT "bookings".* FROM "bookings"
INNER JOIN "payments" ON "payments"."booking_id" = "bookings"."id"
LEFT OUTER JOIN "ratings" ON "ratings"."booking_id" = "bookings"."id"
And return the following bookings:
Id | Accommodation Id | Check In | Check Out |
---|---|---|---|
62 | 121 | 20 Jun 2020 | 23 Jun 2020 |
63 | 122 | 07 Aug 2021 | 08 Aug 2021 |
64 | 123 | 22 May 2021 | 28 May 2021 |
65 | 124 | 01 Apr 2021 | 02 Apr 2021 |
Now, all bookings with payments are included, even the booking id
65. This will return exactly the same ActiveRecord objects that Bookings.joins(:payments)
returns. The difference is that you will have ratings data available to query.
For instance, we can now do:
Booking.joins(:payments).left_joins(:ratings).where(ratings: { id: nil })
Which is the same as saying: get all the bookings that have payments but no ratings.
Nested joins
Get all hosts that have bookings for their accommodations
The premise here has a slight, but important, nuance from the first exercise. Before, both joined tables - payments and ratings - had a direct relationship with bookings.
But now we're dealing with nested associations - accommodations are directly associated with hosts but bookings are directly associated with accommodations, not to hosts.
So instead of passing two associations separated by a comma, we can pass the nested associations as a key-value pair:
Host.joins(accommodations: :bookings)
Notice the two INNER JOIN
s in the SQL below:
SELECT "hosts".* FROM "hosts"
INNER JOIN "accommodations" ON "accommodations"."host_id" = "hosts"."id"
INNER JOIN "bookings" ON "bookings"."accommodation_id" = "accommodations"."id"
We're first joining accommodations to hosts and then joining bookings to accommodations.
Let's have a look at the returned collection now:
Id | Name |
---|---|
100 | "Bennie Lubowitz" |
101 | "Soon Goyette" |
102 | "Aimee Douglas" |
Although not all accommodations have bookings (the "Beach House" doesn't), all hosts have at least one accommodation that has bookings.
Multi-level Nested joins
Get all hosts that have accommodations, where those accommodations have bookings with ratings, and where those accommodations also have tags
This case is also a bit different from the previously nested joins we analyzed. This time we will have two subsets of nested joins happening, both in relation to accommodations.
We will need to travel from accommodations to ratings, through bookings, and also from accommodations to tags. In ActiveRecord, that would be:
Host.joins(accommodations: [{ bookings: :ratings }], :tags)
Let's look at the SQL behind it to understand what's going on:
SELECT "hosts".* FROM "hosts"
INNER JOIN "accommodations" ON "accommodations"."host_id" = "hosts"."id"
INNER JOIN "bookings" ON "bookings"."accommodation_id" = "accommodations"."id"
INNER JOIN "ratings" ON "ratings"."booking_id" = "bookings"."id"
INNER JOIN "tags" ON "tags"."accommodation_id" = "accommodations"."id"
There's a first nesting that starts with accommodations, gets all their matching bookings, and then gets all the ratings that match the joined bookings.
Then there's a second nesting that gets the accommodations and their matching tags.
Run these two nested joins together and we get the two hosts that meet all these requirements:
Id | Name |
---|---|
100 | "Bennie Lubowitz" |
101 | "Soon Goyette" |
A few personal remarks and tips
Multiple joins can get complex very quickly. There a few tips that can help you make sense of what's going on at the SQL level:
Make use of the .to_sql
method
ActiveRecord can make simple queries easier to read but that's not so true for complex queries. If you're not sure what is the SQL your ActiveRecord query is building, you can call .to_sql
to get the returned SQL statement, similar to the SQL snippets I've added throughout these exercises.
Remember you can pass SQL strings to ActiveRecord
To avoid being lost in the ActiveRecord -> SQL translations, I usually pass SQL strings to the ActiveRecord query methods instead. I find it more explicit and flexible in complex queries.
Multiple joins can make your queries slow
My last advice is to always be aware of any potential performance issues when you're working with multiple joins. Joining tables can be a heavy job for both your database and ActiveRecord.
__
So far we've been using joins to filter data. In Part IV, the last one of this series, we'll see how to access the joined data through the returned ActiveRecord objects.
Posted on March 29, 2021
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.