A journey to optimize a SpringData SQL query with N+1s
Loïc Chau
Posted on November 20, 2024
Some context
While browsing my app in pre-production, I noticed that the homepage was starting to take a long time to load...
Investigating in my APM
I looked in Kibana's APM (Application Performance Management) and...
A route with 13s and another with 6.5s average latency?????? 🤯
That's much more than I'd want...
The N+1 problem
It quickly becomes clear that we have a pretty severe N+1 problem.
To help you better picture the DB schema :
In addition to the expected SELECT
query to retrieve a project line in DB, we execute 849 additional SELECT
queries on Relation_1, which takes 3.6s out of the 4.2s total duration of the HTTP call 🫠
The worst part : I don't even need the Relation_1 information in this endpoint 🥲 so I'm just wasting ressources
We can't see it on the capture but there are another 200s SELECT
to get data from Relation_2, so we have a double N+1 queries issue.
The EAGER
toxic relation
The culprit in the code was the FetchType.EAGER
on the Relation1 and Relation2 relationships (or so I thought at this point).
Misconception: A FetchType.EAGER helps avoid N+1 issues because it fetches everything at once, so JPA must be optimizing and performing JOINs.
Reframe: FetchType.EAGER ensures that database reads are transactional but not optimized. In fact, using FetchType.EAGER guarantees that you’re always in the worst-case scenario for N+1 issues 😱 → which was exactly my case.
After looking around in Baeldung I discovered Hibernate's FetchMode
:
In general, FetchMode defines how Hibernate will fetch the data (by select, join or subselect). FetchType, on the other hand, defines whether Hibernate will load data eagerly or lazily.
The exact rules between these two are as follows:
- if the code doesn’t set FetchMode, the default one is JOIN and FetchType works as defined
- with FetchMode.SELECT or FetchMode.SUBSELECT set, FetchType also works as defined
- with FetchMode.JOIN set, FetchType is ignored and a query is always eager
It seemed to me like Hibernate uses optimized parameters since it uses JOINs by default. I also thought that it would use FetchType.LAZY by default.
False assumption (but I didn’t know it at the time): Hibernate is well-designed, and by default, all joins have the best default settings: FetchMode.JOIN and FetchType.LAZY.
Reality: This is actually false (as we’ll see later).
❌ Removing the FetchType.EAGER
and, if needed, replace it with FetchMode.JOIN
(result: -0s).
I kept some FetchMode.JOIN
annotations to optimize queries in cases where we’re certain to always fetch the tables together. (Reality: otherwise, my tests broke 🤪).
Under the hood, FetchMode.JOIN
forces FetchType.EAGER
, but it optimizes the query by performing a join as expected, so in theory we should have no more N+1 queries.
However, in practice, FetchMode.JOIN
is essentially useless in my app... It’s ignored because we fetch data with SpringData 😩 I guess it is therefore essentially the same as having a FetchType.EAGER
Spring data internally use Criteria API for the findAll() query , so @Fetch(value = FetchMode.JOIN) will not have any effect.
StackOverflow thread for more info
Results
Even though FetchMode.JOIN
doesn't work, at least the relations where I removed the fetchType=EAGER
should now be LAZY
and my N+1 problem should improve... Right ??? But nothing has changed 😶
Why ????
Actually, the @OneToOne
relationship defaults to FetchType.EAGER
💀 as you can see in the source code of the annotation.
So, we should explicitly set FetchType.LAZY
on the relationship…
But I think a much more interesting question is: why, in the name of Jesus of Nazareth, doesn’t Spring Data generate a join for us??? We’ll still try both approaches to see what happens.
Setting FetchType.LAZY
on the @OneToOne ❌
The problem occurs when fetching projects, so I tried setting FetchType.LAZY
on ProjectModel
Relation_1 relationship. But… Project
isn’t the owner of the relationship 💀 which means the fetch is still EAGER!!! AAAAAAAAAAAH 🗿 (thanks IntelliJ for pointing a out this issue with a warning by the way)
Unless you are using Bytecode Enhancement, you cannot fetch lazily the parent-side @OneToOne association.
StackOverflow thread - How can I make a JPA OneToOne relation lazy
I hope I won't need to make Bytecode enhancement because it seems a bit overkill for my use case 🙃
💡 Learning moment: Defining a @OneToOne relationship on both sides when it’s not necessary is a bad idea because the non-owning side will always default to FetchType.EAGER, leading to a ton of N+1 issues 💀
Gladly, in my case, a solution to the N+1 problem is to remove the @OneToOne
relationship annotation on Relation_1
in the Project
model since we never query the project alongside Relation_1
.
Or, alternatively, I could ensure the relation is properly fetched with a JOIN query — though this is less optimal because it will unnecessarily load the memory with unneeded Relation_1
information when fetching projects.
Kindly ask Spring Data to perform JOINs 🤗 (😠)
As we saw earlier, FetchMode.JOIN
is useless with Spring Data because it internally uses the Criteria API, which doesn’t consider FetchMode
(thank you Criteria API).
The solution is to use Entity Graphs 🥹🥹 the savior.
Entity Graph allows us to specify to Spring Data which joins it should make.
Example :
On your model :
@Entity
// Define an EntityGraph named "Project.forDomain"
@NamedEntityGraph(name = "ProjectModel.forDomain",
attributeNodes = {
// This entity graph specifies to eagerly fetch (with JOINs)
// the following property of the model
@NamedAttributeNode("relation2")
})
public class ProjectModel {
@OneToOne(mappedBy = "project", fetch = FetchType.LAZY)
public Relation2 relation2;
}
Then in the SpringData repository
@Repository
public interface ProjectJpaRepository {
// This method will use the entity graph "ProjectModel.forDomain"
// to know how to fetch the SQL data
@EntityGraph(value="ProjectModel.forDomain", type = EntityGraph.EntityGraphType.FETCH)
List<ProjectModel> findBySomeProperty(String someProperty);
}
Little bit of thinking 🤔
Should we use type = EntityGraphType.LOAD
or type = EntityGraphType.FETCH
?
LOAD
= the properties of the graph are EAGER JOIN, while others follow their default fetch specs.
FETCH
= the properties of the graph are EAGER JOIN, while others are LAZY loaded.
I’ll prefer FETCH
for performance by default and specify EAGER JOINs if I need data to be fetched together.
StackOverflow thread - What is the difference between FETCH and LOAD for Entity graph of JPA?
Well, after all these twists and turns... the entity graph works! 🥹🥹🥹🥹🥹🥹🥹🥹🥹🥹
You didn’t think it was over, did you? 😊
Removing the relation1
property from projectModel
means we no longer benefit from orphanRemoval
🥲 so some of my tests broke.
In the end, it was pretty quick to fix though.
Conclusion
Removing the reference to relation1
in ProjectModel
helped eliminate the N+1 issues on one of my routes, N+1 which were solely caused by the EAGER @OneToOne
on the non-owning side.
And the entity graph helped eliminate the N+1 issues on the other route 😍, which was plagued by the relationship with relation2
which wasn't done with a JOIN.
💡 Thinking back on how to debug SQL performance issues
To iterate: write SQL tests!!!!! It helps you iterate so much faster than by manually testing your app 😱
There is an initial cost to setup a test reproducing your issue but I swear it is worth it.
By enabling show-sql: true
in the application.yml
for the test environment, you can see the SQL queries that are being executed.
I hope you learned something from my optimization journey and had some fun as well 😄
If you have any questions ask in the comments I'll respond to the best of my capacities 😉
Posted on November 20, 2024
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
October 15, 2024
August 26, 2024