Delving into the Black Magic of GraphQL DataLoader! 🌌✨
Hleb Bandarenka
Posted on January 16, 2024
Who should read this?
If you've got GraphQL experience, encountered the N+1 problem, and used DataLoader to solve it but are still unclear on how it works, you're in the right spot.
Prerequisites:
- Familiarity with the DataLoader pattern
- Understanding of the Event Loop
Let's rock 🚀🤘🎸
When I began working with GraphQL, I had concerns about the N+1 query problem. In my research, I came across the DataLoader pattern and its implementation on GitHub. While I explored various examples of its usage, I still struggled to grasp how it operates internally. Join me in delving a bit deeper into GraphQL DataLoader! :)
I hope my readers are already familiar with how the Event Loop works. If not, I highly recommend checking out this insightful series of articles with excellent visualizations here.
For our current discussion, pay special attention to Part 2 - Bonus experiment. This experiment demonstrates that the nextTick operation triggered inside a Promise will execute after all other Promises have been completed.
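To see this ordering for yourself, here is a minimal sketch of such an experiment (not the code from the linked series; the labels are made up for illustration). It assumes Node.js 11 or later, where the nextTick queue is only processed once the promise (microtask) queue has been drained.

```javascript
// Records the order in which the callbacks actually run.
const order = [];

Promise.resolve().then(() => {
  order.push("promise 1");
  // Scheduled from inside a promise callback: it has to wait until
  // every pending promise job has finished.
  process.nextTick(() => order.push("nextTick inside promise 1"));
  // Even a promise created here still runs before that nextTick.
  Promise.resolve().then(() => order.push("promise nested in promise 1"));
});

Promise.resolve().then(() => order.push("promise 2"));

// A timer fires only after both the promise and nextTick queues are empty.
setTimeout(() => console.log(order), 0);
// order: "promise 1", "promise 2", "promise nested in promise 1",
//        "nextTick inside promise 1"
```

The nextTick callback was registered before "promise 2" ran, yet it executes last.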
Why is this crucial? ❗️❗️❗️
This experiment illustrates that a nextTick operation triggered inside a Promise will be executed after all other Promises are completed. The emphasis on this aspect is crucial because DataLoader leverages this peculiarity to perform its magic in the enqueuePostPromiseJob function. 🎩✨
// `enqueuePostPromiseJob` function
...
if (!resolvedPromise) {
  resolvedPromise = Promise.resolve();
}
resolvedPromise.then(() => {
  process.nextTick(fn);
});
...
When you create a DataLoader, you provide a BatchLoadFn as a constructor argument. The enqueuePostPromiseJob function serves as the default batchScheduleFn, responsible for scheduling when the BatchLoadFn should be invoked (the fn argument is dispatchBatch, which in turn calls the BatchLoadFn).
This process kicks off when you first invoke the load method on DataLoader. That's when enqueuePostPromiseJob starts its job. 🤯
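To make the flow more concrete, here is a rough sketch of the idea. MiniLoader is a hypothetical, stripped-down stand-in written for this post, not DataLoader's real implementation: it has no caching and no options, and it only shows how the first load schedules one dispatch for the whole batch.

```javascript
// Same scheduling trick as the excerpt above: wait for pending promise
// jobs, then run `fn` on the next tick.
function enqueuePostPromiseJob(fn) {
  Promise.resolve().then(() => process.nextTick(fn));
}

// Hypothetical stand-in for DataLoader (batching only, no cache).
class MiniLoader {
  constructor(batchLoadFn) {
    this.batchLoadFn = batchLoadFn;
    this.batch = null; // the batch currently collecting keys, if any
  }

  load(key) {
    if (!this.batch) {
      // First load of a new batch: schedule the dispatch exactly once.
      const newBatch = { keys: [], callbacks: [] };
      this.batch = newBatch;
      enqueuePostPromiseJob(() => this.dispatchBatch(newBatch));
    }
    const batch = this.batch;
    return new Promise((resolve, reject) => {
      batch.keys.push(key);
      batch.callbacks.push({ resolve, reject });
    });
  }

  async dispatchBatch(batch) {
    this.batch = null; // loads made from now on start a fresh batch
    try {
      const values = await this.batchLoadFn(batch.keys);
      batch.callbacks.forEach(({ resolve }, i) => resolve(values[i]));
    } catch (error) {
      batch.callbacks.forEach(({ reject }) => reject(error));
    }
  }
}

// Usage: keys loaded synchronously and from a promise callback share a batch.
const calls = [];
const loader = new MiniLoader(async (keys) => {
  calls.push([...keys]); // remember every invocation of the batch function
  return keys.map((key) => key * 10);
});

const all = Promise.all([loader.load(1), loader.load(2)]);
Promise.resolve().then(() => loader.load(3));
all.then((values) => console.log(values, calls));
// values: [10, 20], calls: [[1, 2, 3]]
```

Key 3 is loaded from a promise callback, after the dispatch was already scheduled, and it still lands in the first and only batch.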
Did my message make sense? I'm a bit unclear myself.
I hope the diagram I provided will help clarify what I wanted to say.
What does it mean for us? 🤔
It signifies that a single batch gathers all IDs passed to load during synchronous invocation, as well as those loaded within nextTick callbacks and Promises, even if the Promise IDs were defined within another Promise. However, it doesn't include IDs loaded from a nextTick callback that was scheduled inside a Promise; those end up in the next batch.
P.S. This is true if the initial load was invoked synchronously; otherwise, all IDs will be collected.
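The P.S. case can be checked with a tiny simulation. This is only a sketch: the load function below is a hypothetical stand-in that records which keys end up in which batch, using the same scheduling as enqueuePostPromiseJob, and it never calls a real batch function.

```javascript
// Same scheduling as DataLoader's default: after pending promise jobs.
function enqueuePostPromiseJob(fn) {
  Promise.resolve().then(() => process.nextTick(fn));
}

const batches = []; // every batch that was opened, in order
let current = null; // the batch still collecting keys, if any

// Hypothetical stand-in for dataLoader.load: it only records the key.
function load(key) {
  if (!current) {
    current = [];
    batches.push(current);
    // "Dispatching" here just closes the batch.
    enqueuePostPromiseJob(() => {
      current = null;
    });
  }
  current.push(key);
}

// The initial load happens inside a promise callback, not synchronously.
Promise.resolve().then(() => {
  load(1);
  // This nextTick is queued BEFORE the dispatch tick gets queued,
  // so its key still makes it into the first batch.
  process.nextTick(() => load(2));
  Promise.resolve().then(() => load(3));
});

setTimeout(() => console.log(batches), 0);
// batches: [[1, 3, 2]]
```

Because the dispatch is itself only put on the nextTick queue one promise job later, the nextTick from inside the Promise runs first, which is why all IDs are collected in this case.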
index.js
const DataLoader = require("dataloader");
// Local module (not shown); judging by the output below, findAll(ids)
// logs "Querying ids: ..." and returns the matching rows.
const db = require("./database");

// Create a batch loading function
async function batchLoadFunction(ids) {
  const results = await db.findAll(ids);
  // Return the results in the same order as the keys
  return ids.map((key) => results.find((result) => result.id === key));
}

// Create a new DataLoader instance
const dataLoader = new DataLoader(batchLoadFunction);

// Use the DataLoader to load data
(async () => {
  // 1. Sync calls
  const p1 = dataLoader.load(1);
  const p2 = dataLoader.load(2);
  Promise.all([p1, p2]).then((results) => {
    console.log(results);
  });

  // 2. Next tick calls
  process.nextTick(() => {
    console.log("next tick");
    const p3 = dataLoader.load(3);
    const p4 = dataLoader.load(4);
    Promise.all([p3, p4]).then((results) => {
      console.log(results);
    });
  });

  // 3. Promise calls
  Promise.resolve().then(() => {
    console.log("promise");
    const p5 = dataLoader.load(5);
    const p6 = dataLoader.load(6);
    Promise.all([p5, p6]).then((results) => {
      console.log(results);
    });

    // 4. Next tick inside promise
    process.nextTick(() => {
      console.log("next tick inside promise resolve handle");
      const p7 = dataLoader.load(7);
      const p8 = dataLoader.load(8);
      Promise.all([p7, p8]).then((results) => {
        console.log(results);
      });
    });

    // 5. Promise inside promise
    Promise.resolve().then(() => {
      console.log("promise inside promise resolve handle");
      const p9 = dataLoader.load(9);
      const p10 = dataLoader.load(10);
      Promise.all([p9, p10]).then((results) => {
        console.log(results);
      });
    });
  });
})();
Result:
next tick
promise
promise inside promise resolve handle
Querying ids: 1,2,3,4,5,6,9,10
next tick inside promise resolve handle
[ { id: 1, name: 'John', age: 25 }, { id: 2, name: 'Jane', age: 30 } ]
[ { id: 3, name: 'Bob', age: 35 }, { id: 4, name: 'Alice', age: 28 } ]
[ { id: 5, name: 'Mike', age: 32 }, { id: 6, name: 'Sarah', age: 27 } ]
[ { id: 9, name: 'Michael', age: 31 }, { id: 10, name: 'Sophia', age: 26 } ]
Querying ids: 7,8
[ { id: 7, name: 'David', age: 33 }, { id: 8, name: 'Emily', age: 29 } ]
Clearly, the next tick inside promise callback was triggered after the BatchLoadFn had already been invoked. Consequently, the IDs from that nextTick joined the second invocation of BatchLoadFn.
How can we use it?
Soooo, if we incorporate DataLoader within Promises, everything will function as anticipated. Now, we have a clear understanding of the reasons behind it. 😊🎉
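As an illustration, here is a sketch of the classic N+1 situation: several resolvers, each running inside its own promise chain, ask for one author each. Everything in it is hypothetical (createLoader, loadUser, resolveAuthor and the sample posts are made up, and createLoader is a compact stand-in for DataLoader without caching or error handling), but it uses the same scheduling as enqueuePostPromiseJob.

```javascript
// Hypothetical, compact stand-in for DataLoader (batching only).
function createLoader(batchLoadFn) {
  let batch = null;
  return function load(key) {
    if (!batch) {
      const scheduled = { keys: [], resolvers: [] };
      batch = scheduled;
      // Dispatch after pending promise jobs, like enqueuePostPromiseJob.
      Promise.resolve().then(() =>
        process.nextTick(async () => {
          batch = null; // later loads open a new batch
          const values = await batchLoadFn(scheduled.keys);
          scheduled.resolvers.forEach((resolve, i) => resolve(values[i]));
        })
      );
    }
    const target = batch;
    return new Promise((resolve) => {
      target.keys.push(key);
      target.resolvers.push(resolve);
    });
  };
}

const queries = []; // one entry per simulated database round trip
const loadUser = createLoader(async (ids) => {
  queries.push([...ids]);
  return ids.map((id) => ({ id, name: `user-${id}` }));
});

// Hypothetical data and resolver: every post asks for its own author.
const posts = [
  { id: 1, authorId: 7 },
  { id: 2, authorId: 8 },
  { id: 3, authorId: 9 },
];
const resolveAuthor = async (post) => loadUser(post.authorId);

// Each resolver runs inside its own promise chain.
const done = Promise.all(
  posts.map((post) => Promise.resolve(post).then(resolveAuthor))
).then((authors) => {
  console.log(authors.map((author) => author.name), queries);
  return queries;
});
// names: user-7, user-8, user-9; queries: [[7, 8, 9]]
```

Three separate load calls made from three promise callbacks result in a single round trip instead of three.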
I hope this post has shed some light on the subject. If you're keen on a more in-depth understanding, I encourage you to take a look at the source code yourself. At the very least, you now have a solid foundation of understanding.
Bonus
How does this pattern operate in other languages, like Java? Unfortunately, Java has no single-threaded event loop with prioritized task queues (nothing like nextTick) that a library could hook into to detect that all load calls have been made, so the solution is not as elegant and necessitates manual dispatching.
dataloader.load("A");
dataloader.load("B");
dataloader.load("A");
dataloader.dispatch(); // in Java you have to manually invoke `dispatch` function