Task queues and why we need them
Sarbik Betal
Posted on June 21, 2020
Cover photo: © Unsplash/Camille Chen
Some Background:
What is a task queue and why do you need it?
Analogy
Well, to answer that question, let's consider a scenario.
There's a restaurant, and the restaurant has several employees (let's say 10): waiters, chefs, a cashier, a receptionist, a manager, and so on. Now just recall what happens in a restaurant when you place your order.
- You say what you would like to have. (Request)
- The waiter notes it down and assures you that your food will be ready in a while. (Acknowledge)
- The waiter passes your order to a chef, and the chef adds it to the list of orders. (Enqueue)
- Then the waiter goes to take orders from another customer. (Next request)
- Multiple chefs may be preparing food from the list of orders, one by one or maybe several at a time. (Process)
- After a while, when your food is ready, the chef calls the waiter and hands over the food. (Dequeue)
- The waiter comes and serves you the food. (Response)
- Then the waiter moves on to another customer. (Next request)
The waiter and the chef are decoupled from one another: the waiter takes orders and the chef prepares food independently.
Now imagine the same scenario, but where every employee is capable of doing every kind of job (taking orders, cooking, etc.).
If that were the case, the workflow would change to something like this:
- A waiter arrives, takes your order and tells you that your food will be ready.
- The same waiter goes to the kitchen with your order and starts preparing it.
- When they are done preparing your food, they come back and serve it to you.
You might not see much of a problem here, right? Well, think again: the restaurant has only 10 employees. What would happen if there were 20 or 25 customers waiting to order food?
The former way of handling orders would easily deal with the pressure. But the latter would simply break down, because if all the employees are busy preparing food for the first 10 customers, who is going to take orders from the remaining customers? And if the new customers are not attended to within a few minutes, they will surely leave.
Where do we need them?
When we are building web applications or services that do some heavy lifting on the server, anything that takes more than a few milliseconds or is a long-running job (complex calculations, file handling, data analysis) rather than a simple CRUD operation, we should use a task queue. You can think of this as asynchrony (like Promises or async/await in JS) taken to the next level. It lets us enqueue the task for processing, send the client some kind of acknowledgement immediately, and move on to the next request (like the waiter), leaving the actual processing for later. Another server (or the same server, spinning off another worker instance/process) checks the list for pending tasks and processes them (like the chef). Once it is done with a job, it notifies the API server, which then tells the client that the job is done (through web-sockets, push notifications, emails, or whatever implementation you can think of).
If, instead, the API server processes the job in one go (like the restaurant in the second case), things get really sluggish: the server takes your request, does the heavy lifting (which takes time), and only then responds, all in one shot. The client has to wait while the entire operation completes, and the browser keeps loading until the server finally sends the response. Anyone who sends a request in the meantime has to wait for the server to finish the first request before it can even address the second one. Now imagine the same situation with thousands of requests per second; it would be really slow and painful, and you can imagine that it would result in a very bad UX.
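To make the first approach concrete, here is a minimal sketch of the API-server side in Node.js. It assumes Express and the Bull package (a popular Redis-backed task queue); the `/convert` endpoint, the `conversion` queue name, and the payload fields are illustrative choices, not a prescribed setup.

```js
// api.js - the "waiter": enqueue the job and acknowledge immediately
const express = require('express');
const Queue = require('bull'); // assumes Bull and a running Redis instance

const app = express();
app.use(express.json());

// Illustrative queue name and Redis URL
const conversionQueue = new Queue('conversion', 'redis://127.0.0.1:6379');

app.post('/convert', async (req, res) => {
  try {
    // Enqueue the heavy work instead of doing it here
    const job = await conversionQueue.add({ fileId: req.body.fileId });
    // Acknowledge right away; the actual processing happens elsewhere
    res.json({ job: 'conversion', id: job.id, status: 'ok' });
  } catch (err) {
    res.status(500).json({ job: 'conversion', status: 'failed', reason: err.message });
  }
});

app.listen(3000);
```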
How do we make it work?
Before getting into the details of using a task queue, let me introduce some of the terms used extensively throughout this series.
- Queue - Queues are like real-world queues in which similar jobs/tasks are grouped together, waiting to be processed by a worker in a FIFO (first in, first out) manner.
- Jobs/Tasks - The objects that contain the actual details about the work waiting to be processed.
- Publisher - The one that adds a task to a queue.
- Consumer - Watches the job queue for any pending jobs and sends them for processing.
- Worker - The actual powerhouse which processes the job and reports whether it succeeded or failed. The worker logic can be housed inside the consumer if you wish to do so.
Working of a task queue. © Miguel Grinberg
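To tie those terms together, here is a toy, purely in-memory sketch of my own (not how a real broker-backed queue is implemented; production queues persist jobs in something like Redis or RabbitMQ so they survive restarts):

```js
const queue = []; // Queue: jobs wait here in FIFO order

// Publisher: adds a job to the queue
function publish(job) {
  queue.push(job);
}

// Worker: the powerhouse that actually processes a job
async function work(job) {
  console.log(`Processing ${job.name} job ${job.id}...`);
  // ...heavy lifting would go here...
}

// Consumer: watches the queue and hands pending jobs to the worker
setInterval(async () => {
  const job = queue.shift(); // take the oldest pending job, if any
  if (!job) return;
  try {
    await work(job);
    console.log(`Job ${job.id} succeeded`);
  } catch (err) {
    console.log(`Job ${job.id} failed: ${err.message}`);
  }
}, 100);

publish({ id: 'dcj32q3', name: 'conversion' }); // a Job/Task
```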
Now that you have a basic overview, let's get into the details.
- First, we set up an API server with some endpoints that respond to the client's HTTP requests.
- The API server publishes the job to its respective queue and sends some kind of acknowledgement to the client, like

```json
{
  "job": "conversion",
  "id": "dcj32q3",
  "status": "ok"
}
```

or, in case it fails,

```json
{
  "job": "conversion",
  "id": "dcj32q5",
  "status": "failed",
  "reason": "auth_failed"
}
```

and closes the connection.
- A consumer watches the queue and sends any pending task to a worker for processing.
- The worker processes the job (one or many at a time), reports the `progress` in between (if it wishes to) and dispatches an event once it is done with the job. Note that the task can also fail at this stage, so it dispatches a `success` or a `failure` event which can be handled accordingly (a sketch of this consumer/worker side follows the list).
- The API server queries the `progress` and reports it to the client (through web-sockets or polling XHR/Fetch requests) so that the application can show a nice progress bar in the UI.
- It also listens for the `success` or `failure` events and sends a notification to the client.
- The client can now request the resource through another API call, and the server responds with the requested resource and closes the connection.
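Continuing the hedged Bull-based sketch from earlier, the consumer/worker side could look roughly like this; `job.progress()` and the `completed`/`failed` events map onto the `progress`, `success` and `failure` ideas above, and the result URL is purely illustrative:

```js
// worker.js - the "chef": consume pending jobs and process them
const Queue = require('bull');

const conversionQueue = new Queue('conversion', 'redis://127.0.0.1:6379');

conversionQueue.process(async (job) => {
  // ...do part of the heavy lifting...
  await job.progress(50);  // report progress so the API server can relay it
  // ...finish the work...
  await job.progress(100);
  return { resultUrl: `/files/${job.data.fileId}/converted` }; // illustrative
});

// Fired when a job succeeds: notify the API server / client
conversionQueue.on('completed', (job, result) => {
  console.log(`Job ${job.id} completed`, result);
});

// Fired when a job fails: handle or report the failure
conversionQueue.on('failed', (job, err) => {
  console.log(`Job ${job.id} failed: ${err.message}`);
});
```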
This way the clients are assured immediately that

> Hey, I'm working on your job. I'll notify you once it is done; in the meantime you can do some other stuff.

and no one has to keep waiting for long, and the server can efficiently handle more incoming requests.
The task queue essentially glues all these pieces (the API server and the workers) together and makes them cooperate, shifting the load from the API server to the workers and thus ensuring much lower response times and less downtime.
Conclusion
Hurray! Now you hopefully understand the basics of a task queue, why we need one and what its advantages are. If you think about it, this architecture is highly scalable (horizontally), and increased demand can be addressed by adding more worker processes.
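As a rough illustration of that horizontal scaling (still assuming the Bull-based worker sketched above), you can raise a single worker's concurrency and/or simply run more copies of the worker process against the same Redis instance:

```js
// Assumes the same `conversionQueue` from the worker sketch above.
// Bull's process() accepts an optional concurrency argument, so one
// worker process can handle several jobs at the same time:
conversionQueue.process(5, async (job) => {
  // ...heavy lifting...
});

// And/or start additional worker processes (even on other machines)
// pointing at the same queue, e.g. by running `node worker.js` several
// times under a process manager.
```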
I hope this post was helpful for beginners. If you liked this article, please show some love and stay tuned for more.
Please comment below if you have any questions or suggestions, and feel free to reach out to me.
In the next article, we will see a step-by-step guide on how to set up a simple task queue in Node.js.