Using a task queue vs. just not waiting for Promise to resolve

ccleary00

Corey Cleary

Posted on August 5, 2022

Originally published at coreycleary.me. This is a cross-post from my content blog. I publish new content every week or two, and you can sign up to my newsletter if you'd like to receive my articles directly to your inbox! I also regularly send cheatsheets and other freebies.

When working with Node and JavaScript one of the benefits is that we can make code asynchronous, whether via callbacks or Promises. Instead of having to wait for a line of code to finish executing we can continue on if we don't await or .then() the Promise, or don't nest the callbacks if using those.

You are also likely aware of task queues, where instead of executing the code in your "main" service you create a job/task in a queue, and a consumer watching the queue does the work rather than the "main" service. Rather than being a native asynchronous Node/JS thing, this is an asynchronous pattern at the architecture level.

Usually a task queue is used when you want to offload a longer running block of code and you don't need the results of that code in the rest of your code.
But if we can skip waiting for asynchronous JavaScript code to finish, and keep the code "fast" that way, doesn't that accomplish the same thing?
Why would you need a queue to begin with?

This is an important concept to understand especially as you become more "senior" and are making architecture decisions. So let's explore both and understand what the difference is / why you would want to use one option over the other.

Code processing

When you don't wait for the Promise to resolve, the most important thing to remember is that Node is still processing that Promise from the event loop. It's not like it disappeared, or was sent to some magic factory that does the work for free.
So even if you don't wait for resolution, your server is still executing that code. This is important to point out because you may have a scenario where that execution is computationally expensive (using lots of CPU and/or memory).
So even if you don't wait for it to complete, server performance will be something you need to factor in.
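
To see this for yourself, here is a minimal sketch. The helper names busyWork and measureTimerDelay are made up for illustration, and the busy loop is only a stand-in for real CPU-heavy work. It starts the work without awaiting it and measures how late a 0ms timer fires while the event loop is occupied:

```javascript
// Hypothetical CPU-bound task: spins the CPU for roughly `ms` milliseconds.
// Being an async function does not move this loop off the main thread.
async function busyWork(ms) {
  const start = Date.now()
  while (Date.now() - start < ms) {
    // burn CPU, standing in for something like image processing
  }
  return 'done'
}

// Resolves with how many milliseconds late a 0ms timer actually fired.
function measureTimerDelay(work) {
  return new Promise((resolve) => {
    const scheduledAt = Date.now()
    setTimeout(() => resolve(Date.now() - scheduledAt), 0)
    // notice, NO await or .then(): we "don't wait" for the work
    work().catch(console.error)
  })
}

// Example usage:
// measureTimerDelay(() => busyWork(200)).then((delay) => {
//   console.log(`timer fired ${delay}ms late`)
// })
```

Even though nothing waits on busyWork(), the timer (and any incoming request) can only run once the loop has finished, so the delay is at least as long as the work itself.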

Imagine you have a computationally intensive task like image processing that bogs down your server when it runs in the Node event loop.
This is a prime candidate for something that should be pushed to a task queue. You're offloading that computationally expensive work somewhere else; again, you can't avoid doing it. But that work is no longer in the main service bogging it down, so you can return the response to the user more quickly. And you can now scale the consumers (the "services" executing the code) up or down to essentially load balance the work.

Error handling when not waiting for Promise resolution

This is probably a good time to discuss another important consideration when not waiting for Promise resolution.
If the Promise rejects, you still need to catch it. If you don't, you'll get an unhandled promise rejection error.

The most "local" way to do that is to use .catch(), like so:

async function test() {
  // artificial rejection just to demonstrate
  return Promise.reject('this is a rejection')
}

// notice, NO .then() or await
test().catch((err) => {
  // handle Promise rejection here
  console.error(err)
})

Note that you can't use try/catch here like so:

try {
  test()
} catch (err) {
  console.error(err)
}

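Without await, the try/catch never sees the rejection: test() hands back its Promise and the try block finishes before the Promise rejects, so you end up with an unhandled rejection. The only way to make try/catch work here is to await the call, which defeats the purpose of not waiting.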

You could also use a "top-level" as opposed to "local" error handler, something like:

process.on('unhandledRejection', (reason, promise) => {
  console.log('Unhandled Rejection at:', promise, 'reason:', reason)
  // Application specific logging, throwing an error, or other logic here
})

But regardless, it needs to be handled, especially if you're using a newer version of Node. Since Node 15, a rejection that nothing handles no longer just logs a warning by default; it crashes the process. And if you go the "top-level" route you may lose out on supplementing the error with other variables or information that are within the function's scope.
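
As a rough sketch of what "supplementing the error" with local information could look like, here is a small helper that attaches a .catch() to un-awaited work and logs whatever context the caller passes in. The names fireAndForget, sendWelcomeEmail and handleSignup are hypothetical and only here for illustration:

```javascript
// Hypothetical helper: start async work without waiting for it, but always
// attach a .catch() so a rejection is logged together with local context.
// The chained Promise is returned only so tests can wait on it.
function fireAndForget(promise, context, logger = console) {
  return promise.catch((err) => {
    logger.error('background task failed', { ...context, error: String(err) })
  })
}

// Stand-in for a slow or flaky email provider call (always fails here).
async function sendWelcomeEmail(user) {
  throw new Error(`email provider timed out for ${user.email}`)
}

function handleSignup(user, logger = console) {
  // notice, NO await: the response does not wait for the email
  fireAndForget(sendWelcomeEmail(user), { userId: user.id, step: 'welcome-email' }, logger)
  return { status: 'created' }
}
```

Because the handler lives next to the call, the log entry can carry things like the user id, which a process-wide 'unhandledRejection' handler would not have access to.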

Retrying failed Promises

Another thing to consider if you are thinking about not waiting for Promise resolution is that if it does fail/reject, you need to add code to handle retrying the Promise (if you in fact want to retry it). Something like:

const retry = (fn, ms) => new Promise(resolve => { 
  fn()
    .then(resolve)
    .catch(() => {
      setTimeout(() => {
        console.log('retrying...')
        retry(fn, ms).then(resolve)
      }, ms)
    })
})

retry(someFnThatReturnsPromise, 2000)

Of course if you don't care about the function/Promise rejecting, and can live with that, then you don't have to do this. But usually you're probably going to want that code to execute successfully.

The code above gets us Promise function retries, but what if the someFnThatReturnsPromise above keeps failing? Maybe there is a logic error or TypeError somewhere within the function definition. No number of retries is going to get it to successfully complete.

We can implement a maxNumberRetries in the retry() function, and that will stop the retries after X number of times. But we're still back to the issue that the code isn't completing successfully.
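
A capped version could look something like the sketch below. The name retryWithLimit and its exact behavior (wait ms between attempts, then reject with the last error) are my own assumptions about how you might want this to work, not the only way to do it:

```javascript
// Sketch: call a Promise-returning function, retrying up to maxNumberRetries
// times with `ms` milliseconds between attempts. If every attempt fails,
// reject with the last error instead of retrying forever.
async function retryWithLimit(fn, ms, maxNumberRetries) {
  let lastError
  for (let attempt = 0; attempt <= maxNumberRetries; attempt++) {
    try {
      return await fn()
    } catch (err) {
      lastError = err
      if (attempt < maxNumberRetries) {
        console.log(`attempt ${attempt + 1} failed, retrying in ${ms}ms...`)
        await new Promise((resolve) => setTimeout(resolve, ms))
      }
    }
  }
  throw lastError
}

// Example usage (someFnThatReturnsPromise is the placeholder from above):
// retryWithLimit(someFnThatReturnsPromise, 2000, 3).catch((err) => {
//   console.error('giving up after 3 retries', err)
// })
```

Note that the caller still needs a .catch() for the case where all attempts fail, and at that point the failed task is gone unless you store it somewhere yourself.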
And those retries that happen are still in the event loop, using server processing power (back to the first point about code processing). What if you absolutely need those functions to complete and it's mission critical to your app?

Retrying those "permanent" failures becomes more difficult.

Also, in order to monitor these failures, we have to instrument the code to log out retries, number of attempts, etc. Again, that's doable, but it means more code to implement.
And unless you have something custom set up, like a custom counter using statsd, Splunk, etc. to instrument and monitor the failures in some dashboard, you're probably going to just be logging the failures. And that means combing through logs to find the failures, or maybe setting up a CloudWatch query to watch for them.

Maybe a queue would make some of this simpler though? With less custom work you have to do on your end?

Depending on which queue solution you use, you usually get the following out of the box:

  • configurable retries
  • Dead letter queue ("DLQ")
  • queue monitoring/observability

Instead of adding custom retry code you usually get configurable "automatic" retries out of the box with a task queue solution.
In a scenario in which you get continual failures, that task can be automatically moved to a DLQ, where it will sit until you act on it. This helps you avoid an infinite retry loop.

Imagine you have some asynchronous code where a user signs up to your app, and your code sends a welcome email out, creates credentials for them, and kicks off some marketing sequence. Maybe not super processing-intensive, but something you decide you don't want to wait for (maybe your email provider is a bit slow, for example).
What if you pushed some bad processing code (e.g. your email-send code had a bug in it)? With a queue solution, you could make a fix, and then retry all of the failed tasks with the fixed code using the items from the DLQ.

And you'll also get observability into not just the DLQ (you want to know when code just won't successfully execute) but your other tasks generally too: things like how many are currently in the queue, how many are processing, how many have completed, etc.
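
To make those moving parts concrete, here is a toy in-memory sketch of configurable retries, a DLQ, replaying the DLQ after a fix, and basic counts. This is only an illustration of the concepts: ToyQueue and its methods are made-up names, it is not how any particular queue product is implemented, and a real solution would also persist tasks and run consumers in separate processes.

```javascript
// Toy in-memory queue, for illustration only.
class ToyQueue {
  constructor({ maxAttempts = 3 } = {}) {
    this.maxAttempts = maxAttempts
    this.waiting = []
    this.completed = []
    this.deadLetter = []
  }

  add(payload) {
    this.waiting.push({ payload, attempts: 0 })
  }

  // The "consumer": drain the queue, retrying each failed task until it
  // has been attempted maxAttempts times, then park it in the DLQ.
  async process(handler) {
    while (this.waiting.length > 0) {
      const task = this.waiting.shift()
      task.attempts += 1
      try {
        await handler(task.payload)
        this.completed.push(task)
      } catch (err) {
        task.lastError = String(err)
        if (task.attempts >= this.maxAttempts) {
          this.deadLetter.push(task)
        } else {
          this.waiting.push(task)
        }
      }
    }
  }

  // Move everything in the DLQ back to waiting, e.g. after deploying a fix.
  redrive() {
    for (const task of this.deadLetter.splice(0)) {
      this.waiting.push({ payload: task.payload, attempts: 0 })
    }
  }

  // Basic observability: how many tasks are in each state.
  stats() {
    return {
      waiting: this.waiting.length,
      completed: this.completed.length,
      deadLetter: this.deadLetter.length,
    }
  }
}
```

With a buggy handler, every task ends up in deadLetter after maxAttempts tries instead of looping forever; after the fix, redrive() followed by process() with the corrected handler completes them.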

The main point here is that you get these things out of the box (again most solutions should have these features but always make sure to check).

Infrastructure setup required for queue if not already set up

If you don't have the infrastructure already set up for a task queue, that is "overhead" work you or someone on your team will have to take care of. And obviously with more infrastructure comes more cost, so that's something to factor in when you're looking at pricing/billing.

If you're building out an MVP, or can live with some code execution failures and less observability into the execution of that code, maybe the infrastructure setup is not worth it for you.
If you go with just not waiting for Promise resolution, the good thing is that solution is just application code. No queue setup, worker setup, etc.

A note on Lambdas

It's worth pointing out that if you're using AWS Lambdas and you don't await or .then() the Promise, you run the risk of that un-awaited code finishing during a later Lambda invocation. I'm not an expert on Lambdas but I've personally seen this happen: a single Lambda handled two different requests, and the part of the first request that wasn't awaited finished during the second request's run.
So the above discussion on Promises needs to be weighed against Lambda nuances.
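
One hedged way to avoid that in a Lambda is to make sure all the work has settled before the handler returns, since the execution environment can be frozen once the handler finishes. The sketch below is only an illustration: sendWelcomeEmail and the event shape are hypothetical, and in a real Lambda you would export the function as your handler.

```javascript
// Records deliveries so the effect of the stand-in is visible.
const delivered = []

// Hypothetical stand-in for a slow email provider call.
async function sendWelcomeEmail(user) {
  await new Promise((resolve) => setTimeout(resolve, 10))
  delivered.push(user.email)
}

// Sketch of a Lambda-style async handler (the event shape is an assumption).
async function handler(event) {
  // Wait for the side work to settle before returning, so nothing is left
  // pending when the invocation ends. allSettled never rejects, so a failed
  // email is logged here instead of failing the whole request.
  const [emailResult] = await Promise.allSettled([sendWelcomeEmail(event.user)])
  if (emailResult.status === 'rejected') {
    console.error('welcome email failed', {
      userId: event.user.id,
      reason: String(emailResult.reason),
    })
  }
  return { statusCode: 201 }
}
```

The trade-off is that the response now waits for the email, which is exactly the latency you were trying to avoid, so in a Lambda setup a queue is often the cleaner way to get the "don't wait" behavior.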

Summary

I've gone through every consideration I can think of when determining if you should use a task queue or just skip Promise resolution and continue code execution.
To end, here's a pseudo decision matrix for when you'd likely use which:

  • If processing (like image processing) is going to take several seconds or minutes, you should probably use a queue. It's likely too processing intensive for the server and you might end up with ancillary performance issues even though you're skipping resolution and continuing to the next bit of code.
  • If the task is not mission-critical and not processing intensive, and you can deal with some failures here and there, not waiting for Promise resolution is probably fine
    • The same goes for if you can live with continual failures (in the case of a programming bug related to the task)
  • If the task is mission-critical, even if it's not processing intensive, you should probably use a queue so you get observability, retries, and a DLQ (which again is really useful in case you had a programming bug)
  • If infrastructure setup is too much work for you, even given the above considerations, just don't wait for Promise resolution and don't use a queue
    • This might seem obvious, but if you either can't set up the queue infrastructure or it's too much work, you're not going to have a queue anyway, so you can't use that solution.
    • If given your non-functional requirements and technical considerations you determine a task queue is right for your application though, I'd recommend biting the bullet and setting up the infrastructure.

The ability to work with asynchronous code in Node and JavaScript is great and obviously a core part of the language, but it can cause some confusion too. Hopefully this discussion gives you a more nuanced understanding of the differences between the two approaches and helps you decide when to use which.

Love JavaScript but still getting tripped up by local dev, architecture, testing, etc? I publish articles on JavaScript and Node every 1-2 weeks, so if you want to receive all new articles directly to your inbox, here's that link again to subscribe to my newsletter!
