An Introduction to Multithreading in Node.js
Kayode
Posted on August 3, 2022
Computers are becoming more powerful, thanks to GPUs and multi-core CPUs. Equally, applications are getting more complex as they leverage threads (independent execution units in a process) for maximum application performance and responsiveness.
In this article, we will explain what multithreading is, and how Node.js handles asynchronous operations using the event loop and worker pools. We'll also discuss how to use the Node.js worker_threads module to create and manage threads.
Let's get started!
The History of Node.js Async Event-Driven Runtime
JavaScript is, at its base, a synchronous, blocking, single-threaded language.
It was initially created to run on web browsers, allowing for web page interactions, form validations, and animations.
But some operations on a browser may take a longer time to run. Running operations on a single thread can block the synchronous execution flow and result in unresponsive UI interactions.
JavaScript therefore supports asynchronous operations: long-running work can proceed in the background while the main thread keeps executing other code, without the developer having to create or synchronize threads.
The creator of Node.js, Ryan Dahl, made Node.js to avoid using threads, as outlined in the Node.js documentation:
Thread-based networking is relatively inefficient and very difficult to use. Furthermore, users of Node.js are free from worries of dead-locking the process, since there are no locks.
Almost no function in Node.js directly performs I/O, so the process never blocks except when the I/O is performed using synchronous methods of the Node.js standard library. Because nothing blocks, scalable systems are very reasonable to develop in Node.js.
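To make the difference between blocking and non-blocking calls concrete, here is a minimal sketch that lists the current directory twice, once with a synchronous method and once with its asynchronous counterpart. The `order` array and the `asyncListing` name are only illustrative.

```javascript
const fs = require("fs");

// Records the order in which each step finishes
const order = [];

// Synchronous call: the thread waits here until the directory has been read
fs.readdirSync(process.cwd());
order.push("sync call done");

// Asynchronous call: Node.js starts the read and moves on immediately
const asyncListing = new Promise((resolve, reject) => {
  fs.readdir(process.cwd(), (err) => {
    if (err) return reject(err);
    order.push("async callback ran");
    resolve(order);
  });
});

// Runs before the async callback, because nothing blocked the thread
order.push("code after async call");

asyncListing.then((finalOrder) => console.log(finalOrder));
// prints [ 'sync call done', 'code after async call', 'async callback ran' ]
```

The asynchronous callback always runs after the surrounding synchronous code has finished, which is why the last two entries appear in that order.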
So, What Is Multithreading in Node.js?
Multithreading is a program execution model that allows multiple threads to be created within a process. The threads execute independently but concurrently share process resources.
Original image source: Wikimedia Commons
To understand multithreading, we need to know how a single-thread process looks.
Imagine we have a set of four instructions. If we run the set of instructions in a simple single-threaded process, the execution looks like this:
Each operation has to wait for the preceding operation to execute, even if they block the execution flow.
But in a multithreaded process, instructions can run concurrently in different threads:
Is Node.js Single-Threaded?
Node.js is single-threaded, except when it is not. In the end, if you use Node.js, you will probably use more than a single thread.
Let's say you want to read data from a database or do some file operations. On a single thread, these operations could prevent other code from running. So when Node.js encounters them, it delegates the work to libuv, a C library that handles it outside the main thread, either through the operating system's asynchronous interfaces or through a separate pool of threads that it manages.
Node.js is single-threaded at its base, but we can run some operations in parallel. We do not create threads that share the same 'context', though.
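One way to see the libuv thread pool at work is to start several expensive operations at once. The sketch below assumes `crypto.pbkdf2` as the workload, since its asynchronous form runs on the thread pool. The helper name `hashInParallel` and the iteration count are arbitrary choices for illustration.

```javascript
const crypto = require("crypto");
const { promisify } = require("util");

const pbkdf2 = promisify(crypto.pbkdf2);

// Hypothetical helper: derive `count` keys at the same time
async function hashInParallel(count) {
  const started = Date.now();
  const jobs = [];
  for (let i = 0; i < count; i++) {
    // Each call is handed to a libuv pool thread; the main thread stays free
    jobs.push(pbkdf2("secret", `salt-${i}`, 50000, 64, "sha512"));
  }
  const keys = await Promise.all(jobs);
  return { keys, elapsedMs: Date.now() - started };
}

hashInParallel(4).then(({ keys, elapsedMs }) => {
  console.log(`derived ${keys.length} keys in ${elapsedMs} ms`);
});
```

On a multi-core machine, deriving four keys this way typically takes much less than four times as long as deriving one, because the default pool has four threads. The exact timings depend on the hardware, so none are shown here.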
Running Parallel Child Processes in Node.js
We spin up a child process using Node's child_process module. The spun-up child processes or subprocesses can communicate through a messaging system. They run separately, allowing you to divide and run your application script from different processes.
The child_process module provides four different ways to create a child: spawn(), exec(), execFile(), and fork().
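As a rough guide, spawn() streams the child's output, exec() runs a command through a shell and buffers the output, execFile() runs a file directly without a shell, and fork() starts a new Node.js process with a message channel. Here is a minimal sketch of execFile(), using a hypothetical `runInChild` helper that executes a one-line script in a separate Node.js process.

```javascript
const { execFile } = require("child_process");

// Hypothetical helper: run a one-line script in a separate Node.js process
function runInChild(script) {
  return new Promise((resolve, reject) => {
    // process.execPath is the path of the Node.js binary running this code
    execFile(process.execPath, ["-e", script], (error, stdout) => {
      if (error) return reject(error);
      resolve(stdout.trim());
    });
  });
}

runInChild("console.log(6 * 7)").then((output) => {
  console.log("child printed:", output);
});
// prints: child printed: 42
```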
Let's do a quick demonstration using the fork() method.
The fork() method allows you to create a child process that's connected to the main process currently running your code. It accepts the following three parameters:
- A module path string for a JavaScript file to execute on the child process (required)
- An array of strings to pass as the child process's arguments
- The options object to pass to the child process
fork("sub.js", ["arguments"], { cwd: process.cwd() });
Let's create the main.js file, import the child_process module, and create a child process from a fork.
// main.js
const child_proc = require("child_process");

console.log("running main.js");
const sub = child_proc.fork("./sub.js");

// sending message to subprocess
sub.send({ from: "parent" });

// listening to message from subprocess
sub.on("message", (message) => {
  console.log("PARENT got message from " + message.from);
  sub.disconnect();
});
Then we'll create a subprocess file, sub.js, in the same directory as main.js:
// sub.js
console.log("sub.js is running");

setTimeout(() => {
  // subprocess sending message to parent
  process.send({ from: "client" });
}, 2000);

// subprocess listening to message from parent
process.on("message", (message) => {
  console.log("SUBPROCESS got message from " + message.from);
});
Run main.js, which will print this in your terminal:
running main.js
sub.js is running
SUBPROCESS got message from parent
PARENT got message from client
What we've done here is called multiprocessing. It’s different from multithreading because we are creating more processes.
In multithreading, a single process can have multiple code segments (threads) that run concurrently within the process.
In multiprocessing, creating a process is slow and resource-intensive, because each process gets its own memory and its own runtime instance. In multithreading, however, it's economical to create a thread.
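The distinction shows up in the identifiers the operating system and Node.js assign. The following sketch compares the process ID and thread ID seen by the main thread, a worker thread, and a child process. The helper names are hypothetical, and the worker code is passed inline with `eval: true` only to keep the example in one file.

```javascript
const { Worker, threadId } = require("worker_threads");
const { execFile } = require("child_process");

// Ask a worker thread for its process ID and thread ID
function inspectWorker() {
  return new Promise((resolve, reject) => {
    const source = `
      const { parentPort, threadId } = require("worker_threads");
      parentPort.postMessage({ pid: process.pid, threadId });
    `;
    const worker = new Worker(source, { eval: true });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}

// Ask a separate Node.js process for its process ID
function inspectChild() {
  return new Promise((resolve, reject) => {
    execFile(process.execPath, ["-p", "process.pid"], (error, stdout) => {
      if (error) return reject(error);
      resolve({ pid: Number(stdout) });
    });
  });
}

async function compare() {
  const worker = await inspectWorker();
  const child = await inspectChild();
  return { main: { pid: process.pid, threadId }, worker, child };
}

compare().then((ids) => console.log(ids));
```

The worker reports the same `pid` as the main thread but a different `threadId`, while the child process reports a `pid` of its own.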
What Are Worker Threads?
Worker threads can run CPU-intensive JavaScript operations without blocking the event loop from running. Unlike child_process, worker_threads can share memory by transferring ArrayBuffer instances or sharing SharedArrayBuffer instances.
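As a small illustration of shared memory, the sketch below lets several workers increment one counter stored in a SharedArrayBuffer. The `incrementInWorkers` helper is hypothetical, and the worker code is kept inline with `eval: true` so that the example fits in a single file.

```javascript
const { Worker } = require("worker_threads");

// Worker source kept inline (eval: true) so the sketch fits in one file
const workerSource = `
  const { parentPort, workerData } = require("worker_threads");
  // View over the same memory the main thread allocated
  const counter = new Int32Array(workerData.shared);
  for (let i = 0; i < 1000; i++) {
    Atomics.add(counter, 0, 1); // atomic increment, safe across threads
  }
  parentPort.postMessage("done");
`;

// Hypothetical helper: start workers that all increment one shared counter
function incrementInWorkers(workerCount) {
  const shared = new SharedArrayBuffer(4); // room for one 32-bit integer
  const counter = new Int32Array(shared);
  const jobs = [];
  for (let i = 0; i < workerCount; i++) {
    jobs.push(
      new Promise((resolve, reject) => {
        const worker = new Worker(workerSource, {
          eval: true,
          workerData: { shared }, // shared, not copied
        });
        worker.once("message", resolve);
        worker.once("error", reject);
      })
    );
  }
  return Promise.all(jobs).then(() => Atomics.load(counter, 0));
}

incrementInWorkers(2).then((total) => console.log("total:", total));
// prints: total: 2000
```

Atomics is used here because a plain `counter[0]++` from two threads could lose updates.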
How to Use Worker Threads in Node.js
worker_threads was introduced in Node.js 10.5.0 as an experimental module. In those early versions, you can only access the module by running the Node.js program with the --experimental-worker flag, which must come before the script name.
$ node --experimental-worker app.js
Note: Make sure you keep in mind this advice about worker threads from the Node.js documentation:
Workers (threads) are useful for performing CPU-intensive JavaScript operations. They do not help much with I/O-intensive work. The Node.js built-in asynchronous I/O operations are more efficient than Workers can be.
Let’s create a simple example where we have a main file, make a worker thread from another file, and give the thread some data.
First, we'll create the main file, main.js.
const { Worker } = require("worker_threads");

function doSomethingCPUIntensive(name) {
  return new Promise((resolve, reject) => {
    // start sub.js on a new thread and hand it a copy of { name }
    const worker = new Worker("./sub.js", { workerData: { name } });
    worker.on("message", resolve);
    worker.on("error", reject);
    worker.on("exit", (code) => {
      if (code !== 0) {
        reject(new Error(`stopped with exit code ${code}`));
      }
    });
  });
}

(async () => {
  try {
    const result = await doSomethingCPUIntensive("John");
    console.log("Parent:", result);
  } catch (err) {
    console.log(err);
  }
})();
We create a worker by passing in the path to a file as the first argument and an options object as the second. The workerData option carries the data for the worker (the data is cloned, so the worker receives its own copy rather than a reference to the original object).
Then we can listen to a series of events from the worker and act accordingly. For instance, if the worker thread is stopped, we can read its exit code.
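To see these events in isolation, here is a small sketch that records what a worker emits before it stops. The `observe` helper is hypothetical, and the worker code is passed inline with `eval: true` rather than loaded from a file.

```javascript
const { Worker } = require("worker_threads");

// Hypothetical helper: run inline worker source and record its events
function observe(source) {
  return new Promise((resolve) => {
    const events = [];
    const worker = new Worker(source, { eval: true });
    worker.on("message", (value) => events.push(`message:${value}`));
    worker.on("error", (err) => events.push(`error:${err.message}`));
    worker.on("exit", (code) => {
      // "exit" is the last event a worker emits
      events.push(`exit:${code}`);
      resolve(events);
    });
  });
}

// A worker that posts one message and then finishes normally
observe('require("worker_threads").parentPort.postMessage("ok")').then(console.log);

// A worker that throws an uncaught exception
observe('throw new Error("boom")').then(console.log);
```

The first worker should end with exit code 0 after its message arrives. The second should emit an "error" event and then exit with a non-zero code, which is the case the `code !== 0` check in main.js guards against.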
Next, we create a worker thread module script which, in our case, will be called sub.js:
// sub.js
const { workerData, parentPort } = require("worker_threads");

// you can do intensive synchronous stuff here
function theCPUIntensiveTask(name) {
  return `Hello World ${name}`;
}

const intensiveResult = theCPUIntensiveTask(workerData.name);
parentPort.postMessage({ intensiveResult });
workerData holds the data that's passed when the worker is created, and parentPort provides the postMessage() method to return the result of theCPUIntensiveTask.
The worker thread is a great tool to run CPU-intensive operations, and can get much more complex than in the simple example above.
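For a slightly heavier workload, the sketch below counts prime numbers in a worker while a timer keeps firing on the main thread. The function names and the limit are illustrative assumptions, and the worker code is again inline (`eval: true`) instead of living in a separate file like sub.js.

```javascript
const { Worker } = require("worker_threads");

const workerSource = `
  const { parentPort, workerData } = require("worker_threads");
  // Deliberately naive trial division so the task is CPU-bound
  function countPrimes(limit) {
    let count = 0;
    for (let n = 2; n < limit; n++) {
      let isPrime = true;
      for (let d = 2; d * d <= n; d++) {
        if (n % d === 0) { isPrime = false; break; }
      }
      if (isPrime) count++;
    }
    return count;
  }
  parentPort.postMessage(countPrimes(workerData.limit));
`;

// Hypothetical helper: count the primes below `limit` on a worker thread
function countPrimesInWorker(limit) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { limit } });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}

async function main() {
  let ticks = 0;
  // This timer can only fire if the event loop is not blocked
  const timer = setInterval(() => { ticks++; }, 10);
  const primes = await countPrimesInWorker(1000000);
  clearInterval(timer);
  console.log(`primes: ${primes}, timer ticks while waiting: ${ticks}`);
}

main();
```

If the same loop ran on the main thread, the timer could not fire until the loop had finished.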
If you are running a Node.js version older than Node.js 11.7, use the --experimental-worker flag.
$ node --experimental-worker main.js
Running the script prints this result:
Parent: { intensiveResult: 'Hello World John' }
Check out the Node.js documentation for more on worker threads.
Wrap Up
In this article, we explored the history of Node.js asynchronous event runtime before explaining the basics of multithreading. We then looked at running parallel child processes and how to use worker threads in Node.js.
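Even though Node doesn't traditionally support multithreading, worker threads provide a nice workaround. Because workers don't share JavaScript objects by default, they avoid many of the race conditions common in threaded code, although memory shared through a SharedArrayBuffer still needs careful synchronization.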
We hope this post has given you a good grounding in Node.js worker threads.
Happy coding!