Node.js fork is not what you think!

pi0

Pooya Parsa

Posted on January 24, 2019

Node.js fork is not what you think!

Today I realized that in Node.js, neither cluster.fork or child_process.fork act like something you expect in a C environment. Actually, it is shortly mentioned in docs:

Unlike the fork(2) POSIX system call, child_process.fork() does not clone the current process.

The child_process.fork() method is a special case of child_process.spawn() used specifically to spawn new Node.js processes. Like child_process.spawn(), a ChildProcess object is returned. The returned ChildProcess will have an additional communication channel built-in that allows messages to be passed back and forth between the parent and child.

So what it means?

Taking a simple C code that forks 5 processes:

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
   printf("Shared Code\n");

   for (int i = 0; i < 5; i++)
   {
      int pid = fork();
      if (!pid)
      {
       return printf("Worker %d\n", i+1);
      }
   }

   return 0;
}

Compiling and running this code gives us a result like this:

Shared
Worker 1
Worker 2
Worker 3
Worker 4
Worker 5

What operating system does under the hoods, is when we call fork(), it copies entire process state to a new one with a new PID. Return value in the worker process is always 0 so we have a way to find-out if rest of the code is running in forked process or master. (Thanks to @littlefox comment🧡)

The important point is that forked process continues from where fork() was called. Not from the beginning so Shared is printed once.

Running a similar code in Node.js:


const { fork, isMaster } = require('cluster')

console.log('Shared')

if (isMaster) {
  // Fork workers.
  for (let i = 0; i < 5; i++) {
    fork();
  }
} else {
  // Worker
  console.log(`Worker`);
}

The output is amazingly different:

Shared
Shared
Worker
Shared
Worker
Shared
Worker
Shared
Worker
Shared
Worker

The point is that each time a worker forked, it started with a fresh V8 instance. This is not a behavior that it's name tells. Fork in Node.js is actually doing exec/spawn which causes shared code running each time.

OK. So let's move console.log('Shared') into if (isMaster) :P

Well. Yes. You are right. That's the solution. But just for this example case!

In a real-world application that needs a cluster, we don't immediately fork workers. We may want to set up our web framework, parse CLI args and require a couple of libs and files. All of this steps has to be repeated on each worker that may introduce lots of unnecessary overhead.

Final Solution

Now that we know what exactly cluster.fork does under the hood, we can split our worker logic into a seperate worker.js file and change default value of exec which is process.argv[1] to worker.js :) This is possible by calling cluster.setupMaster() on master process.

💖 💪 🙅 🚩
pi0
Pooya Parsa

Posted on January 24, 2019

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related

Learn Cron Jobs in 10 minutes
webdev Learn Cron Jobs in 10 minutes

September 27, 2024

Preparar servidor para deploy NodeJs
javascript Preparar servidor para deploy NodeJs

September 6, 2020

Node.js fork is not what you think!
javascript Node.js fork is not what you think!

January 24, 2019