There is beauty in simplicity

Last week I finally worked on a test runner for Nodejs based on zora.
I had already written an article inspired by some of zora's properties, and I keep finding it interesting how such a small project (in code size) can inspire new subjects of discussion (I still have a few in mind). This one will lead us through some fundamental concepts of Nodejs architecture and general computer programming, such as the event loop, concurrency and parallelism, and how they relate to the performance of testing software.

A surprising benchmark

It all started when I added pta to the benchmark in zora's repository. This benchmark tries to compare the speed of execution of various testing frameworks. Performance is clearly at the center of the developer's experience and productivity when it comes to testing software. Some of the popular frameworks have relatively complex architectures involving abstractions such as child processes to deliver (not only) top level performance. Zora, on the other hand, is quite simple, yet it performs much faster according to the aforementioned benchmark.

How can it be?

The benchmark consists of running N test files, each having M tests. One test would be the following code, adapted to the syntax of each test runner (if I did not make any mistake):

const wait = waitTime => new Promise(resolve => {
  setTimeout(()=>resolve(),waitTime); 
});

test('some test ', async function (assert) {
    await wait(WAIT_TIME); // wait time is a variable of the benchmark
    assert.ok(Math.random() * 100 > ERROR_RATE); // a given percentage of the tests should fail (eg ~3%) 
});

By changing N, M and WAIT_TIME we can mimic what I consider to be the profile of some typical Nodejs applications.

profile small library: N = 5, M = 8, WAIT_TIME = 25ms
profile web app: N = 10, M = 8, WAIT_TIME = 40ms
profile api: N = 12, M = 10, WAIT_TIME = 100ms

Each framework runs with its default settings.

Here are the results on my developer machine (MacBook Pro, 2.7GHz i5) with node 12:

	zora-3.1.0	pta-0.1.0	tape-4.11.2	Jest-24.9.0	AvA-2.4.0	Mocha-6.2.1
Library	~100ms	~230ms	~1240ms	~2835ms	~1888ms	~1349ms
Web app	~130ms	~280ms	~3523ms	~4084ms	~2900ms	~3696ms
API	~190ms	~330ms	~12586ms	~7380ms	~3900ms	~12766ms

We can even increase the differences if we use somewhat extreme(?) values (N=100, M=10, WAIT_TIME=100ms):

zora	pta	tape	Jest	AvA	Mocha
~450ms	~750ms (1.6x slower)	~104sec (230x slower)	~43.1sec (96x slower)	~24.1sec (53x slower)	~104.5sec (230x slower)

As we will see, the results can actually be predictable, at least for some of the test runners.

The Event Loop and Nodejs's architecture

Nodejs' Javascript engine (like many others) is single threaded and is built around an event loop. There are already many resources online to grasp these two concepts (you can for example refer to the official Nodejs documentation) but to make it short it means:

The main process of a Nodejs program runs within a single thread.
Processing tasks are scheduled with a queue of events. These tasks can be anything like executing a statement, calling the next item of an iterator, resuming a suspended asynchronous function, etc.

The event system is particularly helpful for asynchronous operations as you do not have to block the main thread waiting for a task to complete. Instead, you launch the asynchronous task and later, when it is over, the scheduler is notified to enqueue another task: the execution of the callback.

Historically, asynchronous tasks were handled exclusively through event listeners called, due to their nature, "call me back" or "callback". In modern Nodejs there are newer built-in abstractions you can use, such as async functions and promises or (async) iterators, (async) generator functions, etc. But in essence, the idea is the same: prevent the main thread from being blocked while waiting.

Consider the following snippet:

(function fn(){
    console.time('fn timer');
    console.time('timer1');
    console.time('timer2');
    setTimeout(() => console.timeEnd('timer1') /* (B) */, 1000); // this blocks neither the main thread nor the function execution
    setTimeout(() => console.timeEnd('timer2') /* (C) */, 1000); // this blocks neither the main thread nor the function execution
    console.timeEnd('fn timer'); // (A) this will be called before the timer callbacks are executed
})();

The callbacks will execute after the function fn runs to completion. The whole program will run in a bit more than 1000ms as setTimeout is not blocking: it just schedules on the event loop the execution of the callback function after some elapsed time.

The whole Nodejs architecture is based around these concepts. Let's take the example of a web API.

In a multi-threading environment, a request would typically be handled by a thread from its parsing to the sending of the response.
It means that once the request has been parsed and the database is processing the query, the thread is paused waiting for the database to complete its work, possibly wasting processing resources. Later it is resumed to send the response built from the database result.
It implies you can roughly have as many concurrent requests as threads the server can manage at the same time.

In Nodejs, as long as you don't block the event loop, the server is able to handle more requests even within its single thread. This is usually done by using one of the asynchronous patterns to deal with the costly tasks which need access to the disk, the network or any kernel operation. Most of the time, the so-called "I/O" operation is itself delegated to a process which leverages multi-threading capabilities, a database server for instance.

As in our previous example with setTimeout, the request handler does not have to block the event loop waiting for the database to complete its job; it just needs to pass a callback to execute once the database is done. It means the server can possibly handle a lot of concurrent requests with a single thread, being mostly limited by the database. In a sense, this architecture allows the system to avoid being idle and wasting resources.
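To make the request handling described above more concrete, here is a minimal sketch. The database is simulated with a timer, and the names queryDatabase, handleRequest and serve are made up for the illustration; they do not belong to any real library. The log shows that the second request is received while the first one is still waiting for its query.

```javascript
// Hypothetical stand-in for a database client: the "query" resolves after 20ms
const queryDatabase = (id) => new Promise((resolve) => {
    setTimeout(() => resolve({ id }), 20);
});

// Hypothetical request handler: it is suspended while the database works,
// which leaves the single thread free to accept other requests
const handleRequest = async (id, log) => {
    log.push(`request ${id} received`);
    const row = await queryDatabase(id);
    log.push(`request ${id} answered`);
    return row;
};

// Simulate two requests arriving one right after the other
const serve = async () => {
    const log = [];
    await Promise.all([handleRequest(1, log), handleRequest(2, log)]);
    return log;
};

serve().then((log) => console.log(log.join('\n')));
```

Both requests are received before either of them is answered, although everything runs in a single thread.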

Concurrency

Concurrency is the ability of a program to start, execute and terminate tasks in overlapping time periods. It does not mean the tasks have to run at the same time. It can refer to the ability to interrupt a task and allocate system resources to another task (context switching). Nodejs is a perfect example as you can reach very high concurrency with a single thread.

Now that we are familiar with the callback pattern, let's use async functions and promises instead.

const wait = (time = 1000) => new Promise(resolve => setTimeout(() => resolve(), time));

async function task(label){
    await wait();
    console.log(`task ${label} is done`);
}

The task function may appear to block the main thread but this is not the case. The await statement does suspend the function's execution for a while, but it does not prevent the main thread from running another task.

const run = async () => {
    console.time('exec');
    const p1 = task(`task 1`);
    const p2 = task(`task 2`);
    await p1;
    await p2;
    console.timeEnd('exec');
};

// or if it makes more sense

const run = async () => {
    console.time('exec');
    const tasks = [task(`task 1`), task(`task 2`)];
    await Promise.all(tasks);
    console.timeEnd('exec');
};

run();

The last program will run in something close to 1000ms whereas a single task function itself takes 1000ms to run. We were able to execute the two tasks concurrently.
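For contrast, here is a small sketch (with shorter waits, and helper names such as elapsed and compare that are only there for the illustration) comparing tasks awaited one after the other with tasks started together. The only difference is when the second task is started.

```javascript
const wait = (time) => new Promise((resolve) => setTimeout(resolve, time));

// Measure how long an async function takes, in milliseconds
const elapsed = async (fn) => {
    const start = Date.now();
    await fn();
    return Date.now() - start;
};

// The second wait only starts once the first one is over
const serial = async () => {
    await wait(100);
    await wait(100);
};

// Both waits are started before we await either of them
const concurrent = async () => {
    const p1 = wait(100);
    const p2 = wait(100);
    await p1;
    await p2;
};

const compare = async () => ({
    serialTime: await elapsed(serial),
    concurrentTime: await elapsed(concurrent)
});

compare().then((result) => console.log(result));
```

The serial version should take about twice as long as the concurrent one, since nothing overlaps.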

Parallelism

Now let's consider the following function:

// async function is not mandatory here, but it emphasizes the point.
async function longComputation() {
    console.log(`starts long computation`);
    let sum = 0;
    for (let i = 0; i < 1e9; i++) {
        sum += i;
    }
    console.log(`ends long computation`);
    return sum;
}

This function takes close to 1s to return its result on my machine. But contrary to the task function, longComputation, whose code is all synchronous, blocks the main thread and the event loop by monopolising the CPU resources given to the thread. If you run the following program:

const run = async () => {
    console.time('exec');
    const p1 = longComputation();
    const p2 = longComputation();
    await p1;
    await p2;
    console.timeEnd('exec');
};

run();

It will take close to 2s (~1s + ~1s) to complete and the second task won't start before the first one is finished. We were not able to run the two tasks concurrently.
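The blocking effect on the event loop itself can be observed with a timer. The sketch below uses two hypothetical helpers, blockFor and measureTimerDelay: a callback is scheduled to run after 10ms, then the thread is kept busy with synchronous code. The callback cannot run before the synchronous code has finished.

```javascript
// Keep the thread busy with synchronous code for the given duration (ms)
const blockFor = (duration) => {
    const end = Date.now() + duration;
    while (Date.now() < end) {
        // busy wait: nothing else can run in the meantime
    }
};

// Resolve with the time actually elapsed before a 10ms timer callback runs
const measureTimerDelay = (blockingTime) => new Promise((resolve) => {
    const start = Date.now();
    setTimeout(() => resolve(Date.now() - start), 10);
    blockFor(blockingTime); // runs right after the timer has been scheduled
});

measureTimerDelay(200).then((delay) => {
    console.log(`the 10ms timer fired after ${delay}ms`);
});
```

The measured delay is at least the blocking time, not the 10ms requested from setTimeout.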

In practice, writing such code is a very bad idea and you would rather delegate this task to another process able to take advantage of parallelism.

Parallelism is the ability to run different tasks literally at the same time. It usually involves running multiple threads with different CPU cores.

Well, actually even with Nodejs you can run multiple threads (or child processes). Let's see an example with the newer Worker Threads API:

worker.js

const {
    parentPort
} = require('worker_threads');

function longComputation() {
    let sum = 0;
    for (let i = 0; i < 1e9; i++) {
        sum += i;
    }
    return sum;
}

parentPort.postMessage(longComputation());

and the main program

const {
    Worker,
} = require('worker_threads');

const longCalculation = () => new Promise((resolve, reject) => {
    const worker = new Worker('./worker.js');
    worker.on('message',resolve);
    worker.on('error', reject);
});

const run = async () => {
    console.time('exec');
    const p1 = longCalculation();
    const p2 = longCalculation();
    await p1;
    await p2;
    console.timeEnd('exec');
};

run();

Great! This has run in roughly 1000ms. It is also interesting how we have shifted back to the paradigm of the previous section with non blocking functions.

Note: attentive readers will have spotted that the longCalculation creates a new thread worker with each invocation. In practice you would rather use a pool of workers.
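As an illustration of the pool idea, here is a minimal sketch. It is not a production-ready pool: createPool is a hypothetical helper, the worker code is inlined as a string (using the eval option of the Worker constructor) only to keep the example in a single file, and a failed worker is simply not reused.

```javascript
const { Worker } = require('worker_threads');

// Worker code inlined as a string so the sketch fits in one file
const workerSource = `
const { parentPort } = require('worker_threads');
parentPort.on('message', (limit) => {
    let sum = 0;
    for (let i = 0; i < limit; i++) {
        sum += i;
    }
    parentPort.postMessage(sum);
});
`;

// Hypothetical helper: a fixed number of workers fed from a queue of jobs
const createPool = (size) => {
    const workers = [];
    const idle = [];
    const queue = [];

    // Give pending jobs to idle workers, as long as there are both
    const dispatch = () => {
        while (idle.length && queue.length) {
            const worker = idle.pop();
            const { limit, resolve, reject } = queue.shift();
            const onMessage = (result) => {
                worker.off('error', onError);
                idle.push(worker); // the worker is available again
                resolve(result);
                dispatch();
            };
            const onError = (error) => {
                worker.off('message', onMessage);
                reject(error); // the failed worker is not reused
            };
            worker.once('message', onMessage);
            worker.once('error', onError);
            worker.postMessage(limit);
        }
    };

    for (let i = 0; i < size; i++) {
        const worker = new Worker(workerSource, { eval: true });
        workers.push(worker);
        idle.push(worker);
    }

    return {
        run: (limit) => new Promise((resolve, reject) => {
            queue.push({ limit, resolve, reject });
            dispatch();
        }),
        close: () => Promise.all(workers.map((worker) => worker.terminate()))
    };
};
```

A call such as pool.run(1e9) then reuses one of the existing threads instead of creating a new one, and pool.close() terminates the workers so that the process can exit.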

How is this related to our testing frameworks?

As mentioned, speed is a must for the developer experience. Being able to run tests concurrently is therefore very important. On the other hand, it forces you to write independent tests: if you run tests concurrently you do not want them to mess up some shared data. It is often a good practice, but sometimes you need to maintain some state between tests and run various tests serially (one starts when the previous one is finished). This can make the design of a testing software API quite challenging...

Let's now try to explain the result we had for our "extreme" case:

Mocha and Tape run test files, and the tests within a file, serially, so they will roughly last N * M * WAIT_TIME ~= 100 * 10 * 0.1s ~= 100s (this is consistent with the measurements).
I can see from the progress in the console that AVA is likely running 4 test files in parallel on my machine. I understand from the documentation that within a file the tests should run concurrently (so that the whole test suite would run in roughly N/4 * WAIT_TIME ~= 25 * 0.1s ~= 2.5s), but there might be an extra cost for managing the four child processes (or workers?) because it is 10 times slower than the expected result.
Jest seems to run 3 test files in parallel on my machine and the tests within a file serially. So I expected N/3 * M * WAIT_TIME ~= 33 * 10 * 0.1s ~= 33s, yet it is slower. Again, managing child processes is clearly not free.
Zora and pta run every test concurrently, so we can expect the execution time to be related to the slowest test. In practice it takes some time to launch Nodejs, parse the scripts and require the modules, which can explain the little extra time. But the results stay steadily below one second whatever test profile we run.
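The back-of-the-envelope estimates above can be written down as a small function. This is only a sketch of the reasoning: estimate and its property names are made up for the illustration, and the model ignores every overhead (process start-up, module loading, inter-process communication). Durations are in milliseconds.

```javascript
// Idealised run times (ms) for N files of M tests, each test waiting waitTime
const estimate = ({ files, testsPerFile, waitTime }) => ({
    // files and tests one after the other (Mocha, Tape)
    serial: files * testsPerFile * waitTime,
    // "workers" files at a time, tests within a file one after the other (Jest)
    parallelFilesSerialTests: (workers) =>
        Math.ceil(files / workers) * testsPerFile * waitTime,
    // "workers" files at a time, tests within a file concurrently (AVA)
    parallelFilesConcurrentTests: (workers) =>
        Math.ceil(files / workers) * waitTime,
    // every test of every file concurrently (zora, pta)
    fullyConcurrent: waitTime
});

const extreme = estimate({ files: 100, testsPerFile: 10, waitTime: 100 });
console.log(extreme.serial); // 100000
console.log(extreme.parallelFilesSerialTests(3)); // 34000
console.log(extreme.parallelFilesConcurrentTests(4)); // 2500
console.log(extreme.fullyConcurrent); // 100
```

The 34000ms figure differs slightly from the ~33s above only because 100 files do not divide evenly among 3 workers, so the sketch rounds up to 34 batches.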

A small zora

Let's build a small zora to understand how it works (and achieves high concurrency) and how it tackles the problems mentioned in the introduction of the previous section.

We can write a testFunction function as follows:

// test.js
const testFunction = module.exports = (description, specFunction, testList) => {
    let error = null;
    let passing = true;
    const subTestList = [];
    // we return the routine so we can explicitly wait for it to complete (serial tests)
    const subTest = (description, fn) => testFunction(description, fn, subTestList).execRoutine; 

    // eagerly run the test as soon as testFunction is called
    const execRoutine = (async function () {
        try {
            await specFunction({test: subTest});
        } catch (e) {
            passing = false;
            error = e;
        }
    })();

    const testObject = Object.defineProperties({
        // we **report** test result with async iterators... in a non blocking way
        [Symbol.asyncIterator]: async function* () {
            await execRoutine;
            for await (const t of subTestList) {
                yield* t;// report sub test
                passing = passing && t.pass; // mark parent test as failing in case a subtest fails (but don't bubble the error)
            }
            yield this; // report this test
        }
    }, {
        execRoutine: {value: execRoutine},
        error: {
            get() {
                return error;
            }
        },
        description: {
            value: description
        },
        pass: {
            get() {
                return passing;
            }
        }
    });

    // collect the test in the parent's test list
    testList.push(testObject);

    return testObject;
};

and the test harness factory as follows:

// run.js
const testFunction = require('./test.js');
const reporter = require('./reporter.js');

const createHarness = () => {
    const testList = [];
    const test = (description, spec) => testFunction(description, spec, testList);

    return {
        test,
        async report() {
            for (const t of testList) {
                for await (const a of t) {
                    reporter(a);
                }
            }
        }
    };
};

const defaultTestHarness = createHarness();

// automatically start to report on the next tick of the event loop
process.nextTick(() => defaultTestHarness.report());

module.exports = defaultTestHarness;

The (dummy)reporter being:

// reporter.js
module.exports = testResult => {
    const isFailed = testResult.pass === false;
    console.log(`${!isFailed ? 'ok' : 'not ok'} - ${testResult.description}`);
    if (testResult.error) {
        console.log(testResult.error.stack);
        if (testResult.error.operator) {
            console.log(`operator: ${testResult.error.operator}`);
        }
        if (testResult.error.expected) {
            console.log(`expected: \n ${JSON.stringify(testResult.error.expected, null, 4)}`);
        }
        if (testResult.error.actual) {
            console.log(`actual: \n ${JSON.stringify(testResult.error.actual, null, 4)}`);
        }
    }
};

That's it! You have a whole testing library in less than 100 lines of source code, which can use any assertion library as long as it throws an error (the assert module from Nodejs' core is a good candidate!).
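The key design choice is that tests start eagerly while their results are consumed through async iterators, in declaration order. The following self-contained sketch isolates that idea. createTask and reportAll are hypothetical names, not part of zora: the slow task is declared first and finishes last, yet it is still reported first.

```javascript
const wait = (time) => new Promise((resolve) => setTimeout(resolve, time));

// Hypothetical stand-in for a test: the work starts as soon as it is created
const createTask = (label, duration, log) => {
    const routine = (async () => {
        await wait(duration);
        log.push(`finished ${label}`);
        return label;
    })();
    // ... but the result is only handed over when someone iterates
    return {
        async* [Symbol.asyncIterator]() {
            yield await routine;
        }
    };
};

const reportAll = async () => {
    const log = [];
    const tasks = [createTask('slow', 60, log), createTask('fast', 10, log)];
    const reported = [];
    // both tasks are already running; we only wait for results here
    for (const task of tasks) {
        for await (const label of task) {
            reported.push(label);
        }
    }
    return { log, reported };
};

reportAll().then(({ log, reported }) => {
    console.log(log); // completion order: fast, then slow
    console.log(reported); // reporting order: slow, then fast
});
```

Execution order and reporting order are decoupled, so the output stays deterministic even though the tests run concurrently.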

It will report failures: "where?", "what?" and "why?"

const assert = require('assert').strict;
const {test} = require('./run.js');

test(`some test`, () => {
    assert.deepEqual([1, 2, 3], [1, 2, 4], `array should be equivalent`);
});

will output:

It will run every test concurrently and will likely be faster than all the other megabyte-sized test runners:

test(`some async test that shows concurrency`, async t => {

    let foo = 'bar';

    t.test(`nested async`, async t => {
        await wait(100);
        assert.equal(foo, 'baz', 'see changed value although started before');
        foo = 'whatever'
    });

    t.test(`change foo faster`, t=>{
        assert.equal(foo, 'bar');
        foo = 'baz';
    })

});

Yet it will allow you to control the concurrency of your tests with regular JavaScript control flow:

test(`some serial test`, async t => {
    let foo = 'bar';

    // we specifically wait for that test to complete with the "await" keyword ...
    await t.test('nested inside', async t => {
        await wait(100);
        assert.equal(foo, 'bar', 'see the initial value of foo');
        foo = 'whatever';
    });

    // to start this one
    t.test('run only once "nested inside" has finished', () => {
        assert.equal(foo, 'whatever', 'see the changed value');
    });

});

If you wish to play with this basic test runner, you can fork the following gist and run the test program with node: node test_program.js

Conclusion

We have reviewed Nodejs' architecture and seen how it can allow high concurrency without necessarily involving parallelism. We have placed it in the context of testing software and seen how we could give a high quality user experience to developers and greatly improve their productivity.

We can also discuss whether parallelism has an added value in the context of the Nodejs testing experience. We already saw that it may not be the case regarding performance. Of course you could find some use cases where parallelism could bring you better performance. Or you could argue the test function in the benchmark is not "blocking enough" to be realistic (you would be right!) but as we said earlier, if you need parallelism in your tests because the code you are testing is slow, you are probably doing it wrong.

In practice I have personally been using zora (or pta) for a wide range of use cases and never had any performance issue:

In ship-hold, we run a whole range of integration tests against a database server in less than a second.
In mapboxgl-webcomponent, we run browser automation (screenshot capture, etc.) within a few seconds (this might actually be considered slow).
In smart-table, we run many unit tests in a second.
pta is tested by itself and the test suite contains child processes to run pta's CLI as a binary, all this in less than 2 seconds.

On the other hand, child processes have other interesting properties from a testing perspective, namely isolation. They allow you to run a given set of tests in an isolated, sandboxed environment.
However, this also leaves you with a few new issues to address (stream synchronisation, exit codes, etc.), making the code base inevitably grow. I would not say AVA is minimal (14.8mb), and neither is Jest (32mb). Of course they offer way more "features" than our few-bytes test runner. But are "runs previously failed tests first" or "re-organizes runs based on how long test files take" really required when a whole test suite runs within a couple of seconds?

The title refers to our ability, as developers, to sometimes over engineer solutions where simplicity is just what we need.
