The module system

Manoj Kumar Patra

Posted on October 2, 2024

# Why are modules required?

  • Having a way to split the codebase into multiple files.
  • Allowing code reuse across different projects.
  • Encapsulation (or information hiding).
  • Managing dependencies.

Module vs module system

💡 A module is the actual unit of software, while a module system is the syntax and tooling that lets us define modules and use them within our projects.

ECMAScript 6

ECMAScript 6 defined only the formal specification for ESM in terms of syntax and semantics, but it didn't provide any implementation details.

Browser vendors and the Node.js community were responsible for implementing ESM.

Node.js has supported ESM without a command-line flag since version 13.2; the implementation was marked stable in later releases.

# The revealing module pattern


const myModule = (() => {
  const privateFoo = () => {};
  const privateBar = [];
  const exported = {
    publicFoo: () => {},
    publicBar: () => {},
  };
  return exported;
})();


Here, we use an Immediately Invoked Function Expression (IIFE) to create a private scope, exporting only the parts that are meant to be public.
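As a minimal sketch of what this private scope buys us, the example below uses a hypothetical counterModule (the name and its functions are made up for illustration). The count variable is reachable only through the exported functions.

```javascript
// Hypothetical module built with the revealing module pattern
const counterModule = (() => {
  let count = 0; // private: lives only inside the IIFE's scope

  const increment = () => {
    count += 1;
    return count;
  };
  const current = () => count;

  // Only these two functions are revealed to the outside world
  return { increment, current };
})();

counterModule.increment();
counterModule.increment();
console.log(counterModule.current()); // 2
console.log(counterModule.count); // undefined, the variable is not exposed
```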

# CommonJS modules

  • require is a function that allows you to import a module from the local filesystem
  • exports and module.exports are special variables that can be used to export public functionality from the current module

Module loader implementation


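// Assumes fs has been loaded beforehand, e.g. const fs = require("fs");
function loadModule(filename, module, require) {
  // Wrap the module source in a function to give it a private scope
  const wrappedSrc = `(function (module, exports, require) {
      ${fs.readFileSync(filename, "utf8")}
    })(module, module.exports, require)`;
  // A direct eval can see this function's module and require parameters
  eval(wrappedSrc);
}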


Here, we are using readFileSync to read the module's content. While it is generally not recommended to use the synchronous version of the filesystem APIs, here it makes sense to do so, because loading a module in CommonJS is deliberately a synchronous operation. This approach makes sure that, if we are importing multiple modules, they (and their dependencies) are loaded in the right order.

require implementation


function require(moduleName) {
  console.log(`Require invoked for module: ${moduleName}`);
  const id = require.resolve(moduleName);
  if (require.cache[id]) {
    return require.cache[id].exports;
  }
  // module metadata
  const module = {
    exports: {},
    id,
  };
  // Update the cache
  require.cache[id] = module;
  // Load the module
  loadModule(id, module, require);
  // Return exported variables
  return module.exports;
}

require.cache = {};
require.resolve = (moduleName) => {
  /* resolve a full module id from the moduleName */
};


The require() function is synchronous.

Everything inside a module is private unless it's assigned to the module.exports variable. The content of this variable is then cached and returned when the module is loaded using require().

module.exports vs exports

  • The exports variable is just a reference to the initial value of module.exports.
  • We can only attach new properties to the object referenced by the exports variable.
  • Reassigning the exports variable doesn't have any effect, because it doesn't change the content of module.exports. It will only reassign the variable itself.
  • If we want to export something other than an object literal, such as a function, an instance, or even a string, we have to reassign module.exports.

# The resolving algorithm

  • File modules:
    • moduleName starts with / => it is treated as an absolute path to the module and returned as it is.
    • moduleName starts with ./ => a relative path, which is calculated starting from the directory of the requiring module.
  • Core modules: moduleName not prefixed with / or ./ => the algorithm will first try to search within the core Node.js modules.
  • Package modules: If no core module is found matching moduleName, then the search continues by looking for a matching module in the first node_modules directory that is found navigating up in the directory structure starting from the requiring module. The algorithm continues to search for a match by looking into the next node_modules directory up in the directory tree, until it reaches the root of the filesystem.
  • For file and package modules, both files and directories can match moduleName as follows:
    • <moduleName>.js
    • <moduleName>/index.js
    • The directory/file specified in the main property of <moduleName>/package.json
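Part of this algorithm can be observed through require.resolve and require.resolve.paths. The sketch below assumes it runs as a CommonJS script; some-hypothetical-package is a made-up name that is never actually loaded.

```javascript
const path = require('path');

// A core module resolves to its own name; no filesystem lookup is needed
const coreId = require.resolve('fs');
console.log(coreId); // fs

// For core modules there are no directories to search
const corePaths = require.resolve.paths('fs');
console.log(corePaths); // null

// For a package name, list the directories that would be searched, nearest first.
// The leading entries are node_modules directories found by walking up the tree.
const searchPaths = require.resolve.paths('some-hypothetical-package');
console.log(searchPaths.slice(0, 3));
console.log(path.basename(searchPaths[0])); // node_modules
```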

The module cache

Each module is only loaded and evaluated the first time it is required, since any subsequent call of require() will simply return the cached version.

The module cache is exposed via the require.cache variable.
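To see the cache at work, the following sketch writes a throwaway module to a temporary directory and requires it several times. The file name counter.js and the global __evalCount are assumptions made purely for this demonstration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a throwaway module that counts how many times it is evaluated
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
const file = path.join(dir, 'counter.js');
fs.writeFileSync(
  file,
  'globalThis.__evalCount = (globalThis.__evalCount || 0) + 1;\n' +
    'module.exports = { name: "counter" };\n'
);

const first = require(file);
const second = require(file);
console.log(first === second); // true, the second call hits the cache
console.log(globalThis.__evalCount); // 1, the module body ran only once

// The cache is keyed by the resolved filename
const id = require.resolve(file);
console.log(id in require.cache); // true

// Deleting the entry forces a fresh evaluation on the next require()
delete require.cache[id];
const third = require(file);
console.log(third === first); // false
console.log(globalThis.__evalCount); // 2
```

Invalidating the cache by hand like this is mostly useful in tests; in regular code it tends to cause confusing behavior.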

Circular dependencies



// Module a.js
exports.loaded = false;
const b = require("./b");
module.exports = {
  b,
  loaded: true,
};

// Module b.js
exports.loaded = false;
const a = require("./a");
module.exports = {
  a,
  loaded: true,
};

// main.js
const a = require("./a");
const b = require("./b");
console.log("a ->", JSON.stringify(a, null, 2));
console.log("b ->", JSON.stringify(b, null, 2));


# Module definition patterns

Named exports


// logger.js
exports.info = (message) => {
  console.log(`info: ${message}`);
};

// app.js
const logger = require("./logger");
logger.info("This is an informational message");


Exporting a function


// logger.js
module.exports = (message) => {
  console.log(`info: ${message}`);
};

module.exports.verbose = (message) => {
  console.log(`verbose: ${message}`);
};

// app.js
const logger = require("./logger");
logger("This is an informational message");
logger.verbose("This is a verbose message");


Exporting a class


class Logger {
  constructor(name) {
    this.name = name;
  }
  log(message) {
    console.log(`[${this.name}] ${message}`);
  }
  info(message) {
    this.log(`info: ${message}`);
  }
  verbose(message) {
    this.log(`verbose: ${message}`);
  }
}
module.exports = Logger;


Exporting an instance


class Logger {
  constructor (name) {
    ...
  }
  log (message) {
    ...
  }
}
module.exports = new Logger('DEFAULT');


NOTE: Multiple instances can still be created using the constructor property as follows:


const customLogger = new logger.constructor('CUSTOM');
customLogger.log('This is an informational message');


Monkey patching (modifying other modules or the global scope)


// patcher.js
require('./logger').customMessage = function () {
  console.log('This is a new functionality');
};

// app.js
require('./patcher');
const logger = require('./logger');
logger.customMessage();


We should avoid using this technique as much as possible.

A real-life use case for this technique would be to mock the http module when writing test cases so that it provides mocked responses instead of a real HTTP request. Example: nock module.

# ESM: ECMAScript modules

Difference between ESM and CommonJS modules

ESMs are static.

  • import statements must appear at the top level of a module, outside any control flow statement.

  • The name of an imported module cannot be generated dynamically at runtime using expressions; only constant strings are allowed.

Benefits of static imports

  • static analysis of the dependency tree
  • dead code elimination (tree shaking)

Using ESM in Node.js

  • Give the file the .mjs extension, or
  • add a type field with the value module to the nearest parent package.json.

Named exports and imports

In an ESM, everything is private by default and only entities exported with the export keyword are publicly accessible from other modules.


export function log (message) {
  console.log(message);
}


Namespace import


import * as loggerModule from './logger.js';


A named import, such as the one below, brings the entities directly into the current scope, so there is a risk of a name clash.


import { log, Logger } from './logger.js';


To resolve such a clash, rename the imported entity with the as keyword:


import { log as log2 } from './logger.js';


With ESM in Node.js, relative and absolute import specifiers must include the file extension of the imported module.

Default exports and imports

To export a single unnamed entity like in CommonJS with module.exports, we can do the following:


// The entity exported is registered under the name `default`
export default class Logger {
  ...
}

// Import
import MyLogger from './logger.js';


❓ What happens if we do import { default } from './logger.js';?

The execution will fail with a SyntaxError: Unexpected reserved word error. This happens because the default keyword cannot be used as a variable name directly in the scope. However, we can do import * as loggerModule from './logger.js'; and then access the default export as loggerModule.default.
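The fact that a default export is just an export registered under the name default can be checked by inspecting the module namespace. This sketch writes a hypothetical logger.mjs to a temporary directory and loads it with import(), so the whole thing can run as an ordinary CommonJS script.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { pathToFileURL } = require('url');

// Hypothetical ES module with one default export and one named export
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'default-demo-'));
const file = path.join(dir, 'logger.mjs');
fs.writeFileSync(
  file,
  'export default class Logger {}\n' +
    'export function log(message) { console.log(message); }\n'
);

const exportNames = import(pathToFileURL(file).href).then((loggerModule) => {
  // The namespace object lists the default export next to the named ones
  const names = Object.keys(loggerModule);
  console.log(names); // [ 'default', 'log' ]

  // The class is reachable through the default property
  const MyLogger = loggerModule.default;
  console.log(MyLogger.name); // Logger
  return names;
});
```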

Named exports are better than default export

  • Named exports are explicit and thus allow IDEs to support the developer with automatic imports, autocomplete, and refactoring tools.
  • Default exports might make it harder to apply dead code elimination (tree shaking).

Module identifiers

  • Relative specifiers like ./logger.js or ../logger.js
  • Absolute specifiers like file:///opt/nodejs/config.js
  • Bare specifiers - modules in the node_modules folder
  • Deep import specifiers - refer to a path within a package in node_modules

Async imports (Dynamic imports)

Async imports can be performed at runtime using the special import() operator.


import(dynamicModule)
  .then((res) => ...)

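A slightly fuller sketch of a dynamic import is shown below. The module name is built at runtime, which a static import statement would not allow; node:path is used only because it is always available.

```javascript
// The specifier is computed at runtime, which static imports cannot do
const moduleName = ['node', 'path'].join(':');

// import() returns a promise that resolves to the module namespace object
const loaded = import(moduleName).then((pathModule) => {
  const base = pathModule.basename('/tmp/file.txt');
  console.log(base); // file.txt
  return base;
});

// Failures (for example, an unknown module) surface as a rejected promise
loaded.catch((err) => {
  console.error('Could not load module:', err.message);
});
```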

ESM loading process

  • Create a dependency graph - the dependency graph is needed by the interpreter to figure out how modules depend on each other and in what order the code needs to be executed. The starting point for the dependency resolution is called the entry point.
    • Phase 1 - Construction (or parsing): find all the imports and recursively load the content of every module from the respective file. This is done in a depth-first manner. Modules already visited in this process are ignored on revisits.
    • Phase 2 - Instantiation: the interpreter walks the tree obtained from the previous phase from the bottom to the top. For every exported entity in each module, it keeps a named reference in memory, but doesn't assign any value just yet. After this, the interpreter does another pass to link the exported names to the modules importing them.
    • Phase 3 - Evaluation: Node.js finally executes the code so that all the previously instantiated entities can get an actual value. The execution order is bottom-up, respecting the post-order depth-first visit of the original dependency graph.
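The bottom-up evaluation order can be observed with a small experiment. The sketch below generates four hypothetical modules (main, a, b, and a shared dependency c) in a temporary directory; each one records its name in a global array when it is evaluated.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { pathToFileURL } = require('url');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'esm-order-'));
const write = (name, src) => fs.writeFileSync(path.join(dir, name), src);

// Each module pushes its own name when its body is evaluated
globalThis.evaluationOrder = [];
write('c.mjs', "globalThis.evaluationOrder.push('c');\n");
write('a.mjs', "import './c.mjs';\nglobalThis.evaluationOrder.push('a');\n");
write('b.mjs', "import './c.mjs';\nglobalThis.evaluationOrder.push('b');\n");
write(
  'main.mjs',
  "import './a.mjs';\nimport './b.mjs';\nglobalThis.evaluationOrder.push('main');\n"
);

// main.mjs is the entry point of the dependency graph
const orderPromise = import(pathToFileURL(path.join(dir, 'main.mjs')).href).then(
  () => globalThis.evaluationOrder
);

orderPromise.then((order) => {
  console.log(order.join(' -> ')); // c -> a -> b -> main
});
```

Note that c appears only once even though both a and b import it, and that main runs last even though it is the entry point.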

Loading process difference between CommonJS and ESM

CommonJS will execute all the files while the dependency graph is explored whereas in ESM, no code can be executed until the dependency graph has been fully built, and therefore module imports and exports have to be static.

Read-only live bindings

In ESM, when an entity is imported in the scope, the binding to its original value cannot be changed (read-only binding) unless the bound value changes within the scope of the original module itself (live binding), which is outside the direct control of the consumer code.

How is this different from CommonJS?

In CommonJS, require() returns the module's exports object, and any primitive value stored on that object (or read from it into a local variable, for example by destructuring) is a copy made at that moment. This means that, if a primitive such as a number or a string is changed at a later time inside the module, the requiring module won't be able to see the change.

In ESM, imported modules are tracked as references, so, we can be sure every module has an up-to-date picture of the other modules, even in the presence of circular dependencies.
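The difference can be sketched with two versions of a hypothetical counter module, one in each module system. Both files are generated in a temporary directory so the example stays self-contained, and the ES module is loaded through import().

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { pathToFileURL } = require('url');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'bindings-demo-'));

// CommonJS version: count is copied into the exports object once
const cjsFile = path.join(dir, 'counter.cjs');
fs.writeFileSync(
  cjsFile,
  'let count = 0;\nmodule.exports = { count, increment() { count += 1; } };\n'
);
const cjsCounter = require(cjsFile);
cjsCounter.increment();
console.log(cjsCounter.count); // 0, the exported copy did not follow the change

// ESM version: count is exported as a live binding
const esmFile = path.join(dir, 'counter.mjs');
fs.writeFileSync(
  esmFile,
  'export let count = 0;\nexport function increment() { count += 1; }\n'
);
const esmResult = import(pathToFileURL(esmFile).href).then((counter) => {
  const before = counter.count; // 0
  counter.increment();
  const after = counter.count; // 1, the binding reflects the new value
  // Consumers cannot write to the binding; Reflect.set reports the failure
  const writable = Reflect.set(counter, 'count', 42); // false
  console.log({ before, after, writable });
  return { before, after, writable };
});
```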

Monkey patching with ESM


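import fs from 'fs';

// Response handed to callers while the mock is enabled
let mockedResponse = null;

export function mockEnable (respondWith) {
  mockedResponse = respondWith;
  // Patch the default export; keep readFile's callback style
  fs.readFile = (path, cb) => setImmediate(() => cb(null, mockedResponse));
}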


The following import styles won't work for patching, because they give read-only live bindings:


import * as fs from 'fs';
import { readFile } from 'fs';


Another approach would be to use syncBuiltinESMExports:


import fs, { readFileSync } from 'fs';
import { syncBuiltinESMExports } from 'module';

fs.readFileSync = () => Buffer.from('Hello, ESM');
syncBuiltinESMExports();

console.log(fs.readFileSync === readFileSync); // true


syncBuiltinESMExports works only for built-in Node.js modules.

# More differences between ESM and CommonJS modules

  • ESM runs in strict mode implicitly, whereas a CommonJS module needs a "use strict" statement at the beginning of the file to opt in.

  • require, exports, module.exports, __filename, and __dirname are not defined in ESM.

Equivalent values can be recreated as follows:


// Create __filename and __dirname
import { fileURLToPath } from 'url';
import { dirname } from 'path';

// import.meta.url is a reference to the current module
// format: file:///path/to/current_module.js
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

// Create require
import { createRequire } from 'module';
const require = createRequire(import.meta.url);

  • this is undefined at the top level of an ES module, whereas in CommonJS this refers to exports.

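  • A plain import statement cannot load a JSON file as a module. One workaround is the createRequire approach mentioned above; recent Node.js versions also accept an import attribute, as in import data from './data.json' with { type: 'json' }.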
