JS MODULE LOADERS; or, a brief journey through hell
Brian Kirkpatrick
Posted on August 14, 2022
Introduction
There's a saying in defense circles: "amateurs talk strategy; professionals talk logistics". In other words, what seems like the most mundane element of complex engineering tasks (moving stuff on time from point A to point B) is a surprisingly critical element of success.
If I had to force an analogy here, I'd say for the developer community that "amateurs talk code, professionals talk integration". It turns out that writing code (especially from scratch) is surprisingly easy, whereas putting code together (especially code you didn't write yourself) is surprisingly difficult.
So, in the world of JavaScript, how do we put code together? Well, it depends. In the year of our lord two-thousand and twenty-two, 26 years after JavaScript was released, we still don't have a consistent way to integrate units of code together. We don't even have a consistent way to define what those units of code are!
The Problems
You'll note the word "consistent", though. There are many ways you could go about it, but few ways that are truly interoperable. Let's break this into three specific problems:
How are packages managed?
How are modules exported?
How are modules specified?
For example, the answer to #1 could be NPM, Yarn, or some kind of CDN. It could also be as simple as git submodules. (For reasons I won't dive too deeply into, I prefer the latter approach, in particular because it is completely decoupled from the module you are developing--and even the language you are developing in.)
The answer to #2 could be something like AMD/RequireJS modules, or CommonJS/Node, or browser-level script tags within a global scope (yuck!). Of course, Browserify or WebPack could help you here if you're really a big fan of the latter. I'm a big fan of AMD/RequireJS but there's no arguing that being able to run (and test) a codebase from the command line (locally or remotely) is HUGELY advantageous, both for development (just messing around) and for deployment (e.g., automated testing from a CI job).
The answer to #3 is a little more subtle, in no small part because with something like CommonJS/Node it's entirely implicit. With AMD/RequireJS, you have specific "require", "exports", and "module" parameters to a "define()" function. These exist in CommonJS/Node, too, but they're implied. Try printing "module" to console.log sometime and look at all the juicy details you've been missing.
SFJMs and UMD
But this doesn't include the contents of your package.json (if any) and even with AMD/RequireJS there's no specific standard for attaching metadata and other module properties. That's one reason I put together the SFJM standard in a previous dev.to article:
https://dev.to/tythos/single-file-javascript-modules-7aj
But regardless of your approach, the module loader (i.e., the export problem outlined in #2 above) is going to be sticky. That's one reason the UMD standard has emerged, for which there is an excellent writeup by Jim Fisher:
https://jameshfisher.com/2020/10/04/what-are-umd-modules/
UMD specifies a header to be pasted in front of your define-like closure. It's used by a few major libraries, at least in certain build configurations; THREE.js is one example:
https://github.com/mrdoob/three.js/blob/dev/build/three.js
The Header
The UMD header has several variations but we'll consider the following one from Jim Fisher's writeup:
// myModuleName.js
(function (root, factory) {
    if (typeof define === 'function' && define.amd) {
        // AMD. Register as an anonymous module.
        define(['exports', 'b'], factory);
    } else if (typeof exports === 'object' && typeof exports.nodeName !== 'string') {
        // CommonJS
        factory(exports, require('b'));
    } else {
        // Browser globals
        factory((root.myModuleName = {}), root.b);
    }
}(typeof self !== 'undefined' ? self : this, function (exports, b) {
    // Use b in some fashion.
    // attach properties to the exports object to define
    // the exported module properties.
    exports.action = function () {};
}));
There are effectively three use cases captured here: AMD/RequireJS; CommonJS/Node; and browser globals. Let's be honest, though--it's ugly. (This isn't a hack at Jim, this is a general UMD problem.) Among other things, here's what bugs me:
It's just plain bulky--that's a lot of text to paste at the top of every module
It actually tries too hard--I've never found a need to support browser globals, I just need my AMD/RequireJS-based single-file JavaScript modules to be able to run/test in a CommonJS/Node environment
The dependency listings are explicitly tied into the header--so it's not actually reusable. You have to customize it for every module! Compare this to simply specifying
const b = require('b');
within the closure factory itself and clearly there's a big difference.
I'm not interested in treating use cases equally. I'm writing in AMD/RequireJS, and capturing CommonJS/Node loading is the edge case.
The main problem with the last point is that AMD/RequireJS already gives us a very clean closure and an explicit module definition interface. It's CommonJS/Node that requires the hack. So, can we streamline the header and focus on adapting the latter to the former? Preferably in a way that is agnostic to dependencies? Well, since I'm writing this article, you can probably tell the answer is "yes".
My Approach
Let's start with symbols. What's available, and what isn't? Let's start with an AMD/RequireJS module already defined and working. If you put yourself in the mind of the CommonJS/Node interpreter, the first thing you'll realize is that, while "require", "exports", and "module" are already defined implicitly, the "define" factory is not. So, this is the root of our problem: we need to define a "define" (ha ha) factory that guides CommonJS/Node to interpret the module definition closure in a consistent way.
There's a good example of the conditional for this from UMD that we can borrow (and adjust slightly):
if (typeof(define) !== "function" || !define.amd) {
Interestingly, you can't just check to see if define exists. You need to make sure it doesn't actually exist AS THE AMD IMPLEMENTATION, because CommonJS/Node may retain the "define" symbol outside of this context--for example, in the scope of another module that is "require()"-ing this one. Bizarre, but true.
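As a rough illustration of why a bare existence check isn't enough, the sketch below runs the test against three hand-built scopes. The helper "needsDefineShim" and the scope objects are hypothetical, invented here only for demonstration. The sketch tests "define.amd" for truthiness, as the UMD header quoted earlier does, because AMD loaders set it to an object rather than the literal true (RequireJS, for example, uses an object with a "jQuery" flag).

```javascript
// Hypothetical helper: decides whether a given global scope still needs a
// CommonJS/Node "define" shim. The name and signature are illustrative only.
function needsDefineShim(scope) {
    return typeof scope.define !== "function" || !scope.define.amd;
}

// 1. Plain CommonJS/Node: no "define" at all.
const bareNode = {};

// 2. A "define" left behind by another module's shim: a function, but with
//    no "amd" marker, so it must not be mistaken for a real AMD loader.
const leftoverShim = { define: function(factory) {} };

// 3. A stand-in for a real AMD loader, which marks "define" with an "amd" object.
const amdLoader = {
    define: Object.assign(function(factory) {}, { amd: { jQuery: true } })
};

console.log(needsDefineShim(bareNode));     // true
console.log(needsDefineShim(leftoverShim)); // true
console.log(needsDefineShim(amdLoader));    // false
```

The second case is the one a simple "does define exist?" check would get wrong.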
So, now our goal is to define "define()". How can this be adapted to a CommonJS/Node scope? What we need is an identical "define()" interface:
It should take a single parameter, an anonymous function (which we will call the "factory" here) within whose closure the module contents are defined.
That function should have the following interface: "require" (a function that resolves/returns any module dependencies based on path); "exports" (an Object that defines what symbols will be available to external modules); and "module" (a definition of module properties that includes "module.exports", which points to "exports").
Define should call that function and return the export symbols of the module. (In the case of a SFJM-compatible definition, this will also include package.json-like module metadata, including a map of dependencies.)
The last point is interesting because a) there are already multiple references to the module exports, and b) even AMD/RequireJS supports multiple/optional routes for export symbols. And this is one of the stickiest issues at the heart of cross-compatibility: the "exports" symbol can persist and be incorrectly mapped by CommonJS/Node if not explicitly returned!
Thanks, Exports, You're The Real (thing preventing us from reaching) MVP
Jesus, what a nightmare. For this reason, we are going to adjust how our factory closure works:
We are going to explicitly "disable" the "exports" parameter by passing an empty Object ("{}") as the second parameter to the factory.
We are going to explicitly return the module exports from the factory implementation.
We are going to explicitly map the results of the factory call to the (file-level) "module.exports" property.
The combination of these adjustments means that, while AMD/RequireJS supports multiple routes, we are going to constrain our module implementations to explicitly returning export symbols from the factory call to route them to the correct CommonJS/Node symbol.
If you don't do this--and I lost some hair debugging this--you end up with a very "interesting" (read: batshit insane in only the way CommonJS/Node can be) bug in which the parent module (require()'ing a dependency module) gets "wires crossed" and has export symbols persist between scopes.
It's bizarre, particularly because it ONLY HAPPENS OUTSIDE THE REPL! So, you can run equivalent module methods from the REPL and they're fine--but trying to map it within the module itself (and then, say, calling it from the command line) will break every time.
So, what does this look like, practically? It means the "define" definition we are putting into the conditional we wrote above looks something like this:
define = (factory) => module.exports = factory(require, {}, module);
It also means our module closure starts with explicitly disabling the "exports" symbol so poor old CommonJS/Node doesn't get wires crossed:
define(function(require, _, module) {
    let exports = {};
Sigh. Some day it will all make sense. But then it won't be JavaScript. ;)
Examples
What does this look like "in the wild", then? Here's a GitHub project that provides a reasonably clear example:
https://github.com/Tythos/umd-light/
A brief tour:
"index.js" shows how the entry point can be wrapped in the same closure that uses the "require()" call to transparently load the dependency
"index.js" also shows us how to add a SFJM-style hook for (from CommonJS/Node) running an entry point ("main") should this module be called from the command line
".gitmodules" tells us that the dependency is managed as a submodule
"lib/" contains the submodules we use
"lib/jtx" is the specific submodule reference (don't forget to run "git submodule init" and "git submodule update"!); in this case it points to the following utility of JavaScript type extensions, whose single-file JavaScript module can be seen here:
https://github.com/Tythos/jtx/blob/main/index.js
This module uses the same "UMD-light" (as I'm calling it for now) header.
The Problem Child
And now for the wild card. There is, in fact, yet another module export approach we haven't mentioned: ES6-style module import/export usage. And I'll be honest--I've spent an unhealthy portion of my weekend trying to figure out if there's any reasonably uncomplicated way to extend cross-compatibility to cover ES6/MJS implementations. My conclusion: it can't be done--at least, not without making major compromises. Consider:
They're incompatible with the CommonJS/Node REPL--so you lose the ability to inspect/test from that environment
They're incompatible with a define closure/factory--so there go all of those advantages
They directly contradict many of the design principles (not to mention the implementation) of the web-oriented AMD/RequireJS standard, including asynchronous loading (it's in the name, people!)
They have... interesting assumptions about pathing that can be very problematic across environments--and since it's a language-level standard you can't extend/customize it by submitting MRs to (say) the AMD/RequireJS project (something I've done a couple of times)--not to mention the nightmare this causes in your IDE if path contexts get mixed up!
The tree-shaking you should be able to reverse-engineer from partial imports (e.g., symbol extraction) saves you literally zero anything in a web environment where your biggest cost is just getting the JS from the server and through the interpreter.
If anything, your best bet seems (like THREE.js) to only use them to break a codebase into pieces (if it's too big for a single-file approach, which I try to avoid anyway), then aggregate those pieces at build time (with WebPack, Browserify, etc.) into a module that uses a CommonJS/Node, AMD/RequireJS, or UMD-style header to ensure cross-compatibility. Sorry, ES6 import/export, but you may have actually made things worse. ;(