Why TurboRepo Will Be The First Big Trend of 2022
swyx
Posted on December 27, 2021
TurboRepo was acquired by Vercel recently and I caught up on Jared Palmer's excellent intro demo to see what the fuss is all about:
Here are quick notes for those too busy to view the whole thing, followed by personal reflections at the end.
TLDR
See the TLDR of this post in thread form:
Why Monorepos
You can refer to other sources for definitions of Monorepos (2022 edit: Nrwl just launched https://monorepo.tools/ which has their perspectives and comparisons), but we'll spend some time covering why they are a worthwhile goal:
- You can easily make cross-cutting code changes across multiple applications (e.g. /frontend and /backend) in one atomic commit
- You can easily search across all projects
- Single source of truth for many environment concerns you will want to standardize across your company, for example:
  - dependency management (important deps in one package.json)
  - code reuse of shared packages (e.g. /design-system or /common-utils or /schema)
  - configs (ESLint, TSconfig, etc.)
  - tests (from unit to e2e)
- For library authors, it is also easier to publish packages with dependencies on each other.
Major JS ecosystem tools like React, Jest, pnpm, Next.js, and Yarn itself have moved to Monorepos, as have small startups and large companies like FB and Google.
Origin of TurboRepo
The origin story of TurboRepo started with this looongstanding open issue on TSDX from Nate Moore:
As an early volunteer on TSDX I studiously avoided this issue because I never worked at a company with a large monorepo, and thought that it should be solved by dedicated tools like yarn workspace, which at the time was just gaining traction itself.
To solve this, Jared tried to extract Lerna into a monorepo tool, and when researching how big monorepo shops like Facebook and Google did task running, discovered that a lot of their advanced techniques had not made it into the larger JS ecosystem.
So, TurboRepo was started with 3 objectives:
- make a monorepo tool that utilizes as many of these advanced techniques as possible with zero config
- make it easy to incrementally adopt (eg when moving from Lerna)
- make sure that it scales (eg API design and architectural choices are flexible enough)
The fuller story of TurboRepo is told by Jared in this thread:
What TurboRepo does
The basic principle of TurboRepo is to never recompute work that has been done before.
To do this, it generates a dependency graph of your build pipeline from a turbo config in package.json, executes each task in turn, fingerprints the inputs of each task, and caches its outputs.
When it is run a second time, if it finds work that matches a fingerprint, it restores from cache, and replays the logs.
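To make the "never recompute" idea concrete, here is a minimal TypeScript sketch of fingerprint-based caching. This is not how Turbo is implemented: the names (fingerprint, runCached, TaskResult) are hypothetical, the cache is just an in-memory Map, and FNV-1a is a stand-in for whatever hash the real tool uses.

```typescript
// Hypothetical shapes for illustration only; not Turbo's real internals.
interface TaskResult {
  logs: string[];                  // what the task printed while running
  outputs: Record<string, string>; // output path -> file contents
}

type Task = (inputs: Record<string, string>) => TaskResult;

// FNV-1a: a tiny non-cryptographic hash, used here purely as a stand-in.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Fingerprint = task name + every input file, sorted so key order never matters.
function fingerprint(taskName: string, inputs: Record<string, string>): string {
  const files = Object.keys(inputs)
    .sort()
    .map((path) => [path, inputs[path]]);
  return fnv1a(JSON.stringify([taskName, files]));
}

const cache = new Map<string, TaskResult>();

// Run the task only if this exact fingerprint has never been seen before.
function runCached(
  taskName: string,
  inputs: Record<string, string>,
  task: Task
): { result: TaskResult; cacheHit: boolean } {
  const key = fingerprint(taskName, inputs);
  const cached = cache.get(key);
  if (cached !== undefined) {
    // Cache hit: restore the outputs; the caller can replay cached.logs.
    return { result: cached, cacheHit: true };
  }
  const result = task(inputs);
  cache.set(key, result);
  return { result, cacheHit: false };
}
```

Calling runCached twice with the same task name and inputs runs the task once; changing any input file changes the fingerprint and forces a rebuild.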
How to use TurboRepo
The main CLI surface area is surprisingly small:
- npx create-turbo@latest turbo-demo scaffolds a monorepo with apps (docs, web) and packages (design system and shared configs (eslint, tsconfig))
- turbo run build builds all apps at once, but importantly, when you run this command again the second build completes in 100ms because everything is cached. There is a long list of flags you can add to modify what turbo run does and outputs.
- turbo prune --scope=<target> generates a sparse/partial monorepo with a pruned lockfile for a target package.
- Remote Caching commands: turbo login and turbo link (explained later)
The turbo config key
TurboRepo uses a special key in package.json called turbo (docs here), and it is here that topological relationships between build tasks (and where to fingerprint for cache artifacts) are defined:
{
  "turbo": {
    "baseBranch": "origin/main",
    "pipeline": {
      "build": {
        "dependsOn": ["^build"],
        "outputs": [".next/**"]
      },
      "test": {
        "dependsOn": ["^build"],
        "outputs": []
      },
      "lint": {
        "outputs": []
      },
      "dev": {
        "cache": false
      }
    }
  }
}
This helps Turbo create a Directed Acyclic Graph of your build that it can then walk in reverse for building and checking against its cache. You can even use the --graph flag to visualize your build graph with Graphviz.
(Having tried out visualization tools before, imo this is a fun demo but not practically all that useful 🤷‍♂️)
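For intuition on what walking a Directed Acyclic Graph "in reverse" means, here is a small TypeScript sketch of a depth-first topological sort. It assumes "^build" means "build the packages I depend on first". The workspace shape (web, docs, ui, tsconfig) and the buildOrder function are made up for illustration and are not Turbo's actual code.

```typescript
// Hypothetical workspace: package name -> internal packages it depends on.
type PackageGraph = Record<string, string[]>;

// Depth-first topological sort: dependencies always come before dependents.
function buildOrder(graph: PackageGraph): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (pkg: string): void => {
    const seen = state.get(pkg);
    if (seen === "done") return;
    if (seen === "visiting") {
      // A DAG has no cycles, so hitting an in-progress node is an error.
      throw new Error(`Dependency cycle detected at ${pkg}`);
    }
    state.set(pkg, "visiting");
    for (const dep of graph[pkg] ?? []) {
      visit(dep); // build what this package depends on first
    }
    state.set(pkg, "done");
    order.push(pkg);
  };

  for (const pkg of Object.keys(graph)) {
    visit(pkg);
  }
  return order;
}

const workspace: PackageGraph = {
  web: ["ui", "tsconfig"],
  docs: ["ui"],
  ui: ["tsconfig"],
  tsconfig: [],
};

// Leaf packages such as tsconfig come out first, apps such as web and docs last.
console.log(buildOrder(workspace));
```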
The other important thing to know is that you can run all these tasks together and Turbo will parallelize as much as possible:
turbo run build test lint
To understand what is running in parallel and debug build pipelines, you can even make Turbo output a profile with the --profile flag to inspect the traces in Chrome DevTools!
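As a rough mental model for "parallelize as much as possible", the TypeScript sketch below groups tasks into waves, where everything in a wave has all of its prerequisites finished and could run at the same time. The task ids (like ui#build) and the parallelWaves function are hypothetical, and a real scheduler would likely start each task the moment its dependencies finish rather than waiting for a whole wave.

```typescript
// Hypothetical task graph: task id -> task ids that must finish first.
type TaskGraph = Record<string, string[]>;

// Group tasks into waves; every task in a wave can run concurrently.
function parallelWaves(graph: TaskGraph): string[][] {
  const remaining = new Set<string>(Object.keys(graph));
  const finished = new Set<string>();
  const waves: string[][] = [];

  while (remaining.size > 0) {
    // A task is ready once all of its prerequisites have finished.
    const ready = Array.from(remaining).filter((task) =>
      (graph[task] ?? []).every((dep) => finished.has(dep))
    );
    if (ready.length === 0) {
      throw new Error("Cycle or unknown dependency in task graph");
    }
    for (const task of ready) {
      remaining.delete(task);
      finished.add(task);
    }
    waves.push(ready);
  }
  return waves;
}

// Mirrors the config above: build and test wait on upstream builds, lint waits on nothing.
const tasks: TaskGraph = {
  "ui#build": [],
  "ui#lint": [],
  "web#lint": [],
  "web#build": ["ui#build"],
  "web#test": ["ui#build"],
};

console.log(parallelWaves(tasks));
```

In this made-up graph the lint tasks and ui#build land in the first wave, and web#build and web#test land in the second.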
Remote Caching
Remote caching is a beta feature, but is set to be far and away the showstopper in making TurboRepo scale. Normally, caches are generated and checked locally, so if you are reviewing code that a coworker has written, you'll have to build it locally too.
Sounds inefficient? We can fix that.
Remote Caching shares that cache globally (this is secure to the extent that hashes are secure), turning TurboRepo from a "single player" experience into a "co-op multiplayer" mode. The analogy that resonates a lot with users is that this is basically "Dropbox for your dist directory".
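The "Dropbox for your dist directory" idea can be sketched as a two-tier lookup: check the local cache, then the shared one, and only build when both miss. The TypeScript below is a toy model under my own assumptions: CacheStore, MemoryStore and fetchOrBuild are hypothetical names, and the "remote" store is just another in-memory Map rather than a real network service.

```typescript
// Hypothetical cache interface; a real remote store would talk to a server.
interface CacheStore {
  get(key: string): string | undefined;
  set(key: string, artifact: string): void;
}

class MemoryStore implements CacheStore {
  private readonly entries = new Map<string, string>();
  get(key: string): string | undefined {
    return this.entries.get(key);
  }
  set(key: string, artifact: string): void {
    this.entries.set(key, artifact);
  }
}

type Source = "local" | "remote" | "built";

// Look up an artifact by fingerprint: local first, then remote, else build it.
function fetchOrBuild(
  key: string,
  local: CacheStore,
  remote: CacheStore,
  build: () => string
): { artifact: string; source: Source } {
  const localHit = local.get(key);
  if (localHit !== undefined) {
    return { artifact: localHit, source: "local" };
  }
  const remoteHit = remote.get(key);
  if (remoteHit !== undefined) {
    local.set(key, remoteHit); // keep a local copy for next time
    return { artifact: remoteHit, source: "remote" };
  }
  const artifact = build();
  local.set(key, artifact);
  remote.set(key, artifact); // share the result with teammates and CI
  return { artifact, source: "built" };
}
```

If a coworker has already built the same fingerprint, your first lookup comes back as "remote" instead of "built", which is the whole pitch.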
This is where Vercel's backing comes in - they are offering free remote caching on TurboRepo builds - you'll still need to make a Vercel account, and they may charge for this later - but this works whether or not your app is built or hosted on Vercel. Brilliant move for everyone concerned! All TurboRepo users get free speedups, Vercel gets a bunch of signups (with network effect) and a possible future revenue source.
Usage is pretty simple:
npx turbo login # login to Vercel
npx turbo link
That's it! Could not be easier, and offers free speedups.
The Future
Jared ended the livestream by making a few comments on the TurboRepo roadmap:
- Telemetry
- Sharding Parallel Tasks in other processes (currently, TurboRepo runs parallel tasks in the same single-threaded process like Node does; to actually make use of full concurrency it should distribute that work to other processes. Temporal, the project I work on, could be an interesting tool for that in future)
- Presets (referred to as "Turbo Season 2")
- Smaller features
  - Public/private security model like npm
  - More intelligent watch mode
- There will probably be Enterprise features too.
You can vote on feature ideas on the TurboRepo GitHub Community as well.
What About Nx?
TurboRepo is most often compared to Nx, so I'm very grateful that Victor Savkin (creator of Nx) has written a page on the Nx docs detailing the differences he sees vs TurboRepo.
He's also made benchmarks for Nx vs TurboRepo you can try out:
Personal Takeaways
TurboRepo is a big deal for the JS community not just because it addresses build speeds (which are always a crowd pleaser), but also because it is a well-defined abstraction that brings a lot of value out of the box, with a declarative build pipeline, great debugging/profiling options, and great docs.
With 74% of its code in Go, TurboRepo is a great example of the Systems Core, Scripting Shell thesis, proving out the idea that the age of "JS tools in JS" is over because the need for speed on hot paths outweighs contributor learning curve concerns.
Many people in the JS community (like my old self) have heard about the benefits of monorepos, but have been held back by the lack of good tooling tackling this problem head on. While there is a long list of monorepo tooling tackling various parts of the problem, I see TurboRepo as leading the charge for the new wave of monorepo tooling that will rise to prominence in the Third Age of JavaScript, thanks to strong backing and great developer marketing from Jared and Team Vercel.
Followup: Nx Chat
I did a followup chat with the Nx founders to learn more about how they think about Monorepo Tooling:
Further reading
Robin Wieruch did a much better writeup on what Monorepos are with code examples and more ideas on use cases!