Some things I learnt from working on big frontend codebases
Stefano Magni
Posted on June 1, 2023
Until now (May 2024), I have had three experiences working on very big front-end (React+TypeScript) codebases: WorkWave RouteManager, Hasura Console, and Preply.com. The first two are ~250K LOC, while Preply.com is close to 1M LOC, and the experiences were very different. In this article, I report the most important problems I saw while working on them: things that usually are not a big deal on smaller codebases but that become a source of big friction when the app scales.
Photo by Sander Crombach on Unsplash
Changelog
- May, 2024
- add Non-straightforward CI scripts
- add Components accepting className
- add Not tracking architectural decisions
- add Spreading external dependencies and implementation details
- add Hide stores implementation details
- add Consuming Swiss army knives
- add Major product changes and refactors
- update Never updating the NPM dependencies
- June, 2023
- first article publication
My direct experience
First of all, let me describe the main characteristics of the three projects:
WorkWave RouteManager: the product is very complex due to some back-end limitations that force the front-end to take on more complexity. Thanks to the strong presence of a great front-end architect (that's Matteo Ronchi, by the way), the codebase can be considered front-end perfection. The codebase is completely new (rewritten from scratch from 2020 to 2022), new tools were tried and adopted at a high cadence (for instance: we started using Recoil way sooner than the rest of the world, we migrated the codebase from Webpack to Vite in 2021, etc.), and the coding patterns are respected everywhere. The team was made up of four front-end engineers, including the architect and me. Here I was the team leader of the front-end team.
Hasura Console: the complexity of the project is not so high, but the startup needs (pushing out new features as soon as possible) and the very back-end nature of the platform resulted in huge technical debt and antipatterns that became big friction points for the developers working on the front-end. The team was made up of 12 front-end engineers; later on, the company decided to ditch the front-end project, creating a 50x smaller one, and keep only the back-end/CLI projects. Here, I joined as a senior front-end engineer and then became the tech lead of the platform team.
Preply.com: Preply scaled following a very experiment- and data-driven approach, given its B2C nature and the millions of users taking lessons on the platform every day. The natural business orientation led to heavily outdated front-end dependencies and hard-to-work-with front-end projects. At the same time, Preply's goal to become a strong brand and to grow in the B2B market, together with its tireless dedication to its internal culture and employees' satisfaction, drove the company to care a lot about internal tech excellence, to create a DevEx team inside the Platform team, and to lead some exemplary tech initiatives. The company counted ~40 front-end engineers, some of them dedicated to React Native. I joined the platform team as a senior front-end engineer and then moved to the Design System team.
What follows is a non-exhaustive list of examples coming from some of the characteristics/activities/problems I saw, grouped by category.
Generic approaches
Managing more cases than the needed ones
This innocent approach leads to big problems and a waste of time when you have to refactor a lot of code trying to maintain the existing features. Some examples are:
- Components/functions with optional props/parameters and fallback default values: when you need to refactor the components, you need to understand who the indirect consumers of the default values are... But what happens if the usage of the default values is driven by network responses? You need to understand and simulate all the edge cases! And what happens if you find out that the default values are not used at all? I once saw a colleague of mine waste four hours during a refactor on an unused default value...
- Types that are typed as a generic string or a generic Record<string, any> when in reality the possible values are known in advance. The result is a lot of code that manages generic strings and objects, while managing the real, finite number of cases would be 10x easier. Again, when you need to refactor the code managing "generic" values, you are going to waste time.
I touched on these topics in my How I ease the next developer reading my code article.
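As a minimal sketch of the second point, assuming a hypothetical UserRole domain and a hypothetical homePagePath function (neither comes from the codebases above): typing the finite set of values as a union lets the compiler point at every place that needs attention when a case is added or removed.

```typescript
// Hypothetical domain: the roles are known in advance, so they are typed
// as a union instead of a generic `string`.
type UserRole = 'tutor' | 'student' | 'manager'

function homePagePath(role: UserRole): string {
  switch (role) {
    case 'tutor':
      return '/tutors'
    case 'student':
      return '/students'
    case 'manager':
      return '/managers'
    default: {
      // If a new role is added to UserRole, this assignment stops compiling,
      // pointing the refactorer to every switch that needs an update.
      const unhandledRole: never = role
      throw new Error(`Unhandled role: ${String(unhandledRole)}`)
    }
  }
}
```

With a generic string parameter, the same function would need a fallback branch for values that can never occur, which is exactly the "managing more cases than needed" problem.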
Leaving dead code around
You refactor a module, you remove an import of an external module and you are fine. What happens if the module was the last consumer of the external one? The external module becomes dead code that will not be embedded in the application (nice) but that will confuse everyone that's going around the codebase looking for solutions/utilities/patterns and will confuse the future refactorer that will blame anyone that left the unused module there!
And obviously, it's a waterfall... the external module could import other unused modules and they could depend on an external NPM dependency that could be removed from the package.json, etc.
Internal code dependencies and boundaries
Not enforcing (through ESLint rules or through a proper monorepo structure) strong boundaries among product features/libraries/utilities brings unexpected breaks as a result of innocent changes. Something like FeatureA importing from internal modules of FeatureB, which imports from internal modules of FeatureA and FeatureC, etc., leads you to break 50% of the product by changing a simple prop in one of FeatureA's components. And if you have a lot of JavaScript modules never converted to TypeScript, you will also have a hard time understanding the dependency tree among features...
I strongly suggest reading React project structure for scale: decomposition, layers and hierarchy.
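One possible way to enforce such boundaries through ESLint is the core no-restricted-imports rule. The fragment below is only a sketch of a flat-config entry: the "@app/features" alias, the folder layout, and the rule message are assumptions made for the example, and a real setup would also need the usual TypeScript parser configuration.

```typescript
// Sketch of an ESLint flat-config entry (alias and folder layout are assumptions).
const featureBoundariesConfig = [
  {
    files: ['src/**/*.{ts,tsx}'],
    rules: {
      'no-restricted-imports': [
        'error',
        {
          patterns: [
            {
              // Allowed:   import { Cart } from '@app/features/cart' (public entry point)
              // Forbidden: import { useCartInternals } from '@app/features/cart/hooks/useCartInternals'
              group: ['@app/features/*/*'],
              message: "Import from the feature's public entry point instead of its internal modules.",
            },
          ],
        },
      ],
    },
  },
]
```

Relative imports inside a feature are not affected by this pattern, so a feature can still freely organize its own internal modules.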
Implicit dependencies
They are the hardest things to deal with. Some examples?
- Global styles that impact your UI's look&feel in unexpected ways
- A global listener on some HTML attributes that does things without the developer knowing about them
- A generic MSW mock server that all the tests use, where it's impossible to know which handlers are used by which tests
Again, pity the refactorer who will deal with those. Instead, explicit imports, descriptive HTML attributes, inversion of control, etc. allow you to easily recognize who consumes what.
Spreading external dependencies and implementation details
External dependencies should be hidden and consumed through code you control. Writing a custom function called addOneDayToDate is better than spreading dateFns.addDays(currentDate, 1) everywhere, because the dependency on date-fns is centralized in one function that is easy to test and to change.
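A minimal sketch of such a wrapper follows. To keep the example dependency-free, the body uses the native Date API; in a codebase that relies on date-fns, this body is the single place where the dateFns.addDays call would live. The module name in the comment is hypothetical.

```typescript
// dates.ts (hypothetical module): the only place that knows how dates are manipulated.
// Consumers import addOneDayToDate and never touch the underlying date library.
function addOneDayToDate(date: Date): Date {
  // Copy the input so that the caller's date is left untouched.
  const nextDay = new Date(date.getTime())
  nextDay.setDate(nextDay.getDate() + 1)
  return nextDay
}

// January 31 rolls over to February 1 (local time).
const deliveryDate = addOneDayToDate(new Date(2024, 0, 31))
```

Swapping the date library later means changing this one function and its tests, not every call site.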
Big modules
This is another very subjective topic: I prefer to have a lot of small and single-function modules compared to long ones. I know that a lot of people prefer the opposite so it's mostly a matter of respecting what is important for the team.
Code readability
I'm a fan of The Art of Readable Code book, and after spending 2.5 years working on a big and complex codebase with zero (!!!) tests, I can tell how important code readability is.
This also really depends on the number of developers working on a codebase, but I think it's worth investing in some shared coding patterns that must be enforced in PRs (or even better if they can be automated through Prettier or similar tools).
I publicly shared the ones we were using in WorkWave in this 7-article series: RouteManager UI coding patterns. The internal rule we had was that "patterns must be recognizable in the code, but not authors".
No silver bullets here, the important thing IMO is that readability and refactorability are kept in mind by everyone when writing code.
Uniformity is better than perfection
If you are about to refactor a module but you do not have time to refactor also the two modules that are coupled to it... Consider not refactoring it to leave the three modules uniform among them (uniformity means predictability and less ambiguity).
Working flow
Not tracking architectural decisions
Architectural decisions and changes are key to comprehending why a project was designed and how it evolved over time. Usually, these decisions are not 100% reflected in the codebase since big codebases always require incremental approaches.
It is important to track those decisions to avoid dealing with approaches partially applied, refactors partially done, etc., without a precise idea about the timeline of those decisions and what they were trying to solve.
Usually, this problem explodes when the engineers who remember those decisions leave the company, and the new ones are doomed to remain ignorant forever.
On a small scale, this also refers to the awkward changes and/or shenanigans you make to get something done. See this great example by Josh W. Comeau.
No PR description and big PRs
That's such an important topic that I wrote four articles about it. Start with the most important one: Support the Reviewers with detailed Pull Request descriptions
And if you are curious you can dig into some real-life examples I documented here
- A Case History: Analysing Hasura Console's code review process
- https://dev.to/noriste/re-building-a-branch-and-telling-a-story-to-ease-the-code-review-485o
- Improving Hasura's Internal PR Review process
Suggesting big changes and approaches during code reviews
PRs are not the best place to suggest big changes or completely change the approach because you are indirectly blocking the release of a feature or a fix. Sometimes it is crucial to do it, but the initial analysis and estimation steps, pair programming sessions, etc. usually work better to help shape the approach and the code.
When to fix the technical debt?
That's a great question, and there is no silver bullet here... I can only share my experience so far:
- In WorkWave we were used to dealing with technical debt on a daily basis. Fixing tech debt is part of the everyday engineers' job. This can slow down the feature development in favour of having a deep knowledge of the context and keeping the codebase in a good shape. It's like knowing that you are slowing down today's development to keep tomorrow's development at the current pace.
- In Hasura, we could not deal with technical debt due to the need to deliver new features. This resulted in a lot of frontenders going slower than their potential, sometimes introducing bugs, and offering an imperfect UX to the customers. It happened after years, obviously.
- In Preply, engineers can dedicate 20% of their time to tech excellence initiatives, some of them driven by the company and other ones proposed by the teams themselves.
You can read more about a good example of Hasura's problems in my Frontend Platform use case - Enabling features and hiding the distribution problems article. Also, you could read what happened to our E2E tests here after all the tech debt problems we were facing.
Major product changes and refactors
Major product changes (the complete rewrite of WorkWave RouteManager, Preply's complete rebrand, etc.) are also perfect for introducing refactors or clearing tech debt that has been there for ages. The reason is that all the knowledge accumulated in the previous years gives us a more comprehensive vision of what is needed and what needs to be cleared, leaving the new product way better than when it started (a sort of greenfield project inside an existing product).
No front-end oriented back-end APIs
By "no front-end oriented" I mean APIs not designed with the end customers' UX in mind, with a lot of complexity pushed to the front-end in order to keep the back-end development lean (e.g., embedding a lot of DB queries in the front-end to avoid exposing a new API from the back-end). This approach is natural during the initial evolution of a product but will lead to more and more complex front-ends when the product needs to scale.
Never updating the NPM dependencies
Again, based on my own experiences:
- In WorkWave, I used to update the external dependencies on a weekly basis. Usually, it took me 30 minutes, sometimes 4 hours.
- In Hasura, we were used to not updating them, finding ourselves enabling legacy-peer-deps by default, leveraging NPM's overrides, and being unable to update any GraphQL-related dependency, on top of having a lot of PRs that completely broke the build because of a new dependency.
- In Preply, the outdated TypeScript version made it impossible to enable exactOptionalPropertyTypes and noUncheckedIndexedAccess, a gap that caused more than one production incident. At the same time, the need to become SOC2 compliant (necessary to expand in the B2B market) made caring about the dependencies a first-class citizen (after a big months-long initiative to update all of them).
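To give an idea of the kind of bug noUncheckedIndexedAccess protects from, here is a small sketch (the firstLessonTitle function is hypothetical). With the flag enabled, lessons[0] is typed as string | undefined instead of string, so the empty-list case cannot be forgotten; without the flag, the unguarded version compiles and crashes at runtime on an empty list.

```typescript
// With "noUncheckedIndexedAccess": true in tsconfig.json, `lessons[0]` is typed as
// `string | undefined`, and calling `.toUpperCase()` on it without a check does not compile.
function firstLessonTitle(lessons: string[]): string {
  const first = lessons[0]
  if (first === undefined) {
    return 'No lessons scheduled'
  }
  return first.toUpperCase()
}
```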
And since maintaining dependencies has a cost, you should carefully consider whether you really need an external dependency or not. Is it maintained? Does it solve a complex problem I prefer to delegate to an external party?
As an alternative approach, valid only for very very small projects, you can also consider copying/pasting the code of some dependencies inside a "vendor" directory, linking the original project and tracking which version the code belongs to (at the cost of not being able to update it and of others having to know they should not install the same dependency).
TypeScript
Bad practice: Generic TypeScript types and optional properties
It is very common to find types like this
type Order = {
  status: string
  name: string
  description?: string
  at?: Location
  expectedDelivery?: Date
  deliveredOn?: Date
}
that should be represented with a discriminated union like this
type Order = {
  name: string
  description?: string
  at: Location
} & ({
  status: 'ready'
} | {
  status: 'inProgress'
  expectedDelivery: Date
} | {
  status: 'complete'
  expectedDelivery: Date
  deliveredOn: Date
})
that is more verbose but acts as pure domain documentation, removes tons of ambiguity, and allows writing better and clearer code.
The topic is so important and has so many great advantages that I dedicated an article to it: How I ease the next developer reading my code.
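To illustrate how consuming code benefits from the discriminated union, here is a sketch that narrows on status. The describeOrder function is hypothetical, and GeoLocation is a placeholder standing in for the Location type of the example above.

```typescript
// Placeholder for the Location type used in the example above.
type GeoLocation = { city: string }

type Order = {
  name: string
  description?: string
  at: GeoLocation
} & (
  | { status: 'ready' }
  | { status: 'inProgress'; expectedDelivery: Date }
  | { status: 'complete'; expectedDelivery: Date; deliveredOn: Date }
)

function describeOrder(order: Order): string {
  switch (order.status) {
    case 'ready':
      // `order.expectedDelivery` does not exist here: accessing it is a compile error.
      return `${order.name} is ready in ${order.at.city}`
    case 'inProgress':
      // No optional chaining needed: the date is guaranteed for this status.
      return `${order.name} is expected on ${order.expectedDelivery.toDateString()}`
    case 'complete':
      return `${order.name} was delivered on ${order.deliveredOn.toDateString()}`
  }
}
```

With the first, all-optional version of the type, every branch would need defensive checks for dates that may or may not be there.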
Type assertions (as)
Type assertions are a way to tell TypeScript "shut up, I know what I'm doing", but the reality is that you barely know what you are doing, especially when thinking about the consequences of what you are doing...
This happens very frequently in tests, where big objects are "typed" with type assertions... resulting in the objects going out of date compared to the original type... But you realize it only when the tests fail, and by then you have left room for a lot of doubts about the test failures...
The solution: type everything correctly and, if really needed, prefer @ts-expect-error with an explanation of the error you expect.
Read Why You Should Avoid Type Assertions in TypeScript to know more about the topic (and keep in mind that even the JSON.parse example shown there can be typed by using Zod parsers).
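A small sketch of the test-fixture problem, using a hypothetical User type: the asserted fixture compiles even though a required property is missing, while the annotated one is forced to follow the type.

```typescript
type User = { id: string; name: string; email: string }

// ❌ Compiles even though `email` is missing: the assertion silences the compiler,
// and the fixture silently goes out of date when User changes.
const assertedUser = { id: '1', name: 'Ada' } as User

// ✅ An annotation forces the fixture to follow the type: removing `email` below
// is a compile error.
const typedUser: User = { id: '1', name: 'Ada', email: 'ada@example.com' }
```

Any code reading assertedUser.email gets undefined at runtime while TypeScript still claims it is a string.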
@ts-ignore instead of @ts-expect-error and broad scope
Unlike @ts-ignore (another way to shut up TypeScript), @ts-expect-error is flagged by TypeScript as unused once the underlying error disappears, so it can be cleaned up in the future.
Moreover, @ts-expect-error should be applied to the smallest possible scope to avoid TS accepting unintended errors.
// ❌ don't
// @ts-expect-error TS 4.5.2 does not infer correctly the type of typedChildren.
return React.cloneElement(typedChildren, htmlAttributes); // <-- the whole line is impacted by @ts-expect-error

// ✅ do
return React.cloneElement(
  // @ts-expect-error TS 4.5.2 does not infer correctly the type of typedChildren.
  typedChildren, // <-- only typedChildren is impacted by @ts-expect-error
  htmlAttributes
);
any instead of unknown
TypeScript's any gives you the freedom (that's generally bad) to do everything you want with a variable, while unknown forces you to strictly check the runtime value before consuming it. any is like turning off TypeScript, while unknown is like turning on all the possible TypeScript alerts.
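The difference can be sketched with two hypothetical functions that receive a value of uncertain shape:

```typescript
function shoutAny(value: any): string {
  // Compiles, but throws at runtime when `value` is not a string.
  return value.toUpperCase()
}

function shoutUnknown(value: unknown): string {
  // `value.toUpperCase()` would not compile here: the type must be checked first.
  if (typeof value === 'string') {
    return value.toUpperCase()
  }
  return ''
}
```

Calling shoutAny(42) passes the type check and fails only at runtime, while shoutUnknown cannot even be written without handling the non-string case.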
ESLint rules kept as warnings
ESLint warnings are useless, they only add a lot of background noise and they are completely ignored. Rules should be on or off, but never warnings.
Validating the external data
In the software world, the rule of "never trust what the frontend sends to the backend" is crucial, but I'd say that in a front-end application armed with TypeScript types, you should not trust any kind of external data. Server responses, query strings, local storage, JSON.parse, etc. are potential sources of runtime problems if not validated through type guards (read my Keeping TypeScript Type Guards safe and up to date article) or, even better, Zod parsers.
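As a minimal sketch of the type-guard approach (the UserPreferences shape and both function names are hypothetical; a Zod schema would replace the hand-written guard), here is how a value read from local storage or a query string could be validated before the rest of the app trusts it:

```typescript
type UserPreferences = { theme: 'light' | 'dark'; itemsPerPage: number }

function isUserPreferences(value: unknown): value is UserPreferences {
  if (typeof value !== 'object' || value === null) return false
  // Safe widening: every property is still `unknown` and checked below.
  const candidate = value as Record<string, unknown>
  const theme = candidate['theme']
  return (theme === 'light' || theme === 'dark') && typeof candidate['itemsPerPage'] === 'number'
}

function readUserPreferences(rawJson: string): UserPreferences | undefined {
  let parsed: unknown
  try {
    parsed = JSON.parse(rawJson)
  } catch {
    // Malformed JSON is treated like any other invalid external data.
    return undefined
  }
  return isUserPreferences(parsed) ? parsed : undefined
}
```

The consumer receives either a fully valid UserPreferences object or undefined, and is forced by the return type to handle the second case.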
React
HTML templating instead of clear JSX
JSX that includes a lot of conditions, loops, ternaries, etc. is hard to read and sometimes unpredictable. I call it "HTML templating". Instead, smaller components with a clear separation of concerns among them are a better way to write clear and predictable JSX.
Again, I touched on this topic in my How I ease the next developer reading my code article.
Lots of React hooks and logic in the component's code
I'm a great fan of hiding the React component's logic in custom hooks whose names clearly indicate their scope, and then consuming them inside the component. The reason is always the same: long code before the JSX makes reading the JSX harder.
Components accepting className
Components are designed to encapsulate and hide some logic and give the external world the result of this logic. Their UI is part of what is encapsulated, and the consumer should not be able to change it. Usually, components also accept className to allow consumers to customize small parts of the component's UI (this is the initial goal). Instead, the result is an uncontrolled and hard-to-predict backdoor that lets anyone override all the UI details of the components and their children in seconds.
Like all the JavaScript details, the styling details should be encapsulated and hidden, exposing only some generic configurations to the consumer. These configurations explicitly mark what the component offers and what the consumer wants to obtain (like variants, type, mode, etc.).
As a refactorer, when you see that a component accepts className, you already know how hard your life will be.
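A sketch of what such explicit configurations could look like, reduced to plain TypeScript so that it runs without React. The ButtonProps type, the option names, and the CSS class names are all assumptions made for the example.

```typescript
// Hypothetical Button API: consumers choose among documented options,
// and there is no className escape hatch.
type ButtonProps = {
  variant: 'primary' | 'secondary'
  size: 'small' | 'large'
}

// The mapping from options to CSS classes stays private to the component.
function getButtonClassName({ variant, size }: ButtonProps): string {
  return `button button--${variant} button--${size}`
}
```

Searching the codebase for variant: 'secondary' tells the refactorer exactly who relies on which look, something that is impossible when every consumer passes arbitrary class names.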
Hide stores implementation details
Something I saw working like a charm on WorkWave RouteManager is hiding the stores under modules that export React-only, store-independent APIs. We started using Recoil way before the rest of the world, and later on, we migrated to Valtio because it better covered our needs. The migration was painless because Recoil was just an implementation detail of modules that export pure React APIs like useSelection.
Consuming Swiss army knives
Some React components are generic by design due to the huge number of cases they manage (think of a table, a date picker, a modal, etc.). This makes it hard to track what consumers need out of the 100 features those generic components offer. As a result, refactoring those components or the consumers is hard. My suggestion is to create intermediate and vertical components that act as proxies for the more complex ones.
The vertical components' names and descriptions allow the reader to understand what they do and need without digging into the details of how the original complex components are consumed (for instance: a UserList that uses just the sorting options of Table is clearer than digging into how the Tutors, Students, and Managers pages use Table).
Tests
Bad tests
As a test lover and instructor (I teach about front-end testing at private companies and conferences) I can say that bad tests are the result of lacking experience on this topic, and the only solution is help, mentoring, help, mentoring, help, mentoring, etc.
Anyway, the false confidence that tests can offer is a big problem in every codebase.
I suggest reading two of my articles:
- From unreadable React Component Tests to simple, stupid ones
- Improving UI tests' code with debugging in mind
E2E tests everywhere
E2E tests do not scale well because of the need for real data, a real back-end, etc.
From this perspective, Preply is a great example (and the only successful one I saw in my working life) of what you can achieve when the user experience is considered crucial by the leadership: E2E tests are mandatory, and the strong Continuous Delivery approach led to 98%+ stability of the E2E test suite, which ensures that a lot of happy paths are always functional.
Also, in this case, I suggest reading some of my articles:
- Decouple the back-end and front-end test through Contract Testing
- Improving UI tests' code with debugging in mind
- Front-end productivity boost: Cypress as your main development browser
Developer Experience
Deprecated APIs
When code is marked as @deprecated, the IDE shows it with a strikethrough and presents the documentation, helping the developers realize that they should not use it.
An example:
/**
* @deprecated Please use the new toast API /new-components/Toasts/hasuraToast.tsx
*/
export const showNotification = () => { /* ... */ }
Care about the browser logs
Console warnings (coming from ESLint, from TypeScript, from React, from Storybook, etc.) add a lot of background noise that mixes with the important logs you want to trace. Take care of them and remove them, so that developers do not end up ignoring your own important alerts because of the high noise.
Developer alerts for unexpected things
Runtime things (e.g., server responses) might not be aligned with the front-end types. If you do not want to break the user flow by throwing an error, at least track the error through something that can alert you about it (like Sentry, or whatever other tool), so that little time passes between the error coming out and you fixing it.
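A sketch of this idea follows. ErrorReporter stands in for whatever monitoring tool is in use (it is not a real Sentry API), and the Tutor shape, the parseTutor function, and the fallback value are hypothetical.

```typescript
// Stand-in for the monitoring tool in use (Sentry or similar); not a real library API.
type ErrorReporter = (error: Error, context: Record<string, unknown>) => void

type Tutor = { id: string; name: string }

function parseTutor(response: unknown, reportError: ErrorReporter): Tutor {
  if (typeof response === 'object' && response !== null) {
    const candidate = response as Record<string, unknown>
    const id = candidate['id']
    const name = candidate['name']
    if (typeof id === 'string' && typeof name === 'string') {
      return { id, name }
    }
  }
  // Do not break the user flow, but make sure someone gets alerted.
  reportError(new Error('Unexpected tutor response shape'), { response })
  return { id: 'unknown', name: 'Unknown tutor' }
}
```

The user keeps a working (if degraded) page, while the team learns about the misaligned response from the alert instead of from a customer complaint.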
React-only APIs
If you are creating an internal library, prefer to expose only React APIs. The big advantage is that you count on React's reactivity system, and managing dynamic/reactive cases in the future will be easier because you are sure the consumers of your React APIs are re-rendered for free and always deal with fresh data.
Non-straightforward CI scripts
CI pipelines should just launch scripts present in the package.json without additional logic that increases the cognitive load and makes it harder to replicate errors locally or in another environment. Think about the painful process of trying to decipher what a CI step does in order to replicate it locally and dig into the root cause of an issue. Maybe CI uses a tool you do not master, maybe CI uses a particular configuration, and all of this leaves you depending on other colleagues/teams who own everything CI.
The CI pipeline should only care about setting up everything with the correct Node.js version (set by the frontenders who maintain the codebase) and launching some CI-dedicated scripts (for instance: ci:lint, ci:build, ci:ts-check, ci:test:unit, ci:test:e2e, etc.). This decouples the CI pipeline from the details of the scripts, which stay in the hands of those who know the JS ecosystem best, making everything simpler.
Credit where credit is due
Thank you so much to M. Ronchi and N. Beaussart for teaching me so many important things in the last few years ❤️ a lot of content included in this article comes from working with them on a daily basis ❤️