The Lindy Effect: Explaining the Longevity of Legacy Software

Tobias van der Werff

Posted on August 5, 2024

"Wanted: people who want to learn a programming language from 1959," headlined a recent Dutch newspaper article. The language in question is COBOL, a language so old that most of the programmers who know it are either dead or retired. Nonetheless, banks, government institutions, and other large organizations still rely on this ancient language for some of their most critical infrastructure.

How is it possible that a language like COBOL is still used today? What has prevented these organizations from upgrading their software several decades ago? I recently discovered a useful mental model for thinking about this: the Lindy effect.

The Lindy Effect

The Lindy effect states that the longer an idea or technology has survived, the longer it is likely to stay alive. Nassim Taleb highlights this concept in his book Antifragile, where he explains that for non-perishable things like ideas, music, and technology, age is a sign of robustness, which in turn is a positive indicator of future longevity.

“If a book has been in print for forty years, I can expect it to be in print for another forty years. But, and that is the main difference, if it survives another decade, then it will be expected to be in print another fifty years. This, simply, as a rule, tells you why things that have been around for a long time are not “aging” like persons, but “aging” in reverse. Every year that passes without extinction doubles the additional life expectancy. This is an indicator of some robustness. The robustness of an item is proportional to its life!” -- Nassim Taleb

The typical framing of the Lindy effect is that those ideas that endure over time likely do so because they retain their usefulness, and therefore have long-term value.
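To make the arithmetic in Taleb's book example concrete, here is a minimal Python sketch. It assumes that lifetimes follow a Pareto (power-law) distribution, which is one common way to model the Lindy effect and not something the quote itself specifies. Under that assumption, something that has already survived to age t has an expected remaining lifetime of t / (alpha - 1). The function name and the default alpha = 2 are illustrative choices, picked because alpha = 2 reproduces the "forty more years at forty, fifty more at fifty" figures.

```python
def expected_remaining_years(age_years: float, alpha: float = 2.0) -> float:
    """Expected additional lifetime for something that has survived age_years.

    Assumption (not from the article): lifetimes are Pareto-distributed with
    shape parameter alpha > 1, so the conditional expectation of the
    remaining lifetime is age_years / (alpha - 1).
    """
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1 for a finite expectation")
    return age_years / (alpha - 1)


# The book example from the quote, plus a language from 1959 viewed in 2024.
for age in (40, 50, 2024 - 1959):
    print(age, expected_remaining_years(age))
```

With alpha = 2 this gives 40.0 and 50.0 for the book, and 65.0 for COBOL. A larger alpha would model a less robust category of things, where the expected remaining lifetime grows more slowly with age.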

The Lindy Effect Applied to Software

The Lindy effect provides an interesting framework for understanding the longevity of software. In particular, I'd like to discuss the rise and decline of programming languages as a manifestation of a similar effect. Why do some programming languages survive, and others don't? For example, C and Pascal are both programming languages that originated in the 1970s. However, C remains widely used today, while Pascal has largely fallen out of favor.1

The typical Lindy explanation for this discrepancy relates to a difference in quality. The fact that C has survived this long and is still widely used indicates its durability and lasting value. By contrast, Pascal has faced more issues and has largely been replaced by more modern languages.

But quality is not the only reason why programming languages survive. Once a language gains sufficient traction, its existence can become self-reinforcing. Some software gets ingrained deeply enough into the infrastructure of a society to practically guarantee its future existence. The obsolescence of these critical pieces of software will only occur if there is a conscious and deliberate effort to replace them, which becomes less likely over time.

It helps to understand that software derives much of its greatest strength from its ability to be layered on top of other software. Over time, software becomes increasingly powerful, and increasingly complex. This is generally a positive thing. For instance, a video game programmer does not need to write their own physics simulation -- they can use Unreal Engine. Similarly, a web developer can use a framework like React or Django to speed up their work.

However, there is a clear downside to this increasing complexity that relates to critical points of failure. Modern software is like a tower of stacked dependencies, and the oldest software tends to be at the bottom. As we create more software, the impact of the software at the bottom grows over time, as does the impact of changing it. This is the kind of software that becomes a critical point of failure for much of our digital infrastructure, where the smallest bugs or security flaws can have a gigantic ripple effect. Recent examples of this include the Log4j vulnerability from 2021, as well as the faulty CrowdStrike update from July 2024 that caused global outages.

[Figure: xkcd, "Dependency"]

Over time, the effect of changing this kind of software becomes so large that making any serious modifications can be practically impossible. The massive complexity, scale, and interdependencies involved lead to a growing inertia -- a resistance to change the existing software. A financial institution like a bank will think twice before touching the software that handles all of their financial transactions (written in COBOL) -- the consequences of a mistake are simply too great.2
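The "tower of stacked dependencies" can be sketched with a toy dependency graph. The package names below are hypothetical and the graph is made up for illustration; the point is only that a component near the bottom has far more transitive dependents than one near the top, so a change to it can reach much more of the system.

```python
from collections import defaultdict, deque

# Hypothetical graph: each key depends directly on the packages in its list.
DEPENDS_ON = {
    "online_banking": ["payments_api", "web_framework"],
    "mobile_app": ["payments_api"],
    "payments_api": ["ledger_core"],
    "web_framework": ["logging_lib"],
    "ledger_core": ["logging_lib"],
    "logging_lib": [],
}


def transitive_dependents(graph: dict[str, list[str]], target: str) -> set[str]:
    """Return every package that depends on target, directly or indirectly."""
    # Invert the edges so we can walk from a package to the things built on it.
    dependents = defaultdict(set)
    for package, deps in graph.items():
        for dep in deps:
            dependents[dep].add(package)

    # Breadth-first search upward through the tower.
    seen: set[str] = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in dependents[current]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


for name in DEPENDS_ON:
    print(name, len(transitive_dependents(DEPENDS_ON, name)))
```

In this toy graph, `logging_lib` at the bottom affects all five other packages, while `online_banking` at the top affects none. Every new package added on top only increases the count for the ones underneath it.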

So if we think back to the example of COBOL, and why it is still being used to this day, we can understand why: It is too deeply ingrained into our existing infrastructure. Lots of dependencies and complexity have formed over time, which means that as a programmer, you just might crash the global financial system if you forget to add a period somewhere.3

Bad Design Ossifies Software

Poor software design makes the problem of dependencies worse. There is a reason why software engineering best practices advocate for things like modularity and separation of concerns: it minimizes dependencies across a codebase. Problems arise when software modules make lots of assumptions about other modules they interact with, creating rigidity and an inability to change anything without lots of unforeseen consequences.
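As a small illustration of what "fewer assumptions about other modules" can look like, here is a hedged Python sketch. All names (`Ledger`, `InMemoryLedger`, `can_withdraw`) are hypothetical. The calling code depends only on a narrow interface, so the implementation behind it could be swapped (say, from an old system to a new one) without touching the caller.

```python
from typing import Protocol


class Ledger(Protocol):
    """The only thing callers are allowed to assume about a ledger."""

    def balance(self, account_id: str) -> int:
        """Return the balance in cents for the given account."""
        ...


class InMemoryLedger:
    """One possible implementation; callers never touch its internals."""

    def __init__(self, balances: dict[str, int]) -> None:
        self._balances = dict(balances)

    def balance(self, account_id: str) -> int:
        # Unknown accounts are treated as empty in this sketch.
        return self._balances.get(account_id, 0)


def can_withdraw(ledger: Ledger, account_id: str, amount_cents: int) -> bool:
    """Depends on the Ledger interface, not on how balances are stored."""
    return 0 < amount_cents <= ledger.balance(account_id)


ledger = InMemoryLedger({"alice": 500})
print(can_withdraw(ledger, "alice", 300))
print(can_withdraw(ledger, "alice", 600))
```

If `can_withdraw` instead reached into `ledger._balances` directly, every caller written that way would have to change whenever the storage changed, which is the kind of rigidity described above.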

This usually follows a common pattern. Someone creates a temporary fix (hack), followed by lots of changes built on top of the original fix, making the original fix increasingly hard to remove. Organizations that take software health seriously eventually service this kind of technical debt and resolve the problematic dependencies. But organizations that don't do this accumulate lots of problematic dependencies over time. Eventually, their codebase is such a jumbled mess of spaghetti code that no one dares change it (after all, who knows what will happen?).

Even well-designed software can be difficult to replace. Translating a codebase from one programming language to another leaves plenty of room for new and unforeseen bugs. This is the kind of thing that banks (rightfully) fear when they consider replacing their financial transactions software.

Old Software Ain't So Bad

Old software isn't inherently a bad thing and can provide lots of value. For instance, the older a piece of software is, the more likely it is that mistakes and bugs have been removed from it. Furthermore, new technology does not necessarily perform better than old technology -- in fact, the opposite often holds true in software.

For example, CPython, the reference implementation of Python, is written in C. Although C is, by modern standards, a horrible language that no one should use, it is at least fast. As long as there are C programmers around who can maintain the old infrastructure, there isn't much of a problem. Similarly, Fortran is still used in numerical analysis software because it is highly efficient for those kinds of workloads. The same goes for COBOL: it is highly optimized for financial transaction systems.

Instead, old software becomes problematic when it becomes increasingly hard to maintain over time. In the case of a language like COBOL, there are fewer and fewer engineers who can understand and modify the existing software. This reinforces the Lindy effect -- no engineers are available to replace the old software because the engineers that are available need to maintain it.

Conclusion

I believe that the rise and decline of programming languages is a natural phenomenon. Priorities change, paradigms shift, and this is reflected in the languages we use as software engineers. However, the languages that stick around are worth studying. Sometimes, their durability is merely a sign of technological inertia and complicated dependencies. Other times, a more positive version of the Lindy effect may be at play -- their durability might hint at design principles that are worth preserving.

  1. C is still ranked the 3rd most popular programming language at the time of writing.

  2. On a related note, this is also why it's so hard to upgrade the electric power grid, or why we are still using the QWERTY keyboard layout despite the existence of more efficient layouts.

  3. Needless to say, you wouldn't push straight to production for any kind of serious infrastructure software.
