A Journey to Prevent the Collapse of the Web

DigitalCrafting

Posted on January 30, 2024

This article is an introduction, or rather a starting point, for a tree of articles that will (hopefully) explain how software actually works from top to bottom.

Inspiration

As some of you have probably guessed, the title of this article is inspired by Jonathan Blow's talk Preventing the Collapse of Civilization / Jonathan Blow (Thekla, Inc) on YouTube.

I highly suggest you watch it, even if you've seen it before, because Jon explains everything much better than I ever could in an article.

Here is the gist of it: civilizations collapse when knowledge is no longer transferred between generations. Younger people stop understanding how the things they use actually work, and when those things break, they are unable to fix them, and so the collapse happens.

Jon also gives a number of examples of software failing as proof that something is wrong with software development nowadays, and they are spot on.

The second talk that inspired me is The Thirty Million Line Problem by Casey Muratori (mostly the first half hour, actually).

In this one, Casey focuses on where the current problems with software come from, giving an excellent hypothesis on the origin of said problem.

The gist of it being: when USB and "Plug and Play" were introduced, it became the OS's responsibility to handle all the different kinds of devices that could now be plugged into the computer. That caused an exponential increase in the lines of code in the OS, which spilled over into other areas, and now people don't understand what is happening in those systems, as there is simply too much code for anyone to reason about. They can reason about part of the code, but it's unlikely they can reason about everything. And if you can't reason about it, it's effectively a magical black box that hopefully works fine if you just leave it alone.

These two videos sparked my interest and pushed me into learning in depth about how software actually works and how we can minimize the number of Lines Of Code (LOC), which I would like to share with you in future articles.

The Problems we face

Moore's Law is dead

The problems with transfer of knowledge and the number of LOC come from the fact that we programmers became too comfortable writing code that is nowhere near what actually happens on the CPU, and the reason for that is Moore's Law.

In short (and simplified): the number of transistors on a chip doubled every 2 years or so, which for a long time meant faster processors. RAM improved too, and we could get away with moving further and further from the actual hardware, because it could compensate for the abstraction. And so, you didn't have to allocate memory manually - there's a VM and Garbage Collection. Then you didn't even have to worry about types - you have dynamically typed languages like Python and JS.

Here's the thing though: this law is dead, and has been for quite some time. Single-core speed is barely improving anymore. To battle that, CPUs got more cores, but each core is not much faster. And since most programs are still single-threaded, that doesn't help anyway. However, it looks like software development hasn't caught up with that fact. We keep creating more and more complicated frameworks, which abstract away many problems but introduce other ones. The software is not getting better. In fact, we might be regressing, not in every area, but in most of them.

Websites are slow as hell, backend systems fail with cryptic messages, Windows Update forces itself onto you, "smart" phones get slower with each update until you have to buy a new phone, and the same goes for "smart" TVs. And these are just the things you have probably experienced yourself. There are also failures we will never hear about, because they're just too embarrassing for the companies, like failures in plane software.

It's also kind of interesting that we all experience those things, but we're so used to them that we just go "Eh, let's buy a newer, faster phone/laptop/PC", "Let's refresh the page and try again", "Let's restart the system", and I was like that too. It wasn't until I saw Jon's video that I actually stopped, thought about it, and decided that it's actually pissing me off quite a bit.

Companies value speed

Another problem is that companies, especially startups, value speed over quality, at least until it comes back to bite them in the behind. And it will come back to bite them, but by then it's usually too late to quickly switch framework or programming language.

It might not be apparent right away. After all, you build a new app and it's great - it's fast, easy to manage, what's not to love?

Well, until it grows and you need to scale up. Suddenly, Python or JS is not enough, but you are not going to rewrite it in C or Go, right? Microservices are a solution for scaling, or at least that's what you were taught so far, so you do that, and then it becomes a nightmare to manage. You introduce Docker and K8S, you suddenly lose the ability to debug the whole process easily, changing anything at all becomes super hard, and the spiral of death starts.

And it's not my imagination, I've seen this. This is how it goes when you start with a programming language that is easy and quick to write in, and the app grows faster than you anticipated.

Casey Muratori, in yet another video, gives examples of companies that actually cared. However, it was usually a large company that could afford it.

The Stack

There is a quote I once read and it stuck:

Fullstack is not an achievement, it's a minimum f*****g requirement

And I think it should be a standard, but how can it? The minimum requirement for a job went from algorithms and a programming language (+ CSS in case of frontend) to a whole stack of stuff built on top of other stuff. Let's try to list what a Fullstack position would require nowadays, shall we?

Angular/React, TypeScript, NodeJS, UI library, WebSocket, REST, gRPC, Java, Spring Boot/Micronaut, JPA/Hibernate, SQL database, NoSQL database, Docker, Kubernetes, AWS/Azure/GCP...

And probably something I forgot about. That's around 15 things to grasp. Even if you combine backend and frontend by using just JavaScript, it's not feasible for anyone to learn anything deeper than just using those things in less than a few years. And I'm not even talking about memory management. I'm talking about things like all the possible HTTP headers you can use and what you would do with them, or lesser known classes in vanilla Java.

Getting the knowledge is kind of hard

The videos and articles I usually see on YouTube or LinkedIn target Junior Devs. It seems like every month I see the same articles shared again and again: "How to create CRUD application in Spring Boot", "How to handle errors in Spring Boot" and similar ones for other frameworks/languages. How many times are we going to repeat the same stuff, instead of going forward?

Don't get me wrong, there are people with the knowledge who are willing to share it. There are videos, articles and books that explain how software works. However, what I found is that there is no single resource that explains it in relatively simple terms, from top to bottom. Meaning, if I write code a certain way, how does it look on the CPU and in memory? All the resources I found either explained only one small part and left it at that, or were full-fledged books with intricate details right from the start, which took way too much time to go through for someone like me, who likes to learn by example, get a general understanding and get the details later.

So, what now?

Well, first of all, start with yourself. The title of this article doesn't contain the word "journey" just for show. All the issues that I listed bothered me greatly. The videos I posted at the beginning launched me on a journey to understand how software works, and the last problem I listed gave me the idea to create such a resource myself.

I have already implemented a simple language parser, interpreter and VM. I wrote a disassembler for 8086 assembly and a very simple 8086 CPU simulator, and tried to learn many more things, which was quite a journey. And it's still going.

What I found out is:

It's way easier than you think

It's just a bit tedious sometimes, and developers hate tedious. But it's also rewarding: by understanding how things work, you will be able to make better decisions.

A quick example: LinkedList vs ArrayList in Java. Which one to use? The textbook answer would be that LinkedList is better if you add and remove lots of elements, while ArrayList is better if you mostly iterate over it. However, that answer only takes Big O notation into consideration and not what actually happens on the CPU. If you dig a bit deeper, ArrayList is usually the best solution, and it's because of how the CPU accesses data: ArrayList keeps its element references in one contiguous array, which plays well with CPU caches, while LinkedList nodes are separate objects scattered across the heap.
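To make the comparison above a bit more concrete, here is a minimal sketch that fills both list types with the same numbers and times a single traversal of each. The class and method names (`ListIterationSketch`, `fill`, `sum`, `timeSumNanos`) are made up for this illustration, and the timing is deliberately crude: a trustworthy measurement would need a proper harness such as JMH. Also note that an `ArrayList<Integer>` stores references to boxed `Integer` objects, so even the array-backed list is not perfectly contiguous.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ListIterationSketch {

    // Appends the integers 0..n-1 to the given list and returns it.
    public static List<Integer> fill(List<Integer> list, int n) {
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Walks the list once via its iterator and sums the elements.
    public static long sum(List<Integer> list) {
        long total = 0;
        for (int value : list) {
            total += value;
        }
        return total;
    }

    // Crude wall-clock timing of one traversal (not a real benchmark).
    public static long timeSumNanos(List<Integer> list) {
        long start = System.nanoTime();
        long result = sum(list);
        long elapsed = System.nanoTime() - start;
        // Use the result so the traversal cannot be optimized away.
        if (result < 0) {
            throw new IllegalStateException("unexpected negative sum");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 1_000_000; // arbitrary size chosen for illustration
        List<Integer> arrayList = fill(new ArrayList<>(), n);
        List<Integer> linkedList = fill(new LinkedList<>(), n);

        // A few warm-up passes so the JIT has a chance to compile sum().
        for (int i = 0; i < 5; i++) {
            sum(arrayList);
            sum(linkedList);
        }

        System.out.println("ArrayList  traversal (ns): " + timeSumNanos(arrayList));
        System.out.println("LinkedList traversal (ns): " + timeSumNanos(linkedList));
    }
}
```

Both traversals are O(n), so Big O alone cannot tell them apart. Any difference you see comes from memory layout and caching, and the actual numbers will vary with the machine, the JVM and the list size.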

Performance awareness

There are a number of resources you can use to learn about the internals of programming, but the one that gave me the most is this one: Performance-Aware Programming Series by Casey Muratori. It's unfortunately paid, but the content is well worth the money, especially because Casey gives a very detailed explanation of how the CPU works and gives excellent examples to get you started.

I actually learned 8086 assembly from this course, and it was super easy thanks to Casey's explanations.

The articles

I will try to write a number of articles which, hopefully, will provide people with useful insights into how software works and how to use this knowledge in everyday programming life.

I will update this section with articles as I go. There will be a few different topics, including:

  • How programming works top to bottom
  • How to implement stuff from scratch (HTTP Server, DB, etc)
  • Performance benchmarks and explanations of different structures/programming styles (this one in Java mostly)

As I am a Java and Angular developer mostly, most of the code will be written in, well, Java and JS :) with some addition of C, C++ and Assembly when needed.

Also, note that my explanations will contain many simplifications. It's not feasible to write every single detail accurately, and it's not even that useful, since we tend to forget details anyway. I'm aiming these articles at web developers, both backend and frontend, so that they can get a better understanding of what they're actually writing, and maybe think about whether it's worth it before jumping onto the next new and shiny thing :)

Happy reading! :)

Java Benchmark Adventures

Programming from Top to Bottom
