How many Harry Potters can your team maintain?
detoix
Posted on October 24, 2023
Talking to the business is hard. Convincing them that your application requires time for refactoring or maintenance is even harder. I've been there and I want to share with you an interesting metric that bridges business people and engineers.
A couple of months ago I was assigned the task of explaining why we need time for refactoring and why we can't just remove the code we don't need. It was a struggle because I didn't know how to quantify the system we had. How do I tell the business people how much cognitive effort we put into just understanding what's under the hood of a running system?
And I came up with an idea to compare it to a book. Simple as that: every line of code takes time to read and understand, just as every line of a book.
How much is a Harry Potter?
I chose the Harry Potter series because it's well-known (it eventually turned out to be a bull's eye). I found a couple of sources that claim that
the Harry Potter books contain 1,084,170 words
Yeah, maybe. My concern was to find out the line count.
Why lines?
A line of code is the entity a developer works with in the first place. Also, it is easy to imagine a line of written text. My assumption here is that every word roughly corresponds to a single token in code (variable, assignment, invocation), because it represents a single thing to understand in the bigger context.
So I decided to find out for myself how many lines there are in the Harry Potter books. The results are as follows.
Title | Word count | Line count |
---|---|---|
Harry Potter and the Philosopher's Stone | 76,944 | 7,890 |
Harry Potter and the Chamber of Secrets | 85,141 | 8,695 |
Harry Potter and the Prisoner of Azkaban | 107,253 | 11,190 |
Harry Potter and the Goblet of Fire | 190,637 | 19,077 |
Harry Potter and the Order of the Phoenix | 257,045 | 23,004 |
Harry Potter and the Half-Blood Prince | 168,923 | 13,454 |
Harry Potter and the Deathly Hallows | 198,227 | 20,630 |
Total | 1,084,170 | 103,940 |
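The exact counting method isn't shown here, so the following is only a minimal sketch of how such numbers could be gathered. It assumes each book is available as a plain-text file with its line breaks preserved, that blank lines are skipped, and that a word is anything separated by whitespace. The function names and the sample text are hypothetical, and the line count of a real book will depend on how the particular edition wraps its text.

```python
from pathlib import Path


def count_words_and_lines(text: str) -> tuple[int, int]:
    """Return (word_count, line_count) for a text, ignoring blank lines."""
    # Assumption: a "line" is any non-blank line as wrapped in the source text.
    lines = [line for line in text.splitlines() if line.strip()]
    # Assumption: a "word" is any whitespace-separated chunk.
    words = sum(len(line.split()) for line in lines)
    return words, len(lines)


def count_file(path: str) -> tuple[int, int]:
    """Same count for a plain-text file on disk (the path is hypothetical)."""
    return count_words_and_lines(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Neutral sample text: three non-blank lines, nine words.
    sample = "The quick brown fox\njumps over\n\nthe lazy dog\n"
    words, lines = count_words_and_lines(sample)
    print(f"{words} words in {lines} lines")
```

Counting tokens in source code needs a different splitting rule than whitespace, so this sketch only covers the book side of the comparison.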
I double-checked and confirmed these numbers: roughly 100k lines in the whole series. Now, how does it compare to actual code? I used some of my one-liners to get data about popular repositories.
Repository | Token count | Line count |
---|---|---|
Autofac | 152,249 | 42,137 |
three.js | 1,614,528 | 272,856 |
pandas | 1,804,215 | 457,886 |
Total | 3,570,992 | 772,879 |
Comparing the two tables, the ratios are very different. Whereas the novels have about ten words per line, the code I pulled has, on average, fewer than five tokens per line.
These two numbers are difficult to compare, but I would argue that software code requires significantly more cognitive effort to understand. Remember that we're doing this to establish some common ground between business and software people, so these numbers don't have to be equal.
This is the part where you can disagree, but to keep the calculations simple I claim the following.
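To make the comparison concrete, here is a small sketch that derives the per-line ratios from the totals in the two tables. The dictionary layout and function name are just illustrative choices; the figures are the ones reported above.

```python
# (units, lines): words for the books, tokens for the repositories.
# All figures are taken from the two tables in this post.
BOOKS = {"Harry Potter series": (1_084_170, 103_940)}
REPOS = {
    "Autofac": (152_249, 42_137),
    "three.js": (1_614_528, 272_856),
    "pandas": (1_804_215, 457_886),
}


def units_per_line(units: int, lines: int) -> float:
    """Average number of words (or tokens) on a single line."""
    return units / lines


def combined_ratio(entries: dict[str, tuple[int, int]]) -> float:
    """Ratio computed over the summed totals, not the mean of the ratios."""
    total_units = sum(units for units, _ in entries.values())
    total_lines = sum(lines for _, lines in entries.values())
    return units_per_line(total_units, total_lines)


if __name__ == "__main__":
    for name, (units, lines) in {**BOOKS, **REPOS}.items():
        print(f"{name}: {units_per_line(units, lines):.2f} per line")
    print(f"All repositories combined: {combined_ratio(REPOS):.2f} per line")
```

Note that the "fewer than five" figure holds for the three repositories combined; taken individually, three.js comes out closer to six tokens per line, while Autofac and pandas are both below four.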
1 Harry Potter (the whole series) ~ 100k lines of code
With this in mind I literally put it on a PowerPoint slide and said that
we currently maintain an equivalent of X Harry Potter series with a team of Y engineers and that's why we have to simplify the system
and I think I made my point. If you're reading this for the first time and this idea is fresh in your mind, try to imagine being a business professional and learning this from a software engineer. Is it intelligible? Does it help you understand the situation?
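If you want to fill in the X and Y for your own slide, the arithmetic is a one-liner. The sketch below uses the rounded figure of 100k lines per series from this post; the function names and the team size of eight engineers are made up for illustration, and the line count is the pandas figure from the table above.

```python
# Rounded figure from this post: the whole series is roughly 100k lines.
LINES_PER_HARRY_POTTER = 100_000


def harry_potters(lines_of_code: int) -> float:
    """Size of a codebase expressed in Harry Potter series."""
    return lines_of_code / LINES_PER_HARRY_POTTER


def harry_potters_per_engineer(lines_of_code: int, engineers: int) -> float:
    """How many Harry Potter series each engineer has to keep in their head."""
    if engineers <= 0:
        raise ValueError("engineers must be a positive number")
    return harry_potters(lines_of_code) / engineers


if __name__ == "__main__":
    loc = 457_886   # pandas line count from the table above
    team_size = 8   # hypothetical team size, for illustration only
    print(f"We maintain {harry_potters(loc):.1f} Harry Potter series "
          f"with a team of {team_size} engineers "
          f"({harry_potters_per_engineer(loc, team_size):.2f} each).")
```

Rounding to one decimal place is deliberate: the metric is meant as a conversation starter, not a precise measurement.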
Such comparisons, the Harry Potter Metric in particular, can be a novel approach to bridging the gap between business professionals and engineers; it's simple to calculate and easy to understand for both parties.
So, how many Harry Potters can your team maintain?