Learning to think small

splrk

Splrk

Posted on April 13, 2020

Learning to think small

I've been writing code professionally for the last 12 years. Most of that time was spent without a mentor or code reviews. Maintaining software this long exposed a lot of erroneous practices and habits. I'm taking time to write these lessons learned in a series of articles.

Background

I'm currently refactoring a project that received a lot of praise upon release. Long story short, me and one other developer took version 1 of a project written by a previous developer and rewrote much of the code for version 2. The second version performed better and was more responsive. I felt proud of the feedback. However, I received no feedback on the code itself, just the end product.

Fast forward nearly two years and the project desperately needs some updates. I am now the sole maintainer and the application hasn't changed much. It's possible parts of the code will merge with another very similar project so the code needs to be maintainable for future developers. I made two major mistakes that hinder this goal, one past and one present.

Breaking things down

When we wrote version 2 we did so in a bit of a rush. Mainly because this was already a working product and needed fixes for some major issues if it was to continue being used. We needed version 2 quickly and the nature of the problems required rewriting large parts of it. As a result I wound up cramming business logic, I/O and data queries into the same module. I now hate my past self.

Take the example of a file upload queue. We were experiencing some inconsistency with uploading more than one file at a time to our digital asset server. This led to a requirement to restrict the app to upload only one file at a time.

The module responsible for this was over 1,000 lines long - just for a file upload queue. Here's a sampling of the function names

...
function startFileLoop();
function buildTempDirectory();
function packageZip();
function upload();
function saveToDisk();
function checkNetworkResponse();
...

Furthermore, this module was named Files. There is no clear indication of what this module handles. Is it a queue? Is it network I/O? Is it disk I/O? Unfortunately, this is not the only module written this way.

I wanted to do too many things at once. I took a high-level requirement and tried to code it all at once in the same module. I was thinking too big failing to break the problem into smaller, manageable chunks. Every new function would be dumped into an already existing file, sometimes appended other times just inserted wherever the cursor sat. This led to spaghetti code that became horrible to manage. To find a bug I'd have to sift through a huge module that had no clear organization. Sometimes I'd find a function I thought dealt with the problem only to find it called another function with a similar name.

Secondly, this code was virtually impossible to test. This module did too many things. If I were to start writing tests, I'd have test modules probably 4 times as long. I wasn't thinking ahead, but rather got sucked into the rush of trying to publish.

In short, I created large modules and mixed concerns in my code. This made the code

  1. Hard to debug
  2. Hard to test

What I wish I had done instead

Let's go back to the example of the file queue. Instead of jamming so many functions into the same module, here is the approach I took while refactoring.

I turned the high level requirement (Send files one at a time allowing the user to append files to the queue) and broke it down to it's basic functions. I needed the following modules:

  1. actionQueue.js - A Queue of Promises.
  2. api.js - A wrapper for calling the asset management API.
  3. filePrep.js - A Disk I/O caching mechanism.
  4. uploadQueue.js - A singleton instance of actionQueue package and sends files to the asset management API
  5. filesModel.js - An object that represents the file meta data required by the asset management system.

Now I have 5 different modules that are testable and have a single responsibility. When I have a bug, I first go to the module that is most likely the problem and start there.

Secondly I can easily start writing tests for these separate pieces of the application. This way when I start adding features or fixing bugs, I can make sure I don't break existing functionality without booting up the entire app.

Learning to Refactor well.

The second mistake I made was more recent. When I finally sat down to do some refactoring, I was horrified. I decided this code needed to be cleaned up and right away. I jumped into the source and began ripping apart one of the old modules and creating new ones.

Over a month later I made my first commit on my new branch. Not my first merge, my first commit. GitLab summarized the commit:

Showing 54 changed files with 2575 additions and 2143 deletions

Finally, about a month and half later I merged my refactor with the following summary:

Showing 73 changed files with 3267 additions and 2839 deletions

I took nearly two months refactoring one module. That's two months of coding that provided no value to the end user. What's worse is that I had no tests written and therefore no quick way of verifying that new code accomplished the same things as the old code. Although clean code is important for making future improvements, I let it become too important. This led to large code changes, no tests and delayed releases.

What I wish I had done instead

Instead of trying to refactor everything haphazardly in one commit, I should have worked on things in smaller chunks using the following steps to guide me.

  1. Decide on a single piece of functionality to refactor
  2. Write integration and / or functional tests that pass in the current code base.
    • Try to write these so that the refactor does not require rewriting the tests
    • Don't write unit tests. This will create a lot of unnecessary work during the refactor.
  3. Refactor the code, ensuring all tests pass.
  4. Optionally add a new feature with a test for it.

This approach was greatly inspired by Your code might be gross for a reason from the podcast J.S. Party. If you are considering a refactor or even in the middle of one, I suggest you check out that episode.

Think small

At the end of the day, both mistakes happened because I tried to work on too much at the same time. I thought in big-picture terms not in small, easy to understand bites. This led to hard-to-test code and extra work that provided no benefit to the end user.

When writing code I now try breaking things down into the smallest possible chunks. I do my best to focus on one thing at a time. Here are a couple of principles I use to help keep me on track:

  1. At least one commit per day,

    I regret trying to refactor a 1,000 line module in a single commit. Making a goal to commit the code I worked on during the day encourages me to limit the scope of my work. It helps me set smaller goals therefore fine tuning my focus for the day. It also feels great to make a commit and keeps up my spirits and motivation.

  2. No more than 3 days before merging a feature branch.

    This also helps me limit the scope of my work. Just because I have small commits doesn't mean that I'm not thinking too big. Adding a new feature, or refactoring a feature shouldn't take more than a couple of days. This helps me not dwell on something to long. It discourages distraction from other parts of the app that need changing. Even if the feature is large and needs hundreds of lines of changes, its possible to break that feature into smaller parts that you can refactor in separate branches. If you need some ideas for how to do this, check out the Mikado method to start.

  3. Write down distractions.

    One of the things that bloats my commits is distractions from tangentially related code. I often stop what I'm working on and refactor that piece right away. This can happen multiple times per feature. When I recognize my brain doing this this, I ask myself: 1) Is this part of or just related to the feature I am working on? 2) Is it a minimum requirement to change or add this thing?. Most of the time, its loosely related and / or it's not essential to update now. When this happens I'll add a single commit with a // TODO: ... comment line above the relevant code (I use git add --patch if it's in a file with changes already). This is a quick way of adding a reminder of something I want to do without going on a huge rabbit trail. If you don't like TODO comments, make sure you have a place to write down ideas that come to mind.

I hope this helps you working on code. Comment if you found this helpful or you want to challenge the concepts I brought up.

💖 💪 🙅 🚩
splrk
Splrk

Posted on April 13, 2020

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related

Learning to think small
lessonslearned Learning to think small

April 13, 2020