About maintaining internal libraries
Carlos Carbonell
Posted on February 7, 2021
There must be something very exciting about writing your own thing. I mean, I can’t say that I authored a prominent internal library at work, but hell, I sure have contributed to them. I won’t dispute the advantages of using widely adopted open source libraries whenever you can, but there’s something so sweet, so intimate about knowing a library from the inside out because you wrote it.
The thing is, there’s a great amount of freedom when it comes to maintaining your own thing. However, that freedom comes with some responsibility. As your library grows in scope (because they all naturally do) and starts being used by more and more projects, you need to start worrying about finding a healthy balance between keeping existing functionality and introducing innovation so that you don’t trip over your own feet along the way.
I won’t expand much on the benefits a package manager has brought us for handling our internal libraries. Git submodules (the second best option) are nothing less than dependency hell. My head aches any time I try to remember my team lead scolding me for “trusting too much your IDE and not thoroughly checking the submodule repository changes”. Please don’t go that route for consuming your library, and now that we’re talking about it, don’t go that route for maintaining it either (we’ll talk about it later).
One such advantage is that it gives every project that depends on your library room to migrate at its own pace. Another is the ability to access earlier versions of the code base via a git tag for debugging (thanks, Hamlet, for implementing both of these features in your libraries; they have made my life so much easier).
The second, and most important, thing lies in developing well-grounded abstractions that leave room for extension. Honoring the Open-Closed principle should be the gold standard for developing a library whenever you can. At least for some time, until you naturally need to break an existing API to make room for better, more robust code. Even then, every time you do this, it should be with the utmost respect for your clients: the existing applications that depend on your library.
You see, open source libraries have systems for enabling innovation while ensuring absolute respect for their clients. They have GitHub issues, forums, and other channels of communication to ensure libraries don’t move too quickly. When you are the author of an internal library and have so much freedom over its lifecycle, you might think it is not such a big deal to publish breaking changes on a somewhat regular basis just because you are versioning your library or you have the ability to migrate all existing clients yourself. If you are not careful, you might end up forcing someone to migrate their project to your shiny new version because they had to use another mandatory library or client that depended on it (I have been that person more than once in the last year alone).
There are so many ways of introducing new functionality in a library, and each one opens so many ways of messing up, that there’s no way I can cover everything that can happen. We are a small team and we have the awful habit of not documenting our libraries well enough, let alone writing migration paths from version x to version y. The suggestions I make in this article take all these things into consideration.
Prepare for Change
You prepare for change by reducing the scope of your library’s public API as much as you can. On my first day working at Sigma I was shown an evolved version of the repository pattern for the first project we were working on. I was immediately blown away by the intricate set of interfaces that allowed functionality like deletes to be shown or hidden at will for the clients. It was beautiful. Still is. However, it also opened the possibility of exposing many of the internal workings of that repository class (because some of them were an interface away from being visible to clients).
If you create an abstraction not intended for your clients to use, hide it. You are doing them a favor by not exposing them to unexpected behavior, and you are doing yourself a service because you can always change it later.
Use your patterns
I come from an OO background, so naturally, I will give people OO solutions. Behavioral design patterns like template method, strategy and state can make your life a lot easier along the way (that is, if your language allows for them). You don’t have to pass a callback if you can override an event, or switch a strategy, or add another state. This does make some parts of your libraries “harder to use”, since they won’t always be immediately obvious to the casual observer, but it will make your APIs much harder to break.
There are still people who think that having to override a method or an event means that their library is lacking some functionality that must be implemented by yet another flag in its public API. Nothing could be further from the truth.
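A small sketch of the override-instead-of-flag idea, using the template method pattern in Python. The Exporter class and its methods are hypothetical names I picked for the example; the point is that the client changes behavior by overriding a hook, and the library’s public signature never grows.

```python
class Exporter:
    """Hypothetical library class built around a template method."""

    def export(self, rows):
        # Fixed skeleton: the library controls the order of the steps.
        lines = [self.format_row(row) for row in rows]
        return self.join(lines)

    def format_row(self, row):
        # Hook: clients override this instead of asking for a new flag.
        return ",".join(str(value) for value in row)

    def join(self, lines):
        return "\n".join(lines)


class TabExporter(Exporter):
    """Client-side extension; the library's public API did not change."""

    def format_row(self, row):
        return "\t".join(str(value) for value in row)
```

The alternative would have been something like a `separator` flag on `export`, then a `quote` flag, then another one. Each of those is a new piece of public API you have to keep alive.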
Don’t Rush It
The very first rule of adding new functionality to your library is: DON’T RUSH IT. Let the solution be implemented first in one of the clients that consume it. Let it evolve and be refined there (the more refinement it gets early on in this sandbox, the more welcome the change will be when you export it later). This is why I don’t recommend working directly with the source code of both your client and the library you are maintaining.
This workflow started because I needed bug fixes and tweaks in the applications that I was working on that consume internal libraries that I didn’t personally author (again, I’ve written so few). I simply couldn’t wait until the author found some time to approve my pull request. Some of these PRs can take literally months to be accepted if you don’t bring them up frequently enough in the daily scrum (or the team chat, or their personal chat, or maybe an email, or a Facebook ad or two).
I saw the brilliance of giving oxygen to these new features and letting them take on a life of their own in my personal projects before preparing PRs for the libraries they extend. Especially since it’s very normal for them to change very quickly in their first few iterations. You don’t want to rush an early feature into a public library. If you are afraid someone might be thinking of the same thing, you can definitely bring it up in your daily scrum to let everyone know what you are doing and prevent the team from duplicating effort.
Another reason for doing so is to justify the feature in the first place. As Marijn Haverbeke once put it:
A useful principle is not to add cleverness unless you are absolutely sure you’re going to need it. It can be tempting to write general “frameworks” for every little bit of functionality you come across. Resist that urge. You won’t get any real work done, and you’ll end up writing a lot of code that no one will ever use.
Don’t you dare change the value of a default parameter
They are called default parameters for a reason. I had a friend who discovered the hard way, after a long debugging session, that the default value of a boolean parameter had changed in an internal data access library he had recently updated. The author just commented: “I always explicitly declare my default parameters, so I see no problem”. My friend confessed he could have been convicted of murder that same day.
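Here is a tiny Python sketch of why this hurts and of one way around it. The function names and the skip_header parameter are invented for the example; this is not the data access library from the story. The caller never passes the flag, so flipping the default would change its result without a single line of its code changing.

```python
def load_rows(source, skip_header=True):
    """Hypothetical v1 function: callers rely on skip_header=True."""
    return source[1:] if skip_header else list(source)


def load_rows_raw(source):
    """New behavior gets a new name instead of a flipped default."""
    return load_rows(source, skip_header=False)


data = ["id,name", "1,Ana"]

# A caller written against v1 that never passes the flag.
rows = load_rows(data)
```

If the library ever needs the other behavior to be the common case, adding a second entry point like `load_rows_raw` leaves every existing call exactly as it was.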
Let your clients go when the timing is right
This one is hard. It may require breaking up your “util” libraries into smaller, more stable, purpose-focused libraries to reduce the likelihood of forced migrations (we haven’t done this). It might also require duplicating some code in your clients because some new functionality is needed but the cost of a migration is too high (I personally have done it in the past out of fear). It might require that they make room for incremental migrations of your libraries (which you are already doing anyway) to prevent the accrual of technical debt. Either way, forcing API uniformity for its own sake stalls innovation and drags your whole team back to the Internet Explorer era.
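One way a library could make room for incremental migrations is to keep the old entry point around as a thin wrapper that warns instead of breaking. This is only a sketch of that idea in Python, with hypothetical function names (get_user, fetch_user); it is not something we have in place.

```python
import warnings


def fetch_user(user_id, *, include_inactive=False):
    """Hypothetical new API."""
    return {"id": user_id, "include_inactive": include_inactive}


def get_user(user_id):
    """Old entry point kept as a wrapper so clients can migrate when ready."""
    warnings.warn(
        "get_user() is deprecated; use fetch_user() instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this wrapper
    )
    return fetch_user(user_id)
```

Clients that are not ready keep calling `get_user` and keep working. The warning shows up in their test runs and tells them where they are headed.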
Communicate Breaking Changes
Use Semantic Versioning and, at the very least, list what is being broken in every major release of your library. This allows people to devise a migration strategy effectively. Having people find out you broke their code via compiler errors is not cool. Having behavior that silently changes is straight up evil.
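For anyone who hasn’t worked with it, the core of Semantic Versioning is that a bump in the first number of MAJOR.MINOR.PATCH announces a breaking change. A rough Python sketch of how a client could check for that follows. The helper names are mine, and it deliberately ignores pre-release tags and the special rules for 0.x versions.

```python
def parse_version(text):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_breaking_upgrade(current, candidate):
    """A higher MAJOR number signals a breaking change under SemVer."""
    return parse_version(candidate)[0] > parse_version(current)[0]
```

Going from 1.4.2 to 1.5.0 should be safe to take blindly; going from 1.4.2 to 2.0.0 is the moment to go read the list of what was broken.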
Conclusion
You see, code is really alive. And libraries are organisms. There’s only so much you can do with a library until it inevitably needs a breaking change. The point to discuss is not whether you should make that change, but how to make it gracefully to reduce its impact on your clients. That’s why it is so necessary to care for your clients, to love them. In the end, they are the reason your library exists at all.
I might write a follow-up article on this topic, because there’s just so much to say about it. Or I might not. Time will tell.