Automating version number updates: what could go wrong?
Dimitri Merejkowsky
Posted on June 11, 2020
Say you need to update (bump) your software. It’s currently at version 1.2, all the required changes have been merged, and it’s time to publish version 1.3. That’s really easy, right? Change the version in one file, commit, tag, and push. Done!
I thought that too once, but the truth is that it’s harder than it looks — let me tell you my story.
Releasing software for Softbank Robotics
Our story begins in 2008, when I was release manager at Softbank Robotics.
I had two big projects to release: Choregraphe, a desktop GUI to control the NAO and Pepper robots, and qiBuild, a command-line application to ease C++ development.
For qiBuild, I had three files to patch because the code version number was hard-coded in several places (setup.py, __init__.py, and docs/conf.py).
On the other hand, for Choregraphe the version number was hard-coded in the top CMakeLists.txt file only, but there was quite a bit of code to “forward” the version number from the top CMake file down to the version.hpp and main.cpp files.
Both solutions had their pros and cons but I could not decide which one was best. Since I was pretty sure I was not the first one to have encountered this issue, I started to look around for better solutions.
Trying out bumpversion
The way bumpversion works is a nice combination of the two approaches I’ve mentioned before:
First, keep the hard-coded version number in as many files as required.
Second, add a dedicated configuration file (.bumpversion.cfg) containing the current version and the aforementioned list of files.
Then, when bumping the project from version X to version Y:
Iterate over the file list
Replace all occurrences of X with Y, including in the configuration file itself.
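The two steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not bumpversion's actual implementation: the function name replace_version and its parameters are made up for this example.

```python
from pathlib import Path


def replace_version(paths, current, new):
    """Replace every occurrence of `current` with `new` in each file.

    Hypothetical helper: it mimics the plain "search and replace"
    strategy described above, nothing more.
    """
    patched = []
    for path in map(Path, paths):
        text = path.read_text()
        if current not in text:
            # A listed file that does not mention the current version
            # is almost certainly a configuration mistake.
            raise ValueError(f"{path} does not contain {current}")
        path.write_text(text.replace(current, new))
        patched.append(path)
    return patched
```

Calling replace_version(["setup.py", "docs/conf.py"], "1.0.1", "1.0.2") would rewrite both files in place and return the list of patched paths.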
So, I started with the qiBuild project and added the following configuration file:
# in .bumpversion.cfg
[bumpversion]
current_version = 1.0.1
commit = True
tag = True
[bumpversion:file:setup.py]
[bumpversion:file:qiBuild/__init__.py]
[bumpversion:file:docs/conf.py]
Since I had to bump qiBuild from 1.0.1 to 1.0.2, I ran:
bumpversion patch
A commit was automatically made along with a matching tag:
$ git show
commit ec50897893ce4ecfb1debaa1df266ae4c555f45b (HEAD -> master, tag: v1.0.2)
Author: Dimitri Merejkowsky <dimitri.merejkowsky@tanker.io>
Date: Wed May 20 16:58:27 2020 +0200
Bump version: 1.0.1 → 1.0.2
diff --git a/.bumpversion.cfg b/.bumpversion.cfg
--- a/.bumpversion.cfg
+++ b/.bumpversion.cfg
[bumpversion]
-current_version = 1.0.1
+current_version = 1.0.2
commit = True
tag = True
diff --git a/qiBuild/__init__.py b/qiBuild/__init__.py
--- a/qiBuild/__init__.py
+++ b/qiBuild/__init__.py
-__version__ = "1.0.1"
+__version__ = "1.0.2"
diff --git a/setup.py b/setup.py
--- a/setup.py
+++ b/setup.py
setup(
name="qiBuild",
- version="1.0.1",
+ version="1.0.2",
)
diff --git a/docs/conf.py b/docs/conf.py
--- a/docs/conf.py
+++ b/docs/conf.py
project = "qiBuild"
-version = "1.0.1"
+version = "1.0.2"
Great! Now I just had to run python setup.py sdist upload and the 1.0.2 release was published on pypi.org.
For the NAOqi project, it worked very well too.
I could delete a bunch of CMake and preprocessor code and replace it with just one line of C++ code:
// in naoqi/version.hpp
static std::string const NAOQI_VERSION = "2.3.0";
This time the .bumpversion.cfg file looked like this:
[bumpversion]
current_version = 2.3.0
files = include/naoqi/version.hpp
Now bumping NAOqi’s version could be done with a command similar to the one used to bump qiBuild.
Brilliant! So, what went wrong?
Wandering off-track
The problem can be described as being “too clever by half” and has to do with the command line syntax of bumpversion.
Indeed, bumpversion is smart enough to bump various “parts” of the version number, namely the major, minor, and patch components used in the semver spec.
Here are some examples, assuming that the current version is 1.2.3:
bumpversion patch : 1.2.3 -> 1.2.4
bumpversion minor : 1.2.3 -> 1.3.0
bumpversion major : 1.2.3 -> 2.0.0
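What these three commands do to a semver string can be written down in a few lines. The sketch below is illustrative only: bump_part is a name chosen for this example, not part of bumpversion, and it deliberately handles nothing but plain MAJOR.MINOR.PATCH versions.

```python
def bump_part(version, part):
    """Bump one component of a MAJOR.MINOR.PATCH version string.

    Components to the right of the bumped one are reset to zero.
    """
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part}")
```

Note that the function fails as soon as the version has a suffix such as rc1, because int() cannot parse it. That limitation is the heart of what follows.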
We were using semver for qiBuild and NAOqi too at the beginning — but sometimes semver is not enough.
Let’s continue our story. When qiBuild got more usage and the software team grew, publishing qiBuild releases started becoming… scary.
New features were added, refactorings were made and qiBuild had become an essential tool for all the developers in the software team (100 of them) — I was getting nervous about what would happen if I shipped a buggy release.
So, with the help of members of my team, I decided to start making release candidates. That way, a few brave colleagues could help me catch bugs before everyone upgraded to the latest stable version.
Since Python developers had already come up with a version scheme that allowed for release candidates, I started using that. See PEP 440 for details. Basically, I could add an rcX suffix after the patch part.
And… it turned out that doing so was far from trivial. Why?
Well, bumpversion assumes you are using semver and, if you don’t, you need to specify a custom regex:
parse = (?P<major>\d+)
    \.(?P<minor>\d+)
    (\.(?P<patch>\d+))?
    ((?P<release>rc)(?P<rel_num>\d+))?
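To see what such a pattern captures, here is a small sketch using Python's re module in verbose mode. The exact pattern is an assumption (dot separators, an optional patch part, an optional rcN suffix), and parse_version is a hypothetical helper written for this example, not bumpversion code.

```python
import re

# Assumed layout: MAJOR.MINOR, an optional .PATCH, an optional rcN suffix.
VERSION_RE = re.compile(
    r"""
    (?P<major>\d+)
    \.(?P<minor>\d+)
    (\.(?P<patch>\d+))?
    ((?P<release>rc)(?P<rel_num>\d+))?
    """,
    re.VERBOSE,
)


def parse_version(text):
    """Return the named parts of a version string, or raise ValueError."""
    match = VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"cannot parse version: {text}")
    return match.groupdict()
```

Parsing is the easy half: the regex splits 2.0.0rc1 into its parts, but it says nothing about which version should come next.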
But it does not stop there.
Now you need a way to tell bumpversion how to go from 2.0.0rc1 to 2.0.0rc2 (if the release candidate had bugs) or from 2.0.0rc2 to 2.0.0 (if it did not).
So you add even more configuration, but it’s still not enough:
serialize =
    {major}.{minor}
    {major}.{minor}.{patch}
    {major}.{minor}{release}{rel_num}

[bumpversion:part:release]
values =
    a
    b
    rc
You can find all the gory details in the GitHub issue but, in short, it was a show stopper.
I believe that bumpversion is still not used at Softbank Robotics to this day, and, as far as I know, bumping qiBuild still needs version numbers to be fixed manually in several files.
But our story does not end here — that would be a rather sad ending!
Arriving at Tanker
In March 2016, I handed in my resignation letter at Softbank and, three months later, I started my current job at Tanker.
In short, Tanker sells a product that can be used for end-to-end encryption, as well as client-side anonymization, packaged in open source SDKs (see our website for details).
Anyway, since we were releasing software and I had past experience with it, I was also put in charge of release management at Tanker.
As you might expect, Tanker also had hard-coded version numbers and chunks of code whose only role was “forwarding” the version number from one file to another.
“This is exactly the same problem I had last time!”, I thought. So, I took a look at bumpversion again — but even after all this time the bug I opened was still not fixed.
That’s when I realized there were two big problems with bumpversion which would be pretty hard to fix without rewriting a lot of code.
First, it’s very hard to reliably identify “parts” of version numbers. Semantics can vary from one version scheme to the next. Even comparing version numbers is a hard task, but guessing how to bump from a release candidate to a stable one is near impossible.
Secondly, there are some hidden defaults at play which make understanding what’s going on under the hood pretty hard.
In other words, bumpversion was “too clever by half”.
So, what to do? Well, rewrite from scratch of course! (It turned out to be a good idea after all — otherwise, I would not brag about it here :P)
The birth of tbump
On December 7, 2017, I started working on a rewrite called tbump. My goal was to keep bumpversion’s good ideas and to fix its shortcomings.
Here are the main differences between tbump and bumpversion:
There are no hard-coded defaults: you must specify how the git message and the tag name will be formatted, and you also need to specify a regular expression to define the version scheme.
Instead of specifying what part of the current version you want to update, you need to pass the whole new version.
Back to our example: to go from 2.1.3 to 2.1.4, you run tbump 2.1.4 instead of bumpversion patch.
Those differences come with a price.
First, since there is no hard-coded default, it’s harder to use tbump out of the box.
However, this one was easy to fix: I added an init command to generate a tbump.toml file automatically. Instead of having to read the docs, users can read the generated file and get started quickly.
Second, since you have to specify the whole new version instead of the part you want to bump, it’s easy to make mistakes, like going from 1.0.3 to 1.0.5 instead of 1.0.4.
That’s where it gets interesting.
You see, I was pretty annoyed by some aspects of the bumpversion UX, especially when trying to tweak the configuration file.
Just watch:
$ bumpversion patch --dry-run
<nothing>
$ bumpversion patch --verbose --dry-run
current_version=1.0.2
commit=True
tag=True
new_version=1.0.3
Now look at what tbump --dry-run does:
$ tbump --dry-run 1.0.3
:: Bumping from 1.0.2 to 1.0.3
=> Would patch these files
- setup.py:3 version="1.0.2",
+ setup.py:3 version="1.0.3",
- foo.py:1 __version__ = "1.0.2"
+ foo.py:1 __version__ = "1.0.3"
- tbump.toml:2 current = "1.0.2"
+ tbump.toml:2 current = "1.0.3"
=> Would run these git commands
$ git add --update
$ git commit --message Bump to 1.0.3
$ git tag --annotate --message v1.0.3 v1.0.3
$ git push origin master
$ git push origin v1.0.3
The new version is all over the place, you can’t miss it, and at the same time, you see exactly what is going on.
The output is similar without the --dry-run option, except changes are actually applied and git commands are run.
And then I realized that every time I was bumping something, I would be following the same pattern:
1. Run tbump $NEW_VERSION --dry-run
2. Check if everything is OK by reading the output
3. If not, go back to step 1
4. Otherwise, run tbump $NEW_VERSION
This looks like an algorithm — and what do we do with algorithms? We implement them!
So here’s what the entry point of tbump looks like:
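# (simplified)
import sys


def main():
    # First pass: only show what would change.
    bump(dry_run=True)
    answer = input("Looking good (y/n)? ")
    if answer != "y":
        sys.exit("Canceled by user")
    else:
        # Second pass: actually patch the files and run git.
        bump(dry_run=False)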
That way you get a chance to catch any mistake in the new version just before it’s too late :)
Early adoption
In January 2018 tbump 1.0 was out (and I had been using tbump to bump itself since the 0.2 release).
I published the package on pypi.org and I started using it both for Tanker projects and my own ones.
It worked great! My fellow developers told me they liked the UX so I was pretty happy.
But there was a critical flaw in tbump’s design that would come back to bite us pretty hard in the future. Let’s see how.
yarn workspaces
Quite early in the development of our JavaScript SDK, we knew we’d be using a mono-repo.
For instance, we wanted to support both Node.js (for fast-running tests) and the browser (which is the primary use of the SDK).
So right off, we knew we would have at least three packages: @tanker/core, @tanker/client-browser, and @tanker/client-node.
It did not make sense to have different version numbers for those three packages, so here’s what we ended up with:
// In core/package.json
{
"name": "@tanker/core",
"version": "1.2.0",
// ...
"dependencies": {
"libsodium-wrappers": "^0.5.1",
// ...
}
}
// In client-browser/package.json
{
"name": "@tanker/client-browser",
"version": "1.2.0",
// ...
"dependencies": {
"@tanker/core": "1.2.0",
// ...
}
}
// In client-node/package.json
{
"name": "@tanker/client-node",
"version": "1.2.0",
// ...
"dependencies": {
"@tanker/core": "1.2.0",
// ...
}
}
When bumping to a new version, we needed to patch the line that contains the “version” key in the top metadata:
[[file]]
src = "packages/core/package.json"
search = '"version": "{current_version}"'
[[file]]
src = "packages/client-browser/package.json"
search = '"version": "{current_version}"'
[[file]]
src = "packages/client-node/package.json"
search = '"version": "{current_version}"'
Note the search line: we did not want to blindly replace all occurrences of the current version in package.json. If a line declares a third-party dependency that happens to have the same version number as the current one, it should not get patched!
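The effect of the search line can be illustrated with a short Python sketch. It is an approximation written for this article, not tbump's real code: patch_text is a hypothetical name, and the only assumption taken from the configuration is the {current_version} placeholder.

```python
def patch_text(text, search_template, current, new):
    """Replace the version only on lines that match the search template.

    `search_template` contains a `{current_version}` placeholder. Lines
    that merely happen to contain the same version number are left
    untouched.
    """
    needle = search_template.format(current_version=current)
    patched = []
    for line in text.splitlines(keepends=True):
        if needle in line:
            line = line.replace(current, new)
        patched.append(line)
    return "".join(patched)
```

With the template '"version": "{current_version}"', a line such as "some-lib": "1.2.0" survives the bump from 1.2.0 to 1.3.0 unchanged, which is exactly the behavior we wanted.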
Speaking of dependencies, we also needed to patch the line that specifies the version of @tanker/core used by @tanker/client-browser and @tanker/client-node:
[[file]]
src = "packages/client-browser/package.json"
search = '"@tanker/core": "{current_version}"'
[[file]]
src = "packages/client-node/package.json"
search = '"@tanker/core": "{current_version}"'
That’s a total of five blocks of configuration. Then we extracted a @tanker/crypto package from @tanker/core, and added two new blocks of configuration:
[[file]]
# Bump version of @tanker/crypto
src = "packages/crypto/package.json"
search = '"version": "{current_version}"'
[[file]]
# Bump version for @tanker/core too — it depends on @tanker/crypto!
src = "packages/core/package.json"
search = '"@tanker/crypto": "{current_version}"'
Unbeknownst to us, that was the start of a slippery slope: every time we added a new package to the workspace, we’d have to add a block of configuration for the package itself, plus one more for every package that depended on it.
This is a famous anti-pattern, and before you can say “I see a polynomial complexity!”, we ended up with this kind of monstrosity: a 200-line configuration file!
Saved by my co-workers
Luckily I’m working with a team whose members never hesitate to share (constructive) criticism — and who often take matters into their own hands.
So, when faced with the task of trying to edit this gigantic configuration file just to add a new package, one of them decided to fix the root cause of the problem and submitted a pull request for tbump named “Stars and Dots”. Here are the two main changes it implemented:
Allow regular expressions in search strings: we could now use @tanker/.* instead of specifying the whole name of the dependency (@tanker/core, @tanker/crypto, and so on).
Allow using glob patterns to specify file names: instead of having one block per file, we could now specify a whole bunch of files using packages/*/package.json as a glob pattern.
Clever name for a clever pull request, don’t you think?
This pull request was, of course, quickly reviewed and merged by yours truly, and all the nasty blocks in the tbump.toml file were replaced with just two:
[[file]]
src = "packages/**/package.json"
search = '"version": "{current_version}"'
[[file]]
src = "packages/**/package.json"
search = '"@tanker/[^"]+": "{current_version}"'
Note that it remains more or less the same to this day, even if the sdk-js git repository now contains about 30 different packages :)
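To make the combination of globs and regular expressions concrete, here is a minimal Python sketch. It is not tbump's implementation: patch_tree and its parameters are hypothetical, and the only things borrowed from the configuration above are the glob pattern, the regex, and the {current_version} placeholder.

```python
import re
from pathlib import Path


def patch_tree(root, glob_pattern, search_regex, current, new):
    """Patch every file under `root` that matches `glob_pattern`.

    `search_regex` is a regular expression with a `{current_version}`
    placeholder; only the lines it matches get the version replaced.
    """
    # Escape the version so that its dots are matched literally.
    pattern = re.compile(
        search_regex.replace("{current_version}", re.escape(current))
    )
    changed = []
    for path in sorted(Path(root).glob(glob_pattern)):
        lines = path.read_text().splitlines(keepends=True)
        patched = [
            line.replace(current, new) if pattern.search(line) else line
            for line in lines
        ]
        if patched != lines:
            path.write_text("".join(patched))
            changed.append(path)
    return changed
```

The point of the design is that adding a package to the workspace changes the set of files matched by the glob, not the configuration: the number of blocks stays constant no matter how many packages there are.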
The cherry on the cake
Before I conclude, I’d like to mention a feature of tbump called hooks. It’s the ability to run arbitrary commands at two points during a bump.
Hooks are defined in the tbump.toml file and can run either after the files have been patched but before the commit is made, or after the new commit and the new tag have been pushed.
You can find examples of this in tbump’s own configuration file:
[[before_commit]]
name = "Check Changelog"
cmd = "grep -q {new_version} Changelog.rst"
[[after_push]]
name = "Publish to pypi"
cmd = "tools/publish.sh"
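Conceptually, running a list of hooks boils down to something like the following Python sketch. It is an assumption about the general shape of the mechanism rather than tbump's actual code: run_hooks is a hypothetical function, and the hooks are represented as plain dictionaries mirroring the TOML blocks above.

```python
import subprocess


def run_hooks(hooks, new_version):
    """Run each hook command in order, stopping at the first failure.

    `hooks` is a list of dicts with "name" and "cmd" keys. Any
    `{new_version}` placeholder in a command is substituted before the
    command is handed to the shell.
    """
    for hook in hooks:
        cmd = hook["cmd"].format(new_version=new_version)
        result = subprocess.run(cmd, shell=True)
        if result.returncode != 0:
            raise RuntimeError(f"hook {hook['name']!r} failed: {cmd}")
```

Stopping at the first failure matters for hooks that run before the commit: if the changelog check fails, nothing should be committed, tagged, or pushed.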
Conclusion
tbump is now the deployment tool for Tanker.
We often use “before commit” hooks to perform various checks (like verifying that the version to be published is mentioned in the changelog), or to make sure that generated files are up-to-date (like yarn.lock for instance).
We also use “after push” hooks as “executable documentation” to specify how to publish new releases (like using poetry publish in case of a Python package).
So, what did we learn?
First, pay attention to algorithmic complexity, even in configuration files.
Second, take care in designing a good UX, even for command-line tools.
Third, if you can afford an extra pair of eyes on a given problem, don’t hesitate to ask.
Finally, there’s a battle-tested rewrite of bumpversion with a pretty good UX and nice features waiting for you on pypi.org. Feel free to try it out and tell us what you think.
Cheers!
PS: We’re hiring software engineers. If the way we work sounds interesting, have a look at the hack challenge available on https://www.tanker.io/rabbit/ and don’t hesitate to reach out!