Thoughts about configure scripts and feature vectors
Cheng Shao
Posted on May 6, 2022
For a long time, I’ve been wanting to rant about stuff like configure scripts. They indirectly contribute a lot to my worktime headaches these days. Given I’m restarting personal blogging here, let’s see if I can turn that rant into a post.
What’s `configure`
Suppose you need to compile some software from a source tarball. The old unix tradition goes like `./configure && make && make install`. Nothing fancy about the `make` part, but why is `configure` needed in the first place?
Well, `configure` is just a shell script that probes the build environment and generates a C header file to be included in the source code. Each project has its own unique `configure` script that probes for different things: headers, functions, any feature whose build-time existence the source code needs to check for, so that it can provide a fallback code path when the feature is absent.
Say the source code needs to call the `foo` function, which exists on only some of the project’s supported platforms. If `configure` detects `foo`, it’ll write a `#define HAVE_FOO 1` line to the generated header. The source code can then include the auto-generated feature header and use CPP directives like `#if defined(HAVE_FOO)` to decide whether the `foo` function exists in the build environment.
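The same game is played in Haskell code compiled by GHC, via the `CPP` extension. A minimal sketch, assuming a hypothetical generated header `HsFooConfig.h` and a hypothetical C function `foo`:

```haskell
{-# LANGUAGE CPP #-}
module Foo (callFoo) where

-- The configure-generated feature header (hypothetical name).
#include "HsFooConfig.h"

#if defined(HAVE_FOO)
-- configure found foo in the build environment, so bind to it directly.
foreign import ccall unsafe "foo" c_foo :: IO Int

callFoo :: IO Int
callFoo = c_foo
#else
-- Fallback code path for platforms where foo is absent.
callFoo :: IO Int
callFoo = pure 0
#endif
```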
`configure` is typically auto-generated from a template using autoconf. In some projects it can be a hand-written Python script. There are also build systems like CMake that take over `configure`’s role completely, probing the build environment on their own and generating the feature header.
Anyway, my rants here are only about the idea of build-time feature detection, and irrelevant to how `configure` is actually implemented (although that’s annoying enough for its own blog post).
What’s a feature vector
How many `HAVE_` macros do you have in your project?
```
~/ubuntu/ghc$ grep -rIF HAVE_ | wc -l
842
```
Wait a sec. Most of those should be mere duplications; for instance, `HAVE_FOO` is very likely to occur in multiple source locations. One should really check how many features (headers, functions, etc.) are checked by `configure`.
```
~/ubuntu/ghc$ grep -rIF AC_CHECK_ | wc -l
171
```
The number above is an underestimate, since in autoconf a single `AC_CHECK_HEADERS` or `AC_CHECK_FUNCS` line can check multiple entities.
Now, we can introduce the concept of a “feature vector”: an N-dimensional boolean vector, where N is the number of things you check at build time. Each possible value of the feature vector is a point in the feature space, specifying one build-time configuration.
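To make the notion concrete, here’s one way to write it down in Haskell (just an illustrative sketch; the names aren’t from any real codebase):

```haskell
import qualified Data.Map.Strict as M

-- A build-time check, e.g. "HAVE_FOO".
type FeatureName = String

-- A feature vector assigns a Bool to every check; the feature space is
-- the set of all such assignments.
type FeatureVector = M.Map FeatureName Bool
```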
How large is the feature space?
- Definitely not as large as 2^N. Most of the dimensions aren’t orthogonal; one can imagine clusters of things that either exist as a whole, or don’t exist at all. (For scale: if the 171 checks counted above were all independent, that would already be 2^171 ≈ 3 × 10^51 configurations.)
- Still, way larger than the space that people actually test on CI to avoid bit-rotting.
My rant is about the second point above.
Why the rant
- In GHC, one can pass various `configure` arguments to enable/disable features like unreg codegen, large address space, the native IO manager, etc. The default configuration passes the test suite, but once you start messing with the `configure` config, expect failed test cases. At the very least, those cases should be explicitly marked fragile/broken in those configurations!
- In GHC, `unix`, and probably other places I’ve hacked on and forgotten: the API evolves, but people forget to update the code in the `#else`-guarded parts, because it’s not tested on CI; maybe that particular CPP-checked thing is thought to exist on all platforms. Well, WASI is a rather restricted platform, so those bitrotten parts all came back to bite me when I targeted WASI.
There’s nothing wrong with needing to write portable code and do build-time feature detection. We all know untested code is bad, but far fewer people are aware that untested feature vectors are also bad!
Also, this isn’t just a matter of “code coverage”. It’s perfectly possible to achieve a high coverage rate by testing against just a few feature vectors, while leaving the potentially broken build-time configurations in the dark.
How to solve it
Most software written with tons of `#ifdef`s lacks the feature vector mindset, and doesn’t have the testing logic to:

- Perform `QuickCheck`-style random testing in the feature space: generate a feature vector and run the tests against it, where “shrinking” is just moving the point closer to the known-to-work base point, i.e. the default config you get when configuring on a typical platform without any custom arguments. This allows discovering failures that arise from complex & unintended interactions between different dimensions of the feature vector (see the sketch after this list).
- Hide certain auto-detected features that actually exist. This allows testing for restricted/exotic platforms while still running the tests on a common platform. It’s not trivial to implement, especially for things in standard libraries, but it should be possible by using the `poison` pragma, creating `cc` wrappers, or even isolated sysroots.
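To sketch what that `QuickCheck`-style idea could look like, here’s a minimal harness over the `FeatureVector` type from earlier (repeated so the snippet stands alone). Everything is hypothetical: the feature names, the shrinking policy, and the `runTestsuite` placeholder, which in reality would reconfigure, rebuild, and run the testsuite for the given vector.

```haskell
import qualified Data.Map.Strict as M
import Test.QuickCheck

type FeatureName   = String
type FeatureVector = M.Map FeatureName Bool

-- The known-to-work base point: the default configuration.
basePoint :: FeatureVector
basePoint = M.fromList [("HAVE_FOO", True), ("HAVE_BAR", True)]

-- A random point in the feature space.
genVector :: Gen FeatureVector
genVector = traverse (const arbitrary) basePoint

-- Shrinking: flip one differing dimension back towards the base point,
-- so each candidate is strictly closer to the default configuration.
shrinkVector :: FeatureVector -> [FeatureVector]
shrinkVector v =
  [ M.insert k b v | (k, b) <- M.toList basePoint, M.lookup k v /= Just b ]

-- Placeholder: reconfigure with this vector, rebuild, run the testsuite,
-- and report whether it passed.
runTestsuite :: FeatureVector -> IO Bool
runTestsuite v = do
  putStrLn ("testing feature vector: " ++ show (M.toList v))
  pure True

prop_featureSpace :: Property
prop_featureSpace =
  forAllShrink genVector shrinkVector $ \v -> ioProperty (runTestsuite v)

main :: IO ()
main = quickCheck prop_featureSpace
```

A failing vector then shrinks towards `basePoint`, leaving you with the smallest set of non-default dimensions that still breaks the tests.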