Surprising behaviors of SBT

devkaoru

Kaoru

Posted on November 17, 2019

I recently learned about an interesting behavior of SBT while working on migrating my company's monorepo from SBT to Bazel. Why we're migrating deserves a blog post of its own (coming soon). The fundamental problem is: what does your build tool do when it encounters two versions of the same library? The problem may seem trivial at face value, but libraries do not exist in isolation. Resolving conflicting library versions gets much harder once you take transitive dependencies into account, because every library brings its own dependency tree along with it.

Helpful definitions

  • direct dependency: a dependency your project has on an external library.
  • transitive dependency: a dependency which your direct dependency depends on. For example, I depend on my cats for unlimited cuddle time, my cats depend on cat food. Thus, I have a transitive dependency on cat food.
  • compile-time: the environment in which a program, the compiler (e.g. javac or scalac), runs to convert your human-readable code into Java bytecode.
  • runtime: the environment in which your Java bytecode runs.

Latest and greatest

The default behavior of SBT, as of version 1.3.0, is to defer to Coursier, the library which manages dependencies for your project. To be honest, SBT's documentation leaves you alone in a desert of the internet to understand the actual behavior. I found this Scala blog post which sheds light on how to control the behavior, but again does not really go into detail on what the actual behavior is. Thankfully, Eugene Yokota, a maintainer of SBT, explained the behavior in detail in this blog post. The TLDR: Coursier will choose the latest version. Good enough, right?

Devil in the detail

In this GitHub repo we have a core library which uses CheckedFuture, an interface in com.google.guava:guava:27.1-jre. A project, hello, uses core and pulls in com.google.guava:guava:28.0-jre. You might be thinking, "well, obviously, the compiler will catch this and throw an error because core is using a class which was removed in com.google.guava:guava:28.0-jre", and you'd be wrong.
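To make the setup concrete, here is a rough sketch of what a build.sbt for such a repo could look like. The directory names and project layout are assumptions for illustration, not copied from the actual repo; only the two Guava coordinates come from the description above.

```scala
// build.sbt (hypothetical reconstruction of the two-project setup)

// core compiles against Guava 27.1-jre, which still ships CheckedFuture.
lazy val core = (project in file("core"))
  .settings(
    libraryDependencies += "com.google.guava" % "guava" % "27.1-jre"
  )

// hello depends on core and asks for Guava 28.0-jre.
// The resolver sees both versions on hello's dependency graph and keeps the newer one.
lazy val hello = (project in file("hello"))
  .dependsOn(core)
  .settings(
    libraryDependencies += "com.google.guava" % "guava" % "28.0-jre"
  )
```

With a layout like this, each project compiles fine on its own, which is exactly why nothing complains before runtime.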

Why does this happen? I assume that when SBT runs scalac to compile core, it includes all of the jars core needs on the classpath, so core compiles correctly. Once core is compiled, SBT moves on to hello. When compiling hello, SBT won't recompile core, so the compile-time check that CheckedFuture is available on the classpath never runs against the newer Guava. Bazel behaves similarly in this example, so there's no difference there. I can only speculate that this happens to avoid having to compile the entire world. The result: your code will break at runtime, aka Production. Hopefully you'll catch the error when running tests; otherwise, you'll find NoClassDefFoundError or NoSuchMethodError in your logs.
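Since the compiler won't catch this, one cheap guard is a test-time or startup check that the classes you rely on are still loadable. The following is a minimal sketch, not something SBT provides: ClasspathSmokeCheck and the list of required classes are hypothetical names chosen for illustration.

```scala
object ClasspathSmokeCheck {

  /** Returns true when the named class can be loaded from the runtime classpath. */
  def isLoadable(className: String): Boolean =
    try {
      // initialize = false: only check that the class resolves, don't run static initializers
      Class.forName(className, false, getClass.getClassLoader)
      true
    } catch {
      // Missing class, or a class whose own dependencies are missing
      case _: ClassNotFoundException | _: LinkageError => false
    }

  def main(args: Array[String]): Unit = {
    // Assumption: the classes your code needs that a newer library version might drop
    val required = Seq("com.google.common.util.concurrent.CheckedFuture")
    val missing = required.filterNot(isLoadable)
    if (missing.nonEmpty) {
      System.err.println(s"Missing at runtime: ${missing.mkString(", ")}")
      sys.exit(1)
    }
  }
}
```

Run against hello's runtime classpath, a check like this would be expected to fail fast instead of waiting for a user to hit the broken code path. It only covers classes you list, so it complements tests rather than replacing them.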

Problems with Abstraction

SBT and Coursier basically deal with version conflicts and hide them from you, the SBT user. While, yes, you can control the rules by which it chooses dependency versions, the tool itself does not let you detect conflicts by default, nor does it give you the proper mechanisms to resolve them. What's the result? Your system crashes or throws a 5xx caused by a NoSuchMethodError. NoSuchMethodError inherits from Error which, according to the Java documentation:

...indicates serious problems that a reasonable application should not try to catch.
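As for controlling the rules: one knob sbt does offer is dependencyOverrides, which forces a version regardless of what the resolver would pick. Below is a sketch under the assumption that you want every project on Guava 27.1-jre; pinGuava and the project layout are hypothetical names.

```scala
// build.sbt (sketch): force one Guava version everywhere it is included

// Hypothetical shared setting; any project that includes it resolves Guava 27.1-jre,
// even if it (or one of its dependencies) asks for a newer version.
lazy val pinGuava = Seq(
  dependencyOverrides += "com.google.guava" % "guava" % "27.1-jre"
)

lazy val core = (project in file("core"))
  .settings(pinGuava)

lazy val hello = (project in file("hello"))
  .dependsOn(core)
  .settings(pinGuava)
```

Note that this only moves the risk around: pinning to the older version keeps core working, but anything that genuinely needs a newer Guava API can now break in the same silent way. sbt's evicted task can help you see which versions were dropped before you decide what to pin.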

Bazel, the clearly opinionated

Bazel aims to work with monorepos, which gives it the freedom to have opinions on certain things. One opinion is that there should be only one version of an external library. Bazel forces..err asks you to list all your dependencies in one logical location, which Bazel calls the repository. From the repository, you then tell Bazel which dependencies (without a version) your project uses. For example:

java_library(
  name = "core",
  deps = [
    "@maven//:com_google_guava_guava",  # Note the omission of the version.
  ],
)

When Bazel detects conflicts due to either direct or transitive dependencies it will, by default, choose the highest version. Here's the difference though: Bazel then puts that highest version of the jar on the classpath when it compiles all projects. This avoids the issue we saw with SBT, because the core project won't compile in the first place. Problem solved, let's go home.

Well actually...

Unfortunately, this does not solve all the issues you might see. Specifically, Bazel does not solve the problem when two transitive dependencies use two different versions of the same library. Let's break this down. Say you have a project which uses io.grpc:grpc-protobuf:1.25.0 and com.fasterxml.jackson.datatype:jackson-datatype-guava:2.10.1. Jackson-datatype-guava depends on com.google.guava:guava:20.0 and gRPC-Protobuf uses com.google.guava:guava:28.1-android. Bazel will compile and package your project's JAR with com.google.guava:guava:28.1-android, but it won't recompile Jackson-datatype-guava (because it doesn't have to; it is already a compiled jar). If a user hits a code path in Jackson-datatype-guava which relies on a class or method present in guava:20.0 but removed in guava:28.1-android, you just lost a user, or maybe much worse!

Fortunately, if you catch this during testing it can be easily solved in Bazel or SBT, but that's a BIG if. I feel Bazel, or more specifically rules_jvm_external, has the ability to address this by explicitly calling out these conflicts. I have opened an issue on rules_jvm_external to hopefully add a feature which makes it clearer when transitive conflicts occur. Until then, we're left to manually look out for these changes, write comprehensive tests (you should be doing this anyway), and hope our users don't do weird things.
