Tony Robalik
Posted on January 8, 2021
I call it my "Gradle apprenticeship." One job ago, I spent a year working at Gradle, and had direct access to some of the Greatest Build Minds of our generation. I returned from that mountaintop bearing a secret: it's all about the classpaths. In fact, everything is a classpath. Your dog? A classpath, composed of barks and drool and poop.
javac -classpath barks.jar:drool.jar:poop.jar Dog.java
Ok, that is maybe an exaggeration.1 But it goes to show how hard it is to overstate the importance of classpaths in Java and Java-adjacent development: they really are everywhere, and are of foundational importance. It took me four years to begin to understand them, and two more to write this post, so that you could spend seven minutes reading it while drinking your morning coffee and avoiding that analytics ticket glaring balefully at you from your Jira board.
Who this is for
If you write code in Java or any JVM language such as Kotlin or Groovy, and regardless of whether you target the JVM or Android, this post may be useful to you. Understanding classpaths can help you comprehend and avoid gnarly bugs and, insofar as it deepens your understanding of JVM languages, will help you to be a more effective programmer.
I will use examples from Android and Gradle because those are the domains I know best, which makes the examples easier (for me) to write. We will also take a look at some Gradle build scans, as they are a very good tool for visualizing classpaths.
What will be covered
This post focuses on the fundamentals of classpaths and classloading in Java, with future posts planned to talk specifically about classpaths at build-time, compile-time, and run-time.
What's a classpath?
Classpaths are how you tell an SDK tool2 (such as java or javac), or a build tool (such as Ant, Maven, or Gradle) that wraps an SDK tool, where to find third-party or user-defined classes that are not extensions or part of the core JDK. (official docs)
Third-party classes are generally referred to as libraries or dependencies, typically available as jars. User-defined classes are those written by you and your team, i.e., your app. Contrast this with core Java platform classes such as java.lang.String, which are part of the JDK and do not need to be specified on the classpath. (You can think of these platform classes as being on a boot classpath.)
As we will explore in more detail below, classpaths are just ordered lists of jar files and directories containing class files. Recalling our Dog.java example from above, the classpath for that compilation operation is literally just the three specified jar files, in order: barks.jar, drool.jar, poop.jar.
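To see that "ordered list" for yourself, here is a minimal sketch that prints the classpath of the running JVM one entry per line. It relies on the standard `java.class.path` system property; the class name `PrintClasspath` and the helper `entries` are my own names, chosen only for illustration.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

class PrintClasspath {
    // Splits a raw classpath string into its entries, preserving their order.
    static List<String> entries(String classpath) {
        return Arrays.asList(classpath.split(File.pathSeparator));
    }

    public static void main(String... args) {
        // "java.class.path" holds the application classpath of the running JVM.
        List<String> entries = entries(System.getProperty("java.class.path"));
        for (int i = 0; i < entries.size(); i++) {
            System.out.println(i + ": " + entries.get(i));
        }
    }
}
```

On a Unix-like system, compiling this and running something like `java -classpath .:barks.jar:drool.jar:poop.jar PrintClasspath` should list four entries in exactly the order given (on Windows the separator is `;` rather than `:`).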
Of course, as modern JVM developers, we rarely if ever interact directly with tools like javac (the Java compiler); we use “build tools” to orchestrate our ever-more-complex builds. By the same token, we generally don’t specify a classpath directly; rather, we “declare dependencies.” Dependency management and resolution is an incredibly complex topic on its own, and one which we will mostly elide. We will only say this: when you declare a dependency in a Gradle build script, the tool treats that as an instruction to resolve that dependency (often involving downloading from the Internet), with the ultimate result that one or more jar files representing that “dependency” end up on a classpath. If, for example, you declare the following, taken from a typical Android project:
dependencies {
    implementation 'androidx.appcompat:appcompat:1.1.0'
}
then the likeliest result3 of this is that the AndroidX appcompat library v1.1.0, represented as a jar file,4 ends up on your compile-time and runtime classpaths.5 Additionally, all of appcompat’s dependencies (as well as their dependencies), also called transitive dependencies, will be added to those same classpaths. Below is a portion of a build scan showing the top-level dependencies of appcompat:
As noted above, we will not be diving into the complexities of Gradle’s dependency resolution engine, so if you want to understand more about those first three dependencies with the constraint label, see the docs.
While the build scan shows the dependencies organized into a tree structure, when Gradle actually invokes the relevant compile tasks, this tree is flattened into a simple list.
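To make "flattened" a little more concrete, here is a toy model of turning a dependency graph into an ordered list. To be clear, this is not Gradle's actual resolution algorithm (which also handles versions, conflicts, and constraints); it is just a depth-first walk that keeps the first occurrence of each module. The jar names and the `FlattenDependencies` class are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class FlattenDependencies {
    // Depth-first walk that records each module the first time it is seen.
    static List<String> flatten(String root, Map<String, List<String>> graph) {
        Set<String> seen = new LinkedHashSet<>();
        visit(root, graph, seen);
        return new ArrayList<>(seen);
    }

    private static void visit(String module, Map<String, List<String>> graph, Set<String> seen) {
        if (!seen.add(module)) {
            return; // already on the list; it keeps its first position
        }
        for (String dependency : graph.getOrDefault(module, Collections.emptyList())) {
            visit(dependency, graph, seen);
        }
    }

    // Hypothetical graph: app depends on lib-a and lib-b, which both depend on lib-c.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("app.jar", Arrays.asList("lib-a.jar", "lib-b.jar"));
        graph.put("lib-a.jar", Arrays.asList("lib-c.jar"));
        graph.put("lib-b.jar", Arrays.asList("lib-c.jar"));
        return graph;
    }

    public static void main(String... args) {
        // Prints app.jar:lib-a.jar:lib-c.jar:lib-b.jar
        System.out.println(String.join(":", flatten("app.jar", sampleGraph())));
    }
}
```

Note that lib-c.jar appears twice in the tree but only once in the resulting list.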
To summarize:
Declaring a dependency is functionally equivalent to adding one or more jars to one or more classpaths.
To understand how classpaths impact the Java runtime, we have to talk about class loading. Let’s do that now.
Class loaders load classes
For an excellent introduction to class loading and the ClassLoader class, I recommend this tutorial by Baeldung. I will summarize the salient points here.
There are three built-in class loaders used by Java applications:
- Bootstrap class loader. Used to load JDK classes such as java.lang.String and java.util.ArrayList.
- Extension class loader (replaced by the platform class loader as of Java 9). Used to load extension classes (outside of the scope of this post).
- Application or System class loader. Used to load third-party and user-defined classes. This is the class loader that is configured with the user-defined classpath.
You may also define custom class loaders at runtime.
The following simple example (adapted from the Baeldung tutorial) demonstrates the first and third types:
public class Main {
    public static void main(String... args) {
        System.out.println("Main class loader = " + Main.class.getClassLoader());
        System.out.println("String class loader = " + String.class.getClassLoader());
    }
}
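Running this program on Java 8 produces output like the following (the hash after the @ will vary, and Java 9 and later print a different class name for the application class loader):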
Main class loader = sun.misc.Launcher$AppClassLoader@7852e922
String class loader = null
Note in particular that the String class loader is displayed as null, which indicates that it is the bootstrap class loader. As this class loader is written in native code, it has no representation as a Java class.
You can compile and run this program by executing javac Main.java && java Main.
These class loaders are organized in a hierarchy, with the bootstrap class loader functioning as the parent of the extension class loader, which is itself the parent of the application class loader. Class loaders delegate to their parents when asked to load some class. Those parents delegate to their parents, and so on. If a parent cannot load a class, the immediate child then attempts it, and so on, until eventually the class is loaded or an error is thrown: a ClassNotFoundException when the class was requested by name (for example via Class.forName() or ClassLoader.loadClass()), or a NoClassDefFoundError when the JVM fails to resolve a class that already-compiled code refers to.
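The hierarchy is easy to inspect. The sketch below follows ClassLoader.getParent() from a starting loader until it reaches null, which stands in for the bootstrap class loader. The class name `LoaderChain` is mine, and the exact loader class names printed depend on your JDK, so treat any particular output as an assumption.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {
    // Follows getParent() from the given loader up to the bootstrap loader (null).
    static List<String> chain(ClassLoader start) {
        List<String> names = new ArrayList<>();
        for (ClassLoader loader = start; loader != null; loader = loader.getParent()) {
            names.add(loader.getClass().getName());
        }
        // The bootstrap loader has no Java object, so we label it by hand.
        names.add("bootstrap (null)");
        return names;
    }

    public static void main(String... args) {
        for (String name : chain(LoaderChain.class.getClassLoader())) {
            System.out.println(name);
        }
    }
}
```

On Java 8 I would expect this to print the application loader, then the extension loader, then the bootstrap placeholder; on Java 9 and later the middle entry should be the platform class loader instead.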
Every Class instance has a reference to the ClassLoader that loaded it, which you can retrieve with the Class.getClassLoader() method.
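Delegation and getClassLoader() can be seen working together in a small experiment. This sketch creates a custom URLClassLoader with an empty search path of its own, so everything it loads must come from a parent. The class name `DelegationDemo`, the helper `definingLoader`, and the class name `com.example.DoesNotExist` are hypothetical.

```java
import java.net.URL;
import java.net.URLClassLoader;

class DelegationDemo {
    // Asks the loader for a class and reports which loader actually defined it.
    static String definingLoader(ClassLoader loader, String className) {
        try {
            ClassLoader definer = loader.loadClass(className).getClassLoader();
            return definer == null ? "bootstrap" : definer.getClass().getName();
        } catch (ClassNotFoundException e) {
            // Neither the loader nor any of its parents could find the class.
            return "ClassNotFoundException";
        }
    }

    public static void main(String... args) throws Exception {
        ClassLoader parent = DelegationDemo.class.getClassLoader();
        // The custom loader has no entries of its own, so it can only delegate.
        try (URLClassLoader custom = new URLClassLoader(new URL[0], parent)) {
            System.out.println(definingLoader(custom, "java.lang.String"));         // bootstrap
            System.out.println(definingLoader(custom, "DelegationDemo"));           // the application loader
            System.out.println(definingLoader(custom, "com.example.DoesNotExist")); // ClassNotFoundException
        }
    }
}
```

Even though we asked the custom loader each time, String is reported as defined by the bootstrap loader and DelegationDemo by the application loader, because the request was passed up the hierarchy first.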
The classpath defined by you is used to configure the application class loader, which uses that information to load classes requested by your program at runtime.
Classpaths are order-sensitive and tolerant of duplicate entries
Now we know how a classpath influences a Java or Java-adjacent application: it instructs the application class loader how to find third-party libraries and user-defined classes. Classpaths, therefore, influence the Java runtime at a very deep level.
The interesting thing about Java is that it is surprisingly dynamic, to abuse a much-overloaded term. For example, a small change to the classpath can result in radically different behavior between two successive operations.
You can think of the classpath as an ordered list of elements that supply class files, and/or jars, which are collections of class and resource files. These class files are used for operations such as orchestrating a build, compiling, or running your project.
In most cases, removing an element from the classpath will simply cause the given operation to fail, but in other cases you might actually see different behavior! It is also critically important to understand that a classpath is order-sensitive. Let's consider our dog example again. What if instead of
javac -classpath barks.jar:drool.jar:poop.jar Dog.java
we had
javac -classpath drool.jar:barks.jar:poop.jar Dog.java
?
Maybe nothing would be different. If, however, drool.jar and barks.jar had overlapping class files (either because of mis-packaging, or an intermediate step in a larger refactor, or poor architecture, etc.), then in the first case we'd be using the class files from barks.jar, and in the second case the class files from drool.jar, and these might very well be different. Implicit in this is that classpaths are also tolerant of duplicate entries: SDK tools search the list in order and use the first match, so any class (or entire entry) that duplicates one appearing earlier in the list is simply ignored.
Important. Gradle guarantees that classpath order is deterministic.
Classpaths are order-sensitive, and in the next post, we're going to exploit that with an example from the build domain.
Wrapping it up (for now)
In this post we learned what a classpath is; that it is used by the Java runtime to configure the application class loader; that class loaders are organized in a hierarchy with the bootstrap class loader at the root; and that classpaths are order-sensitive, with profound implications for the runtime behavior of your programs (including the build program).
With these fundamentals under our belts, we can now start exploring more practical, real-world examples from the build, compile, and run domains. See you then!
Special thanks
Thanks to César Puerta for reviewing multiple drafts of this post and providing excellent and thorough feedback. Any mistakes, however, are my own!
Endnotes
1 Is it obvious I'm a cat person?
2 This term of art is borrowed from the Oracle docs.
3 I said it was complex, alright?
4 I understand that appcompat is actually packaged as an aar, but one of the things that AGP does is unpack that aar and put the jar inside of it onto the classpath.
5 We will discuss why it's on more than one classpath in a future post.