Destroy All Dependencies, or: why structural typing is awesome

crabmusket

Daniel Buckmaster

Posted on December 2, 2020

Destroy All Dependencies, or: why structural typing is awesome

When designing software, we strive to ensure each module, class, or function has a single responsibility. We want to be able to reuse small, focused pieces of code, and we want our tests to be clean and understandable.

But a piece of software as a whole usually has more than one responsibility! Sindre Sorhus, noted developer of small NPM modules, puts it succinctly in this comment:

You make small focused modules for reusability and to make it possible to build larger more advanced things that are easier to reason about.

Our modules should be small and focused, but the end goal is to integrate them into a larger whole. This means the modules that make up our applications must have dependencies between each other; they cannot live in blissful isolation.

This means that dependencies are a necessary evil in software development.

Because dependencies are a necessary evil, we try to keep control of them as much as possible. According to Sandi Metz in her excellent book:

Object-oriented design is about managing dependencies. It is a set of coding techniques that arrange dependencies such that objects can tolerate change ... In the absence of design, unmanaged dependencies wreak havoc because objects know too much about one another.

While Sandi is talking about object-oriented design in particular, the principle applies to all ways of designing programs. Carefully managing the dependencies of each module of code you write will lead to cleaner, less-tightly-coupled code that's easier to test and reuse.

The example in this article will use TypeScript in object-oriented style. We're going to look at a piece of code with some obvious dependencies and some subtle ones, and then we'll use a few well-known techniques to remove as many of those dependencies as we can. The concepts and tools we'll use are:

  • The single-responsibility principle
  • The dependency injection pattern
  • Duck typing
  • TypeScript's structural typing capabilities
  • Generics and the principle of parametricity

After our changes, we'll have made our module much more reusable and more robust to changes in the rest of the application it's part of.

If you want the TLDR, you can skip to the starting code and the final result

A note about static types

Part of the inspiration for writing this article was the idea that dynamically-typed languages have it easier than statically-typed languages when it comes to reusability. Because types don't have to be declared, imported, and pinned down, the code is unencumbered to do what needs to be done now, and to change in the future.

Type systems have come a long way in the last decade. New statically-typed languages have emerged and gone mainstream. Existing languages have gained new features.

In this post we'll end up writing code that feels very dynamic, even though it's statically-typed and safe. This is, I think, a real gift of TypeScript's type system, as you'll see.

A disclaimer about premature design

Before we dive into the code, I want to point out that it's possible to do too much design up-front. Premature abstraction, code that is too DRY, and imagining future requirements can all create more problems than they're worth.

But, that said, it's possible to write code that maintains more flexibility than less, even when all requirements aren't yet known. Different ways of coding the same behaviour can create fewer or more dependencies, without changing the level of DRYness or abstraction.

The advice in this post has two purposes:

  1. To introduce you to some techniques which result in fewer dependencies, which you can apply habitually when writing new code.
  2. To help you learn how to apply these techniques when changing requirements force you to break apart dependencies.

With that in mind, let's get started.

Breadth-first search with lots of dependencies

Take a look at this (incomplete) snippet of TypeScript code which implements breadth-first search on a graph:

import {Graph} from "./graph";

export function breadthFirstSearch(
  nodes: Array<string>,
  edges: Array<[string, string]>,
  startingNode: string,
  nodeToFind: string
): Array<string> {
  let graph = new Graph(nodes, edges);

  let result: Array<string> = [];
  let visited: Array<string> = [];
  let queue: Array<string> = [startingNode];

  while (queue.length) {
    let node = queue.pop();
    let nodesToVisit = graph.successors(node);

    // algorithm omitted for brevity
  }

  return result;
}
Enter fullscreen mode Exit fullscreen mode

I've omitted the meat of the BFS algorithm, but you can see the important parts, which are:

  • Creating a Graph from the nodes and edges we're given
  • Calling the graph's successor method, which returns the names of the nodes connected to the given node

The first thing we should do when looking at this code is ask ourselves: what dependencies does this code have? Have a think before reading on.

There's one obvious dependency: importing the graph.ts module at the top of the file. However, there are many implicit dependencies in this code, which take a little bit of thought to discern:

  • What data to pass to the Graph constructor (nodes and edges)
  • Where to find the Graph constructor (in graph.ts)
  • What type of nodes the graph stores (each node is a string)
  • The names of particular methods the Graph class has (in this case, just successors)
  • How to call those Graph methods (e.g. successors takes a single string and returns an array of them)

You might not be used to thinking of things like class names and method parameter lists as dependencies. But they represent knowledge about external code, just as much as a hardcoded file location does. These pieces of knowledge tie our code to that external code, and mean that if the external code changes, our code will have to change too.

Some of these are necessary dependencies, but others are incidental. Incidental dependencies can be caused by careless design, but can be remedied just as easily. Let's take the first step in reducing our code's dependency on the outside world: reducing the amount of responsibility it has.

The single-responsibility principle

The single responsibility principle, which is a staple of object-oriented programming but can be applied anywhere, encourages us to write code so that "a class should have only one reason to change."

What is the responsibility of our BFS function as it is currently written? If we were to describe it in one sentence, we might say it like this:

breadthFirstSearch creates a Graph from the given nodes and edges, and searches for a path between two nodes within it.

The presence of an and in that sentence indicates there is more than one responsibility. We need to find a way to remove the first responsibility. We'll find that in removing a responsibility, we also remove dependencies.

You might think that creating a Graph object is a mere implementation detail, of the same level of significance as creating the result array, for example. And you may certainly be right to think that! When looking at a snippet of code like this, outside the context of a wider application, it's hard to say. In every instance, you will need to think through what counts as a responsibility, what is a mere implementation detail, and where to draw the boundaries of your modules. My purpose here is to illustrate what may be, not to proscribe the optimal way to structure breadth-first search algorithms.

Let's use the dependency injection pattern to remove the responsibility for creating a graph. To do that, we'll change the code as follows:

import {Graph} from "./graph";

export function breadthFirstSearch(
  graph: Graph,
  startingNode: string,
  nodeToFind: string
): Array<string> {
  let result: Array<string> = [];
  // the rest remains unchanged
}
Enter fullscreen mode Exit fullscreen mode

We removed the nodes and edges arguments, instead accepting a graph of type Graph. Instead of creating a dependent object, the function now accepts it as an argument. By doing that, it has refused to take responsibility for creating the graph, even though it still depends on a graph existing at some point.

Let's look back at our list of dependencies and see how it's different now:

  • What data to pass to the Graph constructor
  • That graph is an instance of the Graph class
  • Where to find the Graph constructor
  • What type of nodes the graph stores (each node is a string)
  • The names of particular methods the Graph class has (in this case, just successors)
  • How to call those Graph methods (e.g. successors takes a single string)

We seem to have introduced a new dependency, in the process of trying to remove one: now our code knows that the graph parameter is an instance of a class - the Graph class in particular.

This is a much smaller dependency. We've traded a large piece of knowledge - how to use the Graph constructor - for a smaller piece: merely that the Graph constructor exists.

But we would still like to remove this dependency entirely, rather than only shrinking it, if we can. And it turns out: we can.

Duck typing with interfaces

Let's think for a moment about this new dependency we've introduced. It may be smaller than the original dependency, but it still has some troubling implications.

In a statically-typed language, declaring that graph has type Graph (where Graph is a class constructor) means that we can never pass anything into this parameter which is not an instance of the Graph class or one of its subclasses.

This might seem reasonable, but it does reduce the flexibility of our code. Object-oriented inheritance can be useful, but we shouldn't force it on users of our code without good reason to. Languages like C++ have typically used inheritance heavily to enable polymorphism, because they do not have alternatives. But in TypeScript, as in many modern object-oriented languages, we can do better.

We can apply a principle known as duck typing to remove the implicit dependency on the class hierarchy here. Instead of depending on a specific class constructor, we'll depend on an interface. The difference is that interfaces are not tied to any specific class hierarchy.

In duck typing, an object's suitability is determined by the presence of certain methods and properties, rather than the type of the object itself.

Let's create a new file called graph_interface.ts to help with this. We'll declare all the capabilities of graphs which our application needs in one location:

graph_interface.ts:

export interface GraphInterface {
  successors(node: string): Array<string>;
  predecessors(node: string): Array<string>;
  nodes(): Array<string>;
  edges(): Array<[string, string]>;
  // ... other graph methods
}
Enter fullscreen mode Exit fullscreen mode

And we'll modify our BFS module like so:

import {GraphInterface} from "./graph_interface";

export function breadthFirstSearch(
  graph: GraphInterface,
  startingNode: string,
  nodeToFind: string
): Array<string> {
  // the rest remains unchanged
}
Enter fullscreen mode Exit fullscreen mode

Now, instead of depending on the Graph class itself, we depend on the new interface. Any type can implement this interface, regardless of a class's lineage. For example, we might end up creating new graph types that look like some of these:

// This class of graph has no parent class
class SparseGraph implements GraphInterface {
  // ...
}

// This class is the child of a class from the JavaScript standard library
class AdjacencyMatrix extends Uint8Array implements GraphInterface {
  // ...
}

// This class is a child of our original Graph class
class DirectedGraph extends Graph implements GraphInterface {
  // ...
}
Enter fullscreen mode Exit fullscreen mode

We have strictly increased the reusability of our code, because any type can stick to the interface our code needs. This is an example of the duck typing pattern:

Let's do another check of our dependency list:

  • That graph is an instance of the Graph class
  • That graph conforms to the GraphInterface type
  • Where to find the GraphInterface type
  • What type of nodes the graph stores (each node is a string)
  • The names of particular methods the GraphInterface type has (in this case, just successors)
  • How to call those GraphInterface methods (e.g. successors takes a single string)

We've removed the Graph class dependency for good, and have modified the following dependencies to refer now to GraphInterface instead of Graph, but we've again introduced a new one dependency! (Will this nightmare ever end? Are we making progress, or just painting over the cracks in our shoddy design?)

As with the previous change we made, we've swapped a larger piece of knowledge for a smaller piece. The Graph class implied an inheritance hierarchy, but the GraphInterface does not. While numerically our dependencies have stayed the same, we've reduced the amount of knowledge they contain. This makes them more manageable.

But now, thanks to an important feature of TypeScript's type system, and the changes we've made up to this point, we can make a dramatic improvement.

Duck typing with structural types

Astute readers will have noticed that in the last section, I gave some examples of classes that looked like this:

class DirectedGraph extends Graph implements GraphInterface {
  // ...
}
Enter fullscreen mode Exit fullscreen mode

But in TypeScript, unlike most statically-typed languages, it's not necessary to declare implements GraphInterface. As long as a class is compatible with an interface at the point of use, TypeScript will be happy! We don't need to declare compatibility ahead of time.

This is known as structural typing. It's an extremely useful feature which we will now make use of. Structural typing contrasts to nominative typing, where the specific identity of each type is unique and significant. Read more about structural and nominative typing.

Because of structural typing, we don't have to depend on a shared interface defined elsewhere. Let's move the interface declaration right into our module, like this:

interface GraphInterface {
  successors(node: string): Array<string>;
  predecessors(node: string): Array<string>;
  nodes(): Array<string>;
  edges(): Array<[string, string]>;
  // ... other graph methods
}

export function breadthFirstSearch(
  graph: GraphInterface,
  startingNode: string,
  nodeToFind: string
): Array<string> {
  // ...

  while (...) {
    let nodesToVisit = graph.successors(node);
    // ...
  }

  // ...
}
Enter fullscreen mode Exit fullscreen mode

Now we don't import GraphInterface, but declare it where it's needed. This is an important ability in languages like TypeScript and Go.

Here, we can cross a dependency off our list and, thankfully, not introduce any new ones:

  • That graph conforms to the GraphInterface type
  • Where to find the GraphInterface type
  • What type of nodes the graph stores (each node is a string)
  • The names of particular methods the GraphInterface type has (in this case, just successors)
  • How to call those GraphInterface methods (e.g. successors takes a single string)

We no longer depend on the interface being found anywhere except right here where it's used. We could rename every other file in our application, and this module would not even notice.

This is another big step forwards, but you might be getting a niggling feeling that something is wrong when looking at the code above. Let's take a moment to address that niggle.

Increasing flexibility using the interface segregation principle

The niggle is probably coming from the fact that our newly-moved GraphInterface declares a bunch of methods that breadthFirstSearch doesn't use. For example, predecessors is never used.

This is a violation of the interface segregation principle, which suggests that "no client should be forced to depend on methods it does not use."

This phrasing (from the Wikipedia page) is not a precise fit for our code example, but the principle definitely applies. Fortunately, we can easily remedy this, and in doing so increase the flexibility of our module.

Let's revise the GraphInterface like so:

interface GraphInterface {
  successors(node: string): Array<string>;
}

export function breadthFirstSearch(
  graph: GraphInterface,
  startingNode: string,
  nodeToFind: string
): Array<string> {
  // ...

  while (...) {
    let nodesToVisit = graph.successors(node);
    // ...
  }

  // ...
}
Enter fullscreen mode Exit fullscreen mode

Now that the interface has shrunk to include precisely the functionality we need from it, it doesn't represent the entire concept of "graphness" any more, and we should probably rename it. (Luckily, this is safe to do, because the interface is not exported, so no external code could be depending on it!)

interface BreadthFirstSearchable {
  successors(node: string): Array<string>;
}

export function breadthFirstSearch(
  graph: BreadthFirstSearchable,
  // ...
}
Enter fullscreen mode Exit fullscreen mode

The interface is now acting as a declaration of the types of things that can be searched with our breadthFirstSearch function. For more exploration of this pattern, have a read of this great blog post which uses examples in Go, another structurally-typed language.

Let's take another look at our list of dependencies:

  • That graph conforms to the BreadthFirstSearchable type
  • What type of nodes the graph stores (each node is a string)
  • The names of particular methods the graph object has (in this case, just successors)
  • How to call those methods (e.g. successors takes a single string)

I have modified the final two dependencies slightly, and in doing so, have made the first dependency redundant. Because our BreadthFirstSearchable interface perfectly fits just the methods our code actually uses, the interface itself is insignificant; it's merely a convenient syntax to declare how we will use the graph object.

Take a breather

We've done extremely well so far, reducing our list of five weighty dependencies to three. Take a second to congratulate yourself!

If we stopped here, we could be proud of our work. We've successfully

  • excised a responsibility that our module shouldn't have had in the first place,
  • removed an import of an external file,
  • abolished the implicit dependency on a class hierarchy

and in doing so, have made the code clearer and more focused. A side-effect of declaring interfaces at the point of use is to have explicit documentation about what our breadth-first search module really needs in order to work.

Notice that the remaining dependencies seem much more difficult to consider removing. After all, how could we make a BFS algorithm that doesn't depend on knowing the name of the successors method? Could we take this too far?

Of the dependencies that remain, I would argue the final two are very reasonable dependencies to keep. They express the core meaning of the code. Breadth-first search depends on searching the successors of nodes in a graph. Each node may have zero, one, or many successors. These are fundamental facts about graph theory that we can't really get around. So it's unlikely we'll be able to remove these dependencies, no matter how clever we are.

However, the dependency on the string type can be dispensed of. If you want to try the Full Monty, read on to find out how.

Using generics to reduce knowledge

Our code currently knows that each node in the graph is a string. This looks like a single piece of knowledge. But, like the Graph class implies an inheritance hierarchy, the string type implies other knowledge:

  • Nodes can be compared using not just == and ===, but <, >, localeCompare, etc.
  • We can ask each node for its .length
  • We can call node.substring
  • Etcetera...

Built-in types like string usually bring a lot of knowledge with them, because they are designed to be feature-ful! Usually it's not a problem to rely on this knowledge. Built-in types change so infrequently (especially not in breaking ways) that it's very safe to leave these dependencies in your code.

However, knowledge of concrete types can sometimes reduce flexibility and reusability. An immediate example that comes to mind is that some graphs might have numeric identifiers instead of string-based identifiers.

Bearing in mind my advice at the beginning about not doing too much premature design and not anticipating requirements, let's explore the way we can use generic types to make this code more flexible if appropriate.

First, let's make an alias of the string type, for clarity:

type Node = string;

interface BreadthFirstSearchable {
  successors(node: Node): Array<Node>;
}

export function breadthFirstSearch(
  graph: BreadthFirstSearchable,
  startingNode: Node,
  nodeToFind: Node
): Array<Node> {
  let result: Array<Node> = [];
  let visited: Array<Node> = [];
  let queue: Array<Node> = [startingNode];

  while (queue.length) {
    let node = queue.pop();
    let nodesToVisit = graph.successors(node);
    // ...
  }

  return result;
}
Enter fullscreen mode Exit fullscreen mode

It's now easier to see where we must insert type parameters in order to remove the knowledge of the specific type. After removing the alias type Node and adding generic parameters, the code looks like this:

interface BreadthFirstSearchable<Node> {
  successors(node: Node): Array<Node>;
}

export function breadthFirstSearch<Node>(
  graph: BreadthFirstSearchable<Node>,
  startingNode: Node,
  nodeToFind: Node
): Array<Node> {
  // ...
}
Enter fullscreen mode Exit fullscreen mode

We've successfully crossed off a piece of knowledge, and have made our code more flexible in the process. This is what our knowledge list looks like now:

  • What type of nodes the graph stores (each node is a string)
  • The names of particular methods the graph object has (in this case, just successors)
  • How to call those methods (e.g. successors takes a single Node)

Parametricity is a cloud of unknowing

There's an esoteric concept known as parametricity which is typically talked about in functional programming. This article from Cambridge (PDF) puts it thus (with added emphasis):

Parametricity can be thought of as the dual to abstraction. Where abstraction hides details about an implementation from the outside world, parametricity hides details about the outside world from an implementation.

The use of a type parameter means our function knows less about the outside world, which is what we wanted. This has interesting implications.

Our code now can't use methods of the String class like substring, because Node could mean any type, depending on the caller of our code.

As described handily in this article using Scala for examples, this lack of knowledge limits the choices we can make when implementing code. We can no longer accidentally use node.length or any other specific methods or properties. We are prevented from peering beyond the cloud of unknowing created by the generic type parameter.

(Of course, because JavaScript has reflection, we can determine information about values with unknown types at runtime. However, typeof and instanceof are more likely to be asked about in code review than an innocuous node.length.)

But, significantly, the caller now also knows that our code is operating inside this cloud of unknowing. The caller knows that our code cannot rely on Node being any specific type. This gives the caller more freedom and flexibility.

Recapping our journey

Here's the code we've ended up with:

interface BreadthFirstSearchable<Node> {
  successors(node: Node): Array<Node>;
}

export function breadthFirstSearch<Node>(
  graph: BreadthFirstSearchable<Node>,
  startingNode: Node,
  nodeToFind: Node
): Array<Node> {
  let result: Array<Node> = [];
  let visited: Array<Node> = [];
  let queue: Array<Node> = [startingNode];

  while (queue.length) {
    let node = queue.pop();
    let nodesToVisit = graph.successors(node);
    // the rest of the algorithm
  }

  return result;
}
Enter fullscreen mode Exit fullscreen mode

And here is our list of knowledge that this code has about external code:

  • The names of particular methods the graph object has (in this case, just successors)
  • How to call those methods (e.g. successors takes a single Node)

We've come a long way, and reduced our knowledge to the very minimum. The remaining dependencies reflect the core abstractions of our domain. This module should last unchanged for a long time, serenely weathering changes in the code surrounding it, without demanding to be changed in lockstep.

While it might seem like we might have looked into the future to imagine requirements this code might be fulfilling (e.g. new graph classes, or graphs with numeric nodes), the changes we've made were low-impact and broadly applicable to many situations. We didn't add new features, or introduce new abstractions: we systematically removed knowledge from the code, and in doing so made it less dependent on the code around it, and more reusable in unexpected future contexts.

I hope that next time you're writing new code, or refactoring a tangled web of dependencies, these techniques and ways of understanding dependency will help you cut through the chaos and discover clean, single-responsibility modules.

💖 💪 🙅 🚩
crabmusket
Daniel Buckmaster

Posted on December 2, 2020

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related