Ergonomic Trees in Go

sean9999

Code Monk

Posted on February 25, 2024

Ergonomic Trees in Go

While working on a recursive file-watcher I came upon an elegant way to express trees in Go. This approach works well in cases where each sibling node must have a unique value and that value cannot be the zero value (as in a filesystem). In practice, this is most trees.

A tree is a data structure that contains data in a hierarchy whereby one parent has as many children as it wants, and those children can have children, and so on.

Let us imagine a folder named "Primates", containing sub-folders, which themselves contain more sub-folders and/or text files:

.
└── Primates
    └── Apes
        └── Great Apes
            ├── African Apes
            │   └── Gorillas
            │       ├── Eastern Gorilla.txt
            │       └── Western Gorilla.txt
            ├── Chimpanzees
            │   ├── Bonobo
            │   └── Chimpanzee
            ├── Homo Sapien.txt
            └── Orangutans
                ├── Bornean Orangutan.txt
                └── Sumatran Orangutan.txt
Enter fullscreen mode Exit fullscreen mode

In terms of behaviour, strictly speaking, a tree needs only two methods (although more are often provided for convenience and optimisation).

type Node[T comparable] interface {
    Data() T
    Children []Node[T]
}
Enter fullscreen mode Exit fullscreen mode

Where T is the data we care about. In the example above, T is string, and that string represents the folder or file name. So we might begin to build our tree of life like so:

lifeTree := createTree[string]()
primates := lifeTree.AddChild("Primates")
apes := primates.AddChild("Apes")
// etc...
Enter fullscreen mode Exit fullscreen mode

But how do we power this data structure, in terms of a concrete type? The canonical way is to use a struct.

type node[T comparable] struct {
    data T
    children []*node[T]
}
Enter fullscreen mode Exit fullscreen mode

This is perfectly fine. But there is a more ergonomic way, made possible because of two important characteristics of maps in Go:

  1. a map can be defined recursively
  2. the zero-value for a map is nil
// recusively defined node
type node[T comparable] map[T]node[T]
Enter fullscreen mode Exit fullscreen mode

Here, we're defining a map type whose keys are the values we care about, and whose values are... other maps whose keys are values we care about.

Remember that Node is the only object we need to define, because a tree can be said to be it's root node.

Keys of maps need not be simple strings or ints, but they need to satisfy the comparable type constraint.

Leaf nodes (in our example, text files) are nil maps. They terminate recursion. A recusive data structure is not useful without a base case. We do not wish to swallow the universe.

So how might we implement the Node[T] interface using the node[T] concrete type? Let's look at a basic interface, and then dive in.

type Node[T comparable] interface {
    Data() T
    Children map[T]Node[T]
    Set(T)
    Get(T) Node[T]
    AddChild(T) Node[T]
    RemoveChild(T)
}
Enter fullscreen mode Exit fullscreen mode

Children()

This is the simplest. A node is precisely defined as it's children, so:

func (n *node[T]) Children() map[T]Node[T] {
    return n
}
Enter fullscreen mode Exit fullscreen mode

Set(T)

This will be used to set leaf nodes. To set non-leaf nodes (branches) we'll be using AddChild()

func (n *node[T]) Set(key T) {
    n[key] = nil
}
Enter fullscreen mode Exit fullscreen mode

Remember, key is meaningful here. it's not a lookup key. It's the data we care about.

AddChild()

Here's where we get to define our hierachy and enable actual tree behaviour

func (n *node[T]) AddChild(key T) Node[T] {
    branch := new(node[T])
    n[key] = branch
}
Enter fullscreen mode Exit fullscreen mode

Data()

This is a little bit trickier. Since a Node is entirely defined as the map that represents it's children, where does it's Data(), the data we care about, come from?

The answer is that the Data() of a node is the key of parent that contains that node. This makes perfect sense when you think about a filesystem. A folder name only makes sense in the context of where it sits in the hierarchy. The uniqueness constraint of it's name comes from it's parent. To change a folder name is to change nothing about the folder itself and it's behaviour, but it does change how it's parent addresses it.

Here is where you begin to see that it would be useful (though strictly speaking not necessary) for a node to have a link back to it's parent. To use our folder example, we would like to have our special ".." file in our directory listing. It makes traversing the filesystem way easier, and certain operations way more efficient.

But how can this be done using a map? According to the Go Proverbs, we should Make the Zero Value Useful. Remember that we do not allow our users to use the zero-value (for example, a file or folder name cannot be blank). We will use that special value for ourselves.

So then, a constructor function might now look like this:

func New[T comparable](parent Node[T]) Node[T] {
    var zeroK K // auto initialized to zero value
    branch := new(node[T])

    // tell the child where the parent is
    branch[zeroK] = parent

    return branch

}

// the root node has no parent
myExampleTree := New[string](nil)
myBranch := myExampleTree.AddChild("some branch")
Enter fullscreen mode Exit fullscreen mode

... requiring us to augment certain methods:

func (n *node[T]) AddChild(key T) Node[T]) {

   // tell the parent where the child is
   n[key] = New[T](n)

}
Enter fullscreen mode Exit fullscreen mode

In my view, this is an elegant approach to building trees satisfying the two aformentioned constraints (must be comparable, can't be zero). I wrote a more full-featured implementation of this that includes the following method set for common efficient tree operations.

type Node[K comparable] interface {
    Data() K
    Children() map[K]Node[K]
    Spawn(K) Node[K]
    RemoveChild(K)
    Parent() Node[K]
    Walk() [][]K
    Ancestry() []K
    IsTerminal() bool
    Set(K)
    Get(K) (Node[K], bool)
    SetParent(Node[K])
    fmt.Stringer
}
Enter fullscreen mode Exit fullscreen mode

Edward Hitchcock Paleontological Chart

Use it: https://pkg.go.dev/github.com/sean9999/go-ergonomic-tree#section-readme

Fork it: https://github.com/sean9999/go-ergonomic-tree/tree/v0.0.1

💖 💪 🙅 🚩
sean9999
Code Monk

Posted on February 25, 2024

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related

Ergonomic Trees in Go
ergonomic Ergonomic Trees in Go

February 25, 2024