Generalizing Tree Traversal in Haskell with Typeclass
Yufan Lou
Posted on June 9, 2020
Tree is the most pervasive data structure in our lives. First, we see trees, green literal trees, everywhere. Second, almost any navigation menu is a tree. Finally, our code always becomes an abstract syntax tree before it goes on to become executable.
Haskell is a very interesting language. The creators of Haskell intentionally chooses to limit it to what mathematics can do, and surprisingly reveals how much mathematics can do. This post introduces what mathematics is best at: generalization. You think interface is generalization? Nah, this is generalization.
To start, hopefully you are familiar with the tree data structure:
data Tree =
Leaf Int
| Node Int Tree Tree
Let's construct some simple trees as a warm up:
ex1 = Leaf 1
ex2 = Node 2 (Leaf 1) (Node 3 (Leaf 0) (Leaf 5))
When we traverse a list, we can go from front to back or back to front. For trees, there are three orders to traverse it: pre-order, in-order, and post-order. Each would let us visit the nodes in a different order, which we can represent with a list of the nodes.
But first let's construct the tree we'd like to play with! I'll borrow the example from this StackOverflow answer.
ex3 = Node 7 (Node 1 (Leaf 0) (Node 3 (Leaf 2) (Node 5 (Leaf 4) (Leaf 6)))) (Node 9 (Leaf 8) (Leaf 10))
Pre-order is so named because the root node is visited before both of the subtrees, thus the "pre-" prefix. Have a look at its code:
preorder :: Tree -> [Int]
preorder (Leaf a) = [a]
preorder (Node a l r) = [a] ++ preorder l ++ preorder r
The Leaf a
case is the natural base case. In the Node a l r
case, a
is the subroot at the relative top, l
is the left subtree, while r
is the right subtree. Top to bottom, so a
is the first. Left to right, so l
is before r
.
>>> preorder ex3
[7,1,0,3,2,5,4,6,9,8,10]
Now that's all good, and we have a list of nodes in the traversal order. We can take this list and process it further, such as summing it up:
sumOfTree :: Tree -> Int
sumOfTree = sum . preorder
>>> sumOfTree ex3
55
or calculating a cumulative sum:
cumulativeSum :: [Int] -> [Int]
cumulativeSum = scanl1 (+)
cumulativeSumOfTree :: Tree -> Int
cumulativeSumOfTree = cumulativeSum . preorder
scanl1
builds a list by applying a binary function to an existing list from left to right, taking the first element of the list as the initial value, and recording results from each step. With (+)
as the binary function, the resulting list is the cumulative sum of the input list.
>>> cumulativeSumOfTree ex3
[7,8,8,11,13,18,22,28,37,45,55]
Composition does the job, and that's good. But that intermediate list is annoying. It might even become a problem if the tree is big enough. Can we do away with it? Let's look at the code again:
preorder :: Tree -> [Int]
preorder (Leaf a) = [a]
preorder (Node a l r) = [a] ++ preorder l ++ preorder r
The operations related to list are the singleton constructor [a]
and the append operator (++)
. It turns out we can generalize this! For having a singleton constructor, list is an instance of the typeclass Applicative. For having an associative append operator, list is an instance of the typeclass Semigroup. The singleton constructor of Applicative is pure
, and the associative operator of Semigroup is <>
. We can therefore rewrite the function to be:
preorder :: (Applicative f, Semigroup (f Int)) => Tree -> f Int
preorder (Leaf a) = pure a
preorder (Node a l r) = pure a <> preorder l <> preorder r
Does it still work?
>>> preorder ex3
error:
* Ambiguous type variable `f0' arising from a use of `print'
prevents the constraint `(Show (f0 Int))' from being solved.
Probable fix: use a type annotation to specify what `f0' should be.
...
Now that we've generalized preorder
, when we use it directly we need to specify what concrete type we want it to give back. We can do so by inline annotation.
>>> (preorder ex3 :: [Int])
[7,1,0,3,2,5,4,6,9,8,10]
What does this generalization gain us though? Let's check the list of instances of Semigroup. The following may catch your attention:
Num a => Semigroup (Product a)
Num a => Semigroup (Sum a)
Ord a => Semigroup (Max a)
Ord a => Semigroup (Min a)
Since Int
fulfills the requirement Num
and Ord
, we can use these instances. How? By changing the type annotation is one way.
>>> (preorder ex3 :: Product Int)
Product {getProduct = 0}
>>> (preorder ex3 :: Sum Int)
Sum {getSum = 55}
>>> (preorder ex3 :: Max Int)
Max {getMax = 10}
>>> (preorder ex3 :: Min Int)
Min {getMin = 0}
Another way is to use the getter of the corresponding instance.
>>> getProduct (preorder ex3)
0
>>> getSum (preorder ex3)
55
>>> getMax (preorder ex3)
10
>>> getMin (preorder ex3)
0
Cumulative sum is not in the existing list of Semigroup. Can we add cumulative sum to this as well? Pause and think. Does it have a binary associative operator?
Think a bit more.
A bit more before I tell you the answer.
......
Yes! For cumulative sum, the binary operator appends the two lists of sums, and adds the last sum of the left hand side to every sum in the right hand side. For example:
[7,1,0,3,2] <> [5,4,6,9,8,10]
[7,8,8,11,13] <> [5,9,15,24,32,42]
[7,8,8,11,13,18=5+13,22=9+13,28=15+13,37=24+13,45=32+13,55=42+13]
And it is associative. I am lazy, so this example is left to you.
Anyway, we need to write a wrapper.
newtype CumulativeSum a = CumulativeSum { getCumulativeSum :: [a] }
deriving Show
and an instance of Semigroup for it.
instance Num a => Semigroup (CumulativeSum a) where
c1 <> c2 = let
l1 = getCumulativeSum c1
l2 = getCumulativeSum c2
nn = last l1
in CumulativeSum (l1 ++ (fmap (nn +) l2))
and delegate the Functor and Applicative to list.
instance Functor CumulativeSum where
fmap f c = CumulativeSum (fmap f (getCumulativeSum c))
instance Applicative CumulativeSum where
pure a = CumulativeSum [a]
liftA2 f c1 c2 = CumulativeSum (liftA2 f (getCumulativeSum c1) (getCumulativeSum c2))
and Voila!
>>> (preorder ex3 :: CumulativeSum Int)
CumulativeSum {getCumulativeSum = [7,8,8,11,13,18,22,28,37,45,55]}
>>> getCumulativeSum (preorder ex3)
[7,8,8,11,13,18,22,28,37,45,55]
Admittedly, this is not the most elegant. We haven't made use of liftA2
of Applicative at all! We picked Applicative arbitrarily, and this is a sign that Applicative might not be the right pick here. But I am at my wit's end, so, until next time.
I know this is not the canonical
Tree
in Haskell, but rather a nonempty tree, but the result ofshow
looks better this way.To become Applicative list also has to be able to apply a list of functions to a list of parameters. Please also check hoogle for laws of Applicative and Semigroup.
To use the
(<>)
operator and the various Semigroup instances we need toimport Data.Semigroup
.Thank you to
#haskell
for advices.
Posted on June 9, 2020
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.