What's That Typeclass: Foldable

serokell

Serokell

Posted on December 2, 2021

What's That Typeclass: Foldable

There is a function called reduce in programming languages, which can be applied to some containers. It goes element by element, applying a given binary function and keeping the current result as an accumulator, finally returning the value of the accumulator as a summary value of the container.

For example, in Python:

> reduce(lambda acc, elem: acc + elem, [1, 2, 3, 4, 5]) # sum of elements in a list
15

Enter fullscreen mode Exit fullscreen mode

Haskell, in turn, has two fundamental functions representing reducing, or, as we call it, folding – foldl and foldr – that differ in the order of the folding. foldl reduces elements of a container from left to right (as reduce in other languages usually does), while foldr reduces from right to left.

These two functions are core methods of the Foldable type class, about which we are going to learn in this article.

picture with text foldable and "typeclass?"

foldr and foldl for a list

First, let’s figure out what foldr and foldl are for a list.

Consider the following type signatures:

foldr :: (a -> b -> b) -> b -> [a] -> b
foldl :: (b -> a -> b) -> b -> [a] -> b

Enter fullscreen mode Exit fullscreen mode

foldr

The foldr signature corresponds to the function folding the list from right to left. It takes a function f of type a -> b -> b as the first argument, an initial (or base) value z of type b as the second argument, and a list xs = [x₁, x₂, ..., xₙ] of type [a] as the third one.

Let’s trace it step-by-step.

  1. Introduce an accumulator acc, which is equal to z at start.
  2. f is applied to the last element of the list (xₙ) and a base element z, the result of which is written in the accumulator – xₙfz.
  3. f is applied to the second-to-last element of the list and the current value of the accumulator – xₙ₋₁f(xₙfz).
  4. After going through all the elements of the list, foldr returns the accumulator as a summary value, which finally equals to x₁f(x₂f... (xₙ₋₁f(xₙfz)) ... ).
acc = z
acc = xₙ `f` z
acc = xₙ₋₁ `f` (xₙ `f` z)
...
acc = x₁ `f` (x₂ `f` ( ... (xₙ₋₁ `f` (xₙ `f` z)) ... )

Enter fullscreen mode Exit fullscreen mode

This process could be represented as a picture:

foldr visualisation

foldl

Unlike foldr, foldl performs left-to-right folding of a list. Therefore, f is applied to z and x₁ first – acc = zfx1 – then f is applied to the current value of the accumulator and x₂acc = (zfx₁)fx₂ – and so on. Finally, foldl returns acc = ( ... ((zfx₁)fx₂)f... xₙ₋₁)fxₙ.

acc = z
acc = z `f` x₁
acc = (z `f` x₁) `f` x₂
...
acc = ( ... (z `f` x₁) `f` x₂) `f` ...) `f` xₙ₋₁) `f` xₙ

Enter fullscreen mode Exit fullscreen mode

The corresponding illustration of these operations:

foldl visualisation

Haskell definition

The recursive Haskell definition of foldr is the following:

instance Foldable [] where
  foldr :: (a -> b -> b) -> b -> [a] -> b
  foldr _ z [] = z
  foldr f z (x:xs) = x `f` foldr f z xs

Enter fullscreen mode Exit fullscreen mode

foldl is similar; implementating it is one of the exercises below.

Simple examples

Let’s consider a couple of simple examples of applying foldr and foldl to a list:

-- addition
ghci> foldr (+) 0 [1, 2, 3] -- 1 + (2 + (3 + 0))
6
ghci> foldl (+) 0 [1, 2, 3] -- ((0 + 1) + 2) + 3
6

-- exponentiation
ghci> foldr (^) 2 [2, 3] -- 2 ^ (3 ^ 2)
512
ghci> foldl (^) 2 [2, 3] -- (2 ^ 2) ^ 3
64

Enter fullscreen mode Exit fullscreen mode

As you can see, it doesn’t matter to the addition if the folding is right-to-left or left-to-right since its operator is associative. However, when folding with an exponentiation operator, the order is significant.

Generalization of foldr and foldl

Fortunately, it’s possible to fold not only lists! We can implement the instance of Foldable whenever a data type has one type argument, in other words, when its kind is * -> *.

To figure out the kind of Haskell data type, you can write :kind (or :k) in GHCi. To display the type of a Haskell expression, use :type (or :t).

For instance, a list has one type argument – the type of the list’s elements:

ghci> :kind []
[] :: * -> *

ghci> :type [1 :: Int, 2, 3]
[1, 2, 3] :: [Int]
ghci> :type ['a', 'b']
['a', 'b'] :: [Char]

Enter fullscreen mode Exit fullscreen mode

Maybe a data type also has one type argument – the type of presented value:

ghci> :kind Maybe
Maybe :: * -> *

ghci> :type Just (2 :: Int)
Just 2 :: Maybe Int
ghci> :type Just "abc"
Just "abc" :: Maybe [Char]
ghci> :type Nothing
Nothing :: Maybe a -- can't decide what type `a` is

Enter fullscreen mode Exit fullscreen mode

Int and String have no type arguments, and you can’t fold them:

ghci> :kind Int
Int :: *
ghci> :type (1 :: Int)
1 :: Int

ghci> :kind String
String :: *
ghci> :type ""
"" :: [Char]
ghci> :type "ab"
"ab" :: [Char]

Enter fullscreen mode Exit fullscreen mode

Either a b data type has two type arguments corresponding to Left and Right values, so there is also no reasonable way to fold an Either a b. But we could define instance Foldable (Either a) since Either a has an appropriate * -> * kind. The same for the pair data type – instance Foldable ((,) a) – is possible. However, such instances are not intuitive since they operate on only one of type arguments.

ghci> :kind Either
Either :: * -> * -> *

ghci> :type Left (1 :: Int)
Left 1 :: Either Int b -- can't decide what type argument `b` is
ghci> :type Right 'a'
Right 'a' :: Either a Char -- can't decide what type argument `a` is

ghci> :kind Either Int
Either Int :: * -> *
ghci> :kind (,) Char
(,) Char :: * -> *

Enter fullscreen mode Exit fullscreen mode

For those data types which could have a Foldable instance, generalized versions of foldr and foldl have the following type signatures:

foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
foldl :: Foldable t => (b -> a -> b) -> b -> t a -> b

Enter fullscreen mode Exit fullscreen mode

Some instances as an example

Maybe a

Let’s take a look at other instances of Foldable. One easy-to-understand example is Maybe a’s instance. Folding Nothing returns a base element, and reducing Just x with foldr f z (or foldl f z) is applying f to x and z:

instance Foldable Maybe where
  foldr :: (a -> b -> b) -> b -> Maybe a -> b
  foldr _ z Nothing = z
  foldr f z (Just x) = f x z

Enter fullscreen mode Exit fullscreen mode

Usage examples:

ghci> foldr (+) 1 (Just 2)
3
ghci> foldr (+) 1 Nothing
1

Enter fullscreen mode Exit fullscreen mode

BinarySearchTree a

A more interesting and useable Foldable instance could be defined for a BinarySearchTree a data type.

Consider a binary search tree, each node of which either:

  • stores a value of type a and has left and right subtrees;
  • is a leaf without a value.

Moreover, for each non-leaf node n:

  • all values in its left subtree should be less or equal than the n’s value;
  • and all values in its right subtree should be greater than the n’s value.

Binary search tree for the integer numbers

This structure matches the following Haskell definition:

data BinarySearchTree a
  = Branch (BinarySearchTree a) a (BinarySearchTree a)
  | Leaf

Enter fullscreen mode Exit fullscreen mode

Imagine that we want to reduce the whole tree to just one value. We could sum values of nodes, multiply them, or perform any other binary operation. So, it’s reasonable to define a folding function that matches your goals. We suggest you try to implement a Foldable instance on your own; it’s one of the exercises below. Take into account that to define foldr, we need to go through the tree’s elements from right to left – from right subtree to left subtree through the non-leaf node connecting them.

What do you need to define a Foldable instance?

To answer this question, we need to look into another method of Foldable, foldMap.

foldMap

foldMap has the following declaration:

foldMap :: (Monoid m, Foldable t) => (a -> m) -> t a -> m

Enter fullscreen mode Exit fullscreen mode

It’s important to note that foldMap has a Monoid constraint. Its first argument is a function that maps each element of a container into a monoid, the second argument is a container itself. After mapping elements, the results are combined using (<>) operator. The order of folding is from right to left, so foldMap could be implemented via foldr.

Let’s look at how exactly that could be done:

foldMap :: (Monoid m, Foldable t) => (a -> m) -> t a -> m
foldMap f = foldr (\x acc -> f x <> acc) mempty

Enter fullscreen mode Exit fullscreen mode

foldMap doesn’t have a base element since only the elements of the container are reduced. However, foldr does, so it perfectly makes sense to use the identity of monoid – mempty. We can also see that the folding function f is composed with (<>), thus, the current result is appended to a monoid accumulator on each step.

Recall how list elements can be summed with foldr:

ghci> foldr (+) 0 [1, 2, 3]
6

Enter fullscreen mode Exit fullscreen mode

To do the same in terms of monoid, consider a monoid under addition – Sum. We can use foldMap and Sum to perform the addition of list elements:

-- import `Data.Monoid` to get `Sum`
ghci> import Data.Monoid

ghci> foldMap Sum [1, 2, 3]
Sum {getSum = 6}

Enter fullscreen mode Exit fullscreen mode

Note that here constructor Sum is a function that turns a list element into monoid. As expected, we get the same result wrapped in Sum that can easily be accessed via getSum.

To conclude, foldMap and foldr are interchangeable, the difference is that the former receives the combiner and the base element from the Monoid instance, and the latter accepts them explicitly.

Minimal complete definition

We have already shown how foldMap is implemented using foldr. It’s not obvious, but foldr could also be implemented via foldMap! It’s off-topic for this article, therefore, please proceed to the documentation for the details.

Consequently, to create the Foldable instance, you can provide a definition of either foldr or foldMap, which is exactly the minimal complete definition – foldMap | foldr. Note that you shouldn’t implement foldMap in terms of foldr and simultaneously foldr in terms of foldMap since this will just loop forever. Thus, you implement one of them, and Haskell provides the definitions of all Foldable’s methods automatically.

Strict foldl'

Haskell uses lazy evaluation by default, and it can have a positive effect on performance in a lot of cases. But laziness might also impact it negatively sometimes, and folding is exactly this case.

Imagine that you want to sum up the elements of a really huge list. When using foldl or foldr, the value of the accumulator isn’t evaluated on each step, so a thunk is accumulated. Considering foldl, in the first step the thunk is just zfx₁, in the second – (zfx₁)fx₂, that’s not a big deal. But in the final step, the accumulator stores ( ... (zfx₁)fx₂)f...)fxₙ₋₁)fxₙ. If the list’s length is above ~10^8, the thunk becomes too large to store it in a memory, and we get ** Exception: stack overflow. However, that’s not the reason to drop Haskell since we have foldl'!

foldl' enforces weak head normal form in each step, thus preventing a thunk from being accumulated. So foldl' is often a desirable way to reduce a container, especially when it’s large, and you want a final strict result.

Here are some “benchmarks” that may vary depending on a computer’s characteristics. But they are illustrative enough to grasp the difference between the performance of a strict and a lazy fold.

-- set GHCi option to show time and memory consuming
ghci> :set +s

-- lazy `foldl` and `foldr`
ghci> foldl (+) 0 [1..10^7]
50000005000000
(1.97 secs, 1,612,359,432 bytes)

ghci> foldr (+) 0 [1..10^7] 
50000005000000
(1.50 secs, 1,615,360,800 bytes)

-- lazy `foldl` and `foldr`
ghci> foldl (+) 0 [1..10^8]
*** Exception: stack overflow

ghci> foldr (+) 0 [1..10^8]
*** Exception: stack overflow

-- strict `foldl'`
ghci> foldl' (+) 0 [1..10^8]
5000000050000000
(1.43 secs, 8,800,060,520 bytes)

ghci> foldl' (+) 0 [1..10^9]
500000000500000000
(15.28 secs, 88,000,061,896 bytes)

ghci> foldl' (+) 0 [1..10^10]
50000000005000000000
(263.51 secs, 1,139,609,364,240 bytes)

Enter fullscreen mode Exit fullscreen mode

Other methods of Foldable

foldl, foldr, and foldMap could be considered the core for understanding the nature of Foldable. Nonetheless, there are others which are worth mentioning. You get these automatically by defining foldr or foldMap.

  • length, which you might know, is implemented via foldl'.
  • maximum and minimum use strict foldMap' in their definitions.
  • null, which checks if a container is empty, is also implemented by reducing the latter with foldr.
  • You may wonder, where is the good old for loop? Fortunately, Haskell also provides it, which would have been impossible without Foldable.

All in all, even if you use foldr and foldl themselves very seldom, you will find other primitives of Foldable quite useful. You might proceed to the documentation to get to know them better.

Exercises

The theoretical part of our article is over. We hope that you know what Foldable is now! We suggest you to try out your new knowledge with our mini exercises.

  1. Which definition is correct?

  2. Implement foldl recursively.

  3. Implement foldr for the BinarySearchTree a.

  4. Implement a reverse function for lists via foldl.

  5. Implement a prefixes function for lists via foldr.

  6. Implement the sum of the squares via both foldr and foldMap.

Thank you for reading! If you would like to read more Haskell articles like these, be sure to follow us on Twitter or Dev. You can also subscribe to our newsletter below to receive new Haskell articles straight in your email inbox.

💖 💪 🙅 🚩
serokell
Serokell

Posted on December 2, 2021

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related