Sitting on my drive is an ever-expanding collection of papers related to functional programming and related topics. Some of these papers are considered classics in the field, having introduced or popularized techniques which are now staples of FP. Others provide rich perspectives on familiar topics, revealing deep mathematical principles embedded in common programming patterns.

Partly as motivation to actually read through these papers, and partly because I enjoy breaking down and translating such material for a wider community (e.g. in JavaScript), I am beginning a new article series, tentatively titled "The FP Papers" (imaginative, I know). This is the first entry in that series.

A Novel Representation of Lists and its Application to the Function "Reverse"

Authors	Year	DOI	Pages
R. John Muir Hughes	1986	https://doi.org/10.1016/0020-0190(86)90059-1	4

A representation of lists as first-class functions is proposed. Lists represented in this way can be appended together in constant time, and can be converted back into ordinary lists in time proportional to their length. Programs which construct lists using append can often be improved by using this representation. For example, naive reverse can be made to run in linear time, and the conventional ‘fast reverse’ can then be derived easily. Examples are given in KRC (Turner, 1982), the notation being explained as it is introduced. The method can be compared to Sleep and Holmström's proposal (1982) to achieve a similar effect by a change to the interpreter.

John Hughes is an author and co-author of several of my favorite FP papers, including "Why Functional Programming Matters" and "QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs". In this paper, he introduces a rather clever data structure – an immutable list with a constant time (O(1)) append operation. These lists are frequently referred to as difference lists (DLists) or sometimes Hughes lists.

Mutation?

Even in a mutable setting, this would be a challenge. Consider a singly-linked list class with handles to the first and last nodes; append could conceivably done in constant time by "re-wiring" the first list's last node to point to the second list's first node:

List A:                       List B:
first         last            first         last
<1> -> <2> -> <3>             <a> -> <b> -> <c>
               |               ^
               —————append—————|

List A's last pointer would be updated accordingly (to point to <c>). Yet the above implementation has a glaring flaw; it uses structural sharing in a mutable setting. If anything modifies List B, the changes will now also show up in List A!

List A:              (List B)
<1> -> <2> -> <3> -> <a> -> <b> -> <c>
                             |
                           change (List B)
                             |
List A:                      V
<1> -> <2> -> <3> -> <a> -> <x> -> <c>
             (oops — List A now modified too)

In some cases this may be intended, but it is very easy to accidentally begin changing lists. Even if we never change node values, the act of appending itself is a mutation, so for instance if someone appended a List C to List B, they would accidentally be affecting List A also. Oops!

To get around this risk, a mutable linked list will typically copy nodes to be appended:

List A:                       List B:
first         last            first         last
<1> -> <2> -> <3>             <a> -> <b> -> <c>
               |               |      |      |
               —————append————copy   copy   copy
                               |      |      | 
                               V      V      V
                              <a> -> <b> -> <c>

The operation is now "safe" as modifying the original List B will not affect A (and vice-versa), but it is linear (O(n) time) with respect to List B's length.

Purity?

Perhaps there exists a reasonable implementation for constant-time-appendable mutable lists, but that isn't the point of Hughes's paper. In purely functional code, there aren't any (direct) mutations. The classical functional list structure (aka a Cons list) is a singly-linked list implemented as nested pairs; the first cell of each pair is a value, the second cell is a pointer to the next pair (or to the empty list). In Scheme, Haskell, and JS respectively:

; Scheme
(cons 1 (cons 2 (cons 3 nil)))
; (list 1 2 3)

-- Haskell
1 : 2 : 3 : []
-- [1, 2, 3]

// JavaScript
const cons = (val, next) => ({ val, next })
cons(1, cons(2, cons(3, null)))
// { val: 1, next: { val: 2, next: { val: 3, next: null }}}

The restriction of no mutation means that to append two such lists (producing a new List C) is O(n) where n is the length of the first list.

List A:                       List B:
first         last            first         last
<1> -> <2> -> <3>       ———>  <a> -> <b> -> <c>
 |      |      |        |
copy   copy   copy   append
 |      |      |        | 
 V      V      V        |
<1> -> <2> -> <3> ——————|
List C

We must copy <3> to change its next pointer, and we must in turn copy <2> to change its next to point to the new <3>, and so on. Yet we can safely share List B's nodes because nothing will ever actually change them.

Overall, this is quite similar to our first attempt at appending mutable lists, but reversed – we have to copy the left list rather than the right.

The Paper

Representations

With the context established, we can start actually reading this paper. Hughes begins by discussing a general notion of abstract data types, represented by concrete data types. Perhaps confusingly, Hughes's choice of "abstract" and "concrete/representational" will refer to familiar Cons lists and the titular novel list representation respectively.

Whenever an abstract data type A is represented by a concrete data type R, two functions must be defined:

abs: R -> A

rep: A -> R

In this case, the abs function will convert a Hughes list (aka DList) to a Cons list; the rep function will convert a Cons list to a DList. The abs function must be a left inverse of rep; that is, when round tripping a Cons list to DList (via rep) and back ( via abs), then you must end up where you started. In this sense, abs "undoes" rep.

Now, suppose we want to have an append function for Cons lists. Hughes makes the point that if we can define an append function for DLists, we get the Cons version for free – we can convert Cons lists to DLists, append them, and then convert the resulting DList back to a Cons List.

             ————— ConsListAppend ————————————————————————————————
             |                                                   |
ConsListA —> | —rep—> DListA —> ———————————————                  |
             |                  | DListAppend | —> DListC —abs—> | —> ConsListC
ConsListB —> | —rep—> DListB —> ———————————————                  |
             |                                                   |
             —————————————————————————————————————————————————————

Hughes frames this relationship in terms of an interesting general law:

[given f: A -> A and g: R -> R, g "implements" f precisely if:]

FORALL r: R.f(abs r) = abs(g r)

This law is stated in terms of partially applied functions, a topic slightly out of scope for this article. Thankfully, partial application isn't strictly necessary to understand DLists per se; the main takeaway for this law is that a "correct" DListAppend would give us ConsListAppend as previously illustrated.

The DList, Defined

Hughes now drops, with almost no ceremony, the actual "novel representation of lists" promised.

…the representation function, rep, is just append.

rep L = append L

(Here we are making use of the KRC convention that all functions are curried.)

Careful! There are some subtleties packed into this short snippet. Hughes uses append L to mean a function that prepends L (to an implicit list L2). This might clarify why that is so:

append A B -- a list     (A ++ B / append B to A / prepend A to B)
append A   -- a function (A ++ ? / append ? to A / prepend A to ?)

In JavaScript, we might write this rep function as follows:

const rep = consList1 => (consList2 => append(consList1, consList2))
//              ^        \                                         /
//              |         —————————————————————————————————————————
//              |                            |
//         input list L                output DList

This underscores a very important point – a DList is a function. If we called rep(cons(1, cons(2, cons(3, null))) we'd get back a DList representing the Cons list [1, 2, 3]; yet the typeof that DList value would be 'function'. How can this be? Well, as Hughes points out, we can easily convert the DList back to a Cons list – by passing in the empty list null. Try it below:

Or in Haskell:

-- (:) is Cons, ([]) is empty list "Nil"
-- [1, 2, 3] is a Cons list (1 : 2 : 3 : [])
-- (++) is append

type DList a = [a] -> [a]

repr :: [a] -> DList a
repr cl = (cl ++)

abst :: DList a -> [a]
abst dl = dl []

cl1 = [1, 2, 3]

dl1 = repr cl1
cl2 = abst dl1

main :: IO ()
main = print (cl1 == cl2) -- True

The "trick" is that a DList function captures the passed-in Cons list used to construct it via closure. So a DList "contains" a Cons list, which can be recovered by applying the DList.

Constant Time Append

Once again, Hughes introduces the killer feature of DLists with little fanfare:

The new representation is interesting because two representations can be appended together by composing them as functions, so we can define

appendR f g = f · g

(Function composition is written using a dot in KRC.)

Function composition is the act of creating a new function which connects the output of function g to the input of function f. In JS:

const compose = (f, g) => (x => f(g(x))
//              \    /    \           /
//               ————      ———————————
//                |             |
//           input funcs     the composition (a fn)

Notice that function composition returns a new function but it does NOT invoke either f or g to do so. Rather, the resulting composition when applied calls g and then f in that order. So regardless of how expensive f or g are, the act of composing them is free – that is, constant time. Actually using the composition will still be expensive, however.

//                   O(1)   O(f) + O(g)
//                   ————   ———————————
//                     |     |
const compose = (f, g) => (x => f(g(x)))

How does this apply to DLists? Since DLists are functions, they can be composed together in constant time… and the result is a new DList:

-- in Haskell:
c12 = [1, 2]
c34 = [3, 4]

d12 = repr c12
d34 = repr c34

d1234 = d12 . d34

main = print (abst d1234) -- [1, 2, 3, 4]

How does this work? Hughes lays out a short and clear logical proof in his paper, but we can informally note that since a DList prepends its internal list, composing two DLists is a function which first prepends the right list, then prepends the left list, to whatever list is passed in:

DListA:                   DListB:
prepend (<a> -> <b>)      prepend (<c> -> <d>)

DListC = (DListA · DListB):
prepend (<c> -> <d>) then prepend (<a> -> <b>)

abs DListC (i.e., apply DListC to Nil):
prepend (<c> -> <d>) to Nil
then prepend (<a> -> <b>) to the above
result is (<a> -> <b> -> <c> -> <d>)

The careful reader will probably have some concerns at this point. As Hughes notes:

Function composition is an efficient operation. It can always be performed in constant time, so it remains to be shown that abs is reasonably efficient.

Provided that a 'functional list' is built using only rep and appendR, abs can convert it into an ordinary list in time proportional to its length.

This seems like a cheat. Sure, DLists can be composed in O(1) time, but we still need O(n) time to convert a Cons list to a DList or vice-versa! What's the point?

The answer is that functions are lazy. The code "inside" of a DList – the body of the function, containing the append operation – does not execute just because a DList is constructed. And we can build a DList without starting from a Cons list!

-- Haskell
d12 = \c2 -> [1, 2] ++ c2
d34 = \c2 -> [3, 4] ++ c2

d1234 = d12 . d34

In the above example, no actual Cons list was ever constructed. Two functions that could produce Cons lists were defined, but neither function body actually ran, so the runtime cost remained hypothetical. Yet we defined a new DList which represents the concatenation of the two hypothetical Cons lists, and all of this in constant time.

Converting this list back down to a Cons list does take linear time, of course.

// JS: O(n)
d1234(null) // cons(1, cons(2, cons(3, cons(4, null))))

-- HS: O(n)
d1234 [] -- [1, 2, 3, 4]

When abs applies such a function to the empty list to convert it back into an ordinary list, it computes

append L1 (append L2 … (append Ln []) …).

The cost of computing (append A B) is independent of B—it is just the length of A. Therefore, the cost of computing the expression above is the sum of the lengths of L1,L2,…,Ln, that is, the length of the final list.

And that is one of the primary tradeoffs of the DList: if you expect to perform many append operations, and only need the final result at the very end, a DList can prevent duplicate work; often this will reduce an overall O(n^2) algorithm to O(n).

Reverse

"A Novel Representation of Lists" is just the first half of this paper's title; the second half is "and its Application to the Function 'Reverse'". Hughes uses a naive singly-linked list reverse function to illustrate an O(n^2) algorithm which can be improved via DLists.

// JS
const reverse = cl => {
    if (!cl) return cl
    return append(reverse(cl.next), cons(cl.val, null))
}

-- Haskell
reverse :: [a] -> [a]
reverse [] = []
reverse (x:xs) = reverse xs ++ [x]

Consider the the following pseudocode expansion of the reverse call on a cons list [1, 2, 3, 4]:

reverse [1, 2, 3, 4]

// recursive applications of reverse
append(reverse [2, 3, 4], [1])
append(append(reverse [3, 4], [2]), [1])
append(append(append(reverse [4], [3]), [2]), [1])

// sequential applications of append
append(append(append([4], [3]), [2]), [1])
  - append(append([_, 3], [2]), [1])
  - append(append([4, 3], [2]), [1])
append(append([4, 3], [2]), [1])
  - append([_, _, 2], [1])
  - append([_, 3, 2], [1])
  - append([4, 3, 2], [1])
append([4, 3, 2], [1])
  - [_, _, _, 1]
  - [_, _, 2, 1]
  - [_, 3, 2, 1]
  - [4, 3, 2, 1]

Because a Cons list append is proportional to the left list length, and as we keep building up a bigger and bigger list, we end up performing the same work. append takes an average of O(n/2) time, and we do it for n elements, resulting in O(n^2) time overall.

Using DLists, we switch to an O(1) append:

-- Haskell

reverseC :: [a] -> [a]
reverseC c = abst (rev c)

rev :: [a] -> DList a
rev [] = id
rev (x:xs) = (rev xs) . (x :)

main = print (reverseC [1, 2, 3, 4]) -- [4, 3, 2, 1]

Conclusion

It should be noted that this is not the only way to define an efficient reverse function; this problem is used mostly as a small, familiar example to illustrate how DLists work. The DList solution generalizes a hand-coded tail-recursive reverse:

const reverse = initList => {
    const go = (oldList, newList) => {
        if (!oldList) return newList
        const newerList = cons(oldList.val, newList)
        return go(oldList.next, newerList)
    }
    return go(initList, null)
}

Or using foldl in Haskell:

reverse :: [a] -> [a]
reverse = foldl (flip (:)) []

Hughes also goes on to demonstrate another use case for DLists, breaking up a line of text (in the form of a list of characters) into words (separated by a space).

The paper finishes with some notes regarding performance – namely, that a hand-coded efficient reverse was faster than using DLists, but of course DLists were significantly faster than the naive approach. At the time of publication (1986), DLists were found to have a practical performance improvement for certain applications.

What about DLists today? As the popular Haskell package dlist notes:

Difference lists are a list-like type supporting O(1) append. This is particularly useful for efficient logging and pretty printing (e.g. with the Writer monad), where list append quickly becomes too expensive.

That package hides the clever yet tiny implementation of DLists behind an easy-to-use API; similar work could be done easily enough in JavaScript.

The dlist README has links to additional interesting resources on the subject. Ultimately I think that the DList implementation is worth examining because it show how representing data lazily, as a function, can unlock new and surprising capabilities. This shows up in other functional representations, such as Church encodings.

Hughes's paper is a small 4-page gem, and this article is… not. Hopefully however it makes the idea more accessible to a typical JavaScript programmer. If so, I may add more entries to this series in the future.

Blog

The FP Papers: Hughes Lists

Gabriel Lebec