lazy-seq in Clojure
Namc
Posted on November 28, 2017
Basic idea :
Evaluation of an expression is delayed until the value is needed.
There are two ways to achieve the above :
- use lambda functions at runtime
- use macros/special forms at compile time
With lazy eval techniques, it is possible to construct infinite data structures that are evaluated as consumed, using lambdas, closures and recursion. In clojure, they are generated using lazy-seq
and cons
forms.
Let’s take an example from the Clojure docs for lazy-seq
user> (def fib-seq
(lazy-cat [0 1] (map + (rest fib-seq) fib-seq)))
;; => #'user/fib-seq
lazy-cat
is another macro to concatenate lazy sequences. We shall write the above expression to an alternate translation which uses lazy-seq
user> (def fib-seq (concat (lazy-seq [0 1]) (lazy-seq (map + (rest fib-seq) fib-seq))))
;; => #'user/fib-seq
user> (doall (take 2000 fib-seq))
This lazy-seq
evaluates for very large indexes without giving an OutOfMemoryException/StackOverFlowError. (Might throw ArithmeticException for extremely large integer, though)
How this works internally :
-
lazy-seq
executes the body once the first time it is accessed. - the result is cached and used whenever the function is called in future.
Going back to the example;
[0 1]
is the base case for the problem. In Fibonacci’s sequence, the first two values are 0 and 1. The consecutive computation of each value requires summation of previous value and the value before that one. Hence, we need at least two values to start the process.
(map + (rest fib-seq) fib-seq)
Here, rest will return the part of fib-seq
without it’s head. map
will combine two sequences using +
and produce a next sequence. Let’s take the fibonacci sequence fib-seq-temp
user> (def fib-seq-temp [0 1 1 2 3])
user> (map + (rest fib-seq-temp) fib-seq-temp)
;; => (1 2 3 5)
The sequences fed to map are part of same basic sequence . The first sequence (rest [seq])
is the base seq without the head. The second sequence is the base seq, including the first value.
What is the base sequence here?
According to fibonacci number’s definition, A(n) = A(n-1) + A(n-2)
. We need to be able to access previously computed values in order to extend the sequence. So, we use the recursive reference to fib-seq
to access our previous results. The base seq is typically [0 1]
here to begin with.
Let’s follow the steps mentioned above and run through the process :
Take the first element of fib-seq
(base case [0 1] ) — 0
Take the second element of fib-seq
— 1
Take the third element of fib-seq
— Stuck, aren’t we?
Here, we use map to generate a sequence which is used as remaining values.
The sum of rest [0 1] which is 1, and first item of [0 1] , which is zero is 1 ; which is our result below.
user> (map + (rest [0 1]) [0])
;; => (1)
Similarly, let’s repeat the process for generating fourth element.
We use map
again for generating base seq.
Compute the sum of second item of rest [0 1]
, whose value is (1) as computed above; and second item of [0 1]
user> (map + (rest [0 1]) [1])
;; => (2)
Easy so far?
Now generate the fourth element.
Sum the third element of rest [0 1]
(which we computed as 1) and fourth item generated from [0 1]
(which was 2 ). And we get 3 .
This will keep building up to match the desired definition for series. To understand this better, let’s add a print statement to our original definition.
user> (def fib-seq
(lazy-cat [0 1]
(map
(fn [a b]
(println (format "%d + %d" a b)
(+ a b))
(rest fib-seq) fib-seq)))
;; => #'user/fib-seq
user> (doall (take 10 fib-seq))
1 + 0
1 + 1
2 + 1
3 + 2
5 + 3
8 + 5
13 + 8
21 + 13
;; => (0 1 1 2 3 5 8 13 21 34)
—-
Working with different data-types
-
map
,reduce
,take
etc work in terms offirst
,rest
andmore
. -
first
(as well as others) ensure if the collection implementsISeq
. - If yes, then these functions are implemented directly. Else, a
seq
view of the collection is created to implementfirst
/more
/rest
.
lazy-seq
macro
- The macro uses thunk to store the sequence-generating calculations.
- When an element/chunk of sequence is requested, next thunk is called to retrieve the values.
- The thunk creates next subroutine to represent the tail of the sequence (lose your head).
- These thunks implement sequence interface, each thunk is called just once, and then cached. So the realized portion is just a sequence of values.
Holding onto the head
Take these two methods to generate infinite lazy seqs of randoms.
user> (defn infinite[] (lazy-seq (cons (rand) (infinite))))
;; => #'user/infinite
--
user> (defn infinite-1[] (lazy-seq (cons (rand) (infinite-1))))
;; => #'user/infinite-1
user> (def zyx (infinite-1))
If you try to run (last (infinite))
; it won’t crash, but will neither terminate. However, (last zyx)
will crash quickly with an OutOfMemoryError
.
This is because of “holding onto the head”.
In (last (infinite))
, Clojure disposes the earlier members of sequence because it is able to determine that they are not of any use. Hence, memory is not lost. Whereas, in (last zyx)
, the program is holding on to the definition , and hence, it is not able to clear the memory. This causes it to throw OutOfMemoryError .
So what’s the point of lazy-seq
?
lazy-seq
is just one of many possible ways to create lazy sequences. And there are several other ways to do it in clojure.
If a sequence is not lazy, it often holds onto it’s head, which consumes a lot of heap space. If it is lazy, it is computed, then discarded as it is not used for further computations. This is particularly useful when you are dealing with huge sequences, and only using a small part of them for sequential computation.
—
More references :
https://purelyfunctional.tv/lesson/what-are-lazy-sequences/
http://theatticlight.net/posts/Lazy-Sequences-in-Clojure/
http://www.thesoftwaresimpleton.com/blog/2014/09/08/lazy-seq/
Posted on November 28, 2017
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.