When you realize a lazy sequence, you don’t have to retain its head. So even though it caches realized elements, whatever you no longer hold a reference to gets garbage collected as you consume it. You can use dorun or doseq for that, for example. You can also loop/recur over it with first and rest or next, or use reduce on it. None of these retain prior elements, so they won’t run out of memory even on a very large lazy-seq.
This will retain the head and return the full sequence causing memory issues if you run with a small heap size:
(def a (doall (range 1e7)))
java.lang.OutOfMemoryError: Java heap space
clojure.lang.Compiler$CompilerException: Syntax error macroexpanding at (NO_SOURCE_FILE:1:8).
But these will all be fine, since they don’t retain the head:
(def a (dorun (range 1e7)))
#'user/a
(reduce + (range 1e7))
49999995000000
(loop [xs (range 1e7) sum 0]
  (if-let [ns (next xs)]
    (recur ns (+ sum (first xs)))
    (+ sum (first xs))))
49999995000000
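doseq is mentioned above but not shown. Here is a minimal sketch of the same sum using it; the atom named total is only an illustrative accumulator, not something from the original examples.

```clojure
;; doseq walks the sequence for side effects and returns nil,
;; so nothing keeps a reference to the elements already visited.
(def total (atom 0))

(doseq [x (range 1e7)]
  (swap! total + x))

@total
;; => 49999995000000
```

For a plain sum, reduce is the more natural tool; doseq fits better when each element triggers a side effect such as writing a line to a file.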
This is actually one of the great things about lazy sequences over eager ones: they let you do computations like this, where the whole sequence wouldn’t fit in memory.
I wouldn’t say eduction solves the memory issue, since lazy-seqs don’t have one to solve in the first place.
That just depends on the code you have. Neither eduction nor lazy-seq does any work when called; the computation is delayed until it is needed, so it will fail if you close the reader behind line-seq before running the computation.
(import '(java.io BufferedReader StringReader))
(def a (with-open [rdr (BufferedReader. (StringReader. "1\n2\n3\n4"))]
         (eduction (map parse-long) (map inc) (line-seq rdr))))
a
java.io.IOException: Stream closed
clojure.lang.ExceptionInfo:
Same as for lazy-seq:
(def a (with-open [rdr (BufferedReader. (StringReader. "1\n2\n3\n4"))]
         (->> (line-seq rdr) (map parse-long) (map inc))))
a
java.io.IOException: Stream closed
clojure.lang.ExceptionInfo:
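Both failures above go away if the computation runs while the reader is still open. A minimal sketch, assuming the result is small enough to hold in memory; it swaps in into with a transducer to force eager realization, which is one option among several (doall around the lazy version would also work).

```clojure
(import '(java.io BufferedReader StringReader))

(def a
  (with-open [rdr (BufferedReader. (StringReader. "1\n2\n3\n4"))]
    ;; into consumes every line eagerly, before with-open closes rdr
    (into [] (comp (map parse-long) (map inc)) (line-seq rdr))))

a
;; => [2 3 4 5]
```

If the data is too large to collect, doing the reduce itself inside the with-open body keeps memory flat while still finishing before the reader is closed.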
So I have no idea what the book is trying to say without seeing the full page from the book, to better understand what they might be alluding to.
Edit:
Oh it might be alluding to this:
(def a (eduction (map identity) (range 1e7)))
(reduce + a)
49999995000000
(def a (map identity (range 1e7)))
(reduce + a)
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "main"
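Here again we’re retaining the head of the sequence through the var a, so when we later reduce over it, every realized element stays reachable and it can run out of memory. Eduction, I guess, does solve this kind of scenario, because it doesn’t cache anything: there is no realized sequence for a to point to. The lazy version also works if you drop the var’s reference before reducing: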
(def a (map identity (range 1e7)))
(let [a' a]
  (def a nil)
  (reduce + a'))
49999995000000
So, with eduction you’re less worried about accidentally retaining the head, I guess.
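One way to see the difference is to count how often the mapping function runs. This is a minimal sketch; the counters eduction-calls and seq-calls and the small range of 5 are illustrative choices, and an explicit init value is passed to reduce to keep the eduction path unambiguous.

```clojure
(def eduction-calls (atom 0))
(def seq-calls (atom 0))

;; Same transformation, once as an eduction and once as a lazy seq.
(def e (eduction (map (fn [x] (swap! eduction-calls inc) x)) (range 5)))
(def s (map (fn [x] (swap! seq-calls inc) x) (range 5)))

;; Reduce each one twice.
(reduce + 0 e)
(reduce + 0 e)
(reduce + 0 s)
(reduce + 0 s)

@eduction-calls
;; => 10  (recomputed on every reduce, nothing cached)
@seq-calls
;; => 5   (realized once, then served from the cache held by s)
```

The trade-off is visible here: the eduction keeps nothing alive between passes but pays for the computation again each time, while the lazy seq computes once and keeps the results for as long as s points at its head.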
And for the curious who wonder: doesn’t a' point to the head and thus retain the whole sequence in memory while it is being reduced? The answer is an optimization Clojure does called locals-clearing. Since a' isn’t used in the let after the reduce, its reference gets cleared at the point of the reduce call (before reduce runs), which is why the head isn’t retained in that let and why the GC can collect elements of the lazy sequence as they are reduced over.
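As far as I understand, the same clearing applies to function parameters, so a common way to avoid the problem is to never put the lazy sequence in a var at all. A minimal sketch, where nums and sum-all are hypothetical helper names:

```clojure
;; Build the lazy seq on demand instead of def-ing it,
;; so no var ever points at its head.
(defn nums []
  (map identity (range 1e7)))

;; xs is last used in the call to reduce, so the compiler
;; should be able to clear it before reduce starts consuming.
(defn sum-all [xs]
  (reduce + xs))

(sum-all (nums))
;; => 49999995000000
```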
Fun fact: ClojureScript doesn’t have this optimization, so doing the same there will consume a lot of memory, and eduction would be even more helpful in ClojureScript.