Code question

I have the following code from Manning's Clojure: The Essential Reference, page 467.

(defn fast-producer [n]
  (->> (into () (range n))
       (map #(do (println "produce" %) %))))

(defn slow-consumer [xs]
  (keep
    #(do
       (println "consume" %)
       (Thread/sleep 2000))
    xs))

(slow-consumer (fast-producer 5))

;; the output is:

produce 4
consume 4 ;; wait 2 seconds
produce 3
consume 3 ;; wait 2 seconds
produce 2
consume 2 ;; wait 2 seconds
produce 1
consume 1 ;; wait 2 seconds
produce 0
consume 0 ;; wait 2 seconds

My question is:
Why aren't all the "produce" lines printed before the first "consume" line?
In other words, why do the producer and the consumer alternate?

thanks

sun

The map function that you are calling in fast-producer is lazy. For chunked sources such as vectors, it will produce values up to 32 or so ahead of the consumer, but in many cases (including the list built with into here) it will never do any work to produce another value until some consumer advances past the part of the sequence that has been created so far.
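
To see the "up to 32 or so ahead" behavior concretely, here is a small sketch that counts how many elements map realizes when only the first one is requested. The helper name realized-on-first is made up for this illustration, and it assumes a reasonably recent Clojure version, where a vector yields a chunked seq and a list does not.

```clojure
;; Hypothetical helper (not from the book): counts how many times the
;; function passed to map runs when only the first element is requested.
(defn realized-on-first [coll]
  (let [calls (atom 0)
        xs    (map (fn [x] (swap! calls inc) x) coll)]
    (first xs)   ; request a single element
    @calls))

(realized-on-first (vec (range 100)))      ; vector: chunked source
(realized-on-first (into () (range 100)))  ; list: not chunked
```

On the Clojure versions I have tried, the first call returns 32 and the second returns 1. That is why the list-based fast-producer in the question hands over exactly one element per request.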

Thanks for the reply, but how does the context switching work?

In other words, initially fast-producer does nothing.
Then slow-consumer gets invoked.
Upon seeing xs, it asks fast-producer to generate data.
But I don't understand how this "context switching" takes place.

thanks

You can think of map as doing:

[#(do (println "produce" 0) 0), #(do (println "produce" 1) 1), #(do (println "produce" 2) 2), ...]

And then keep as doing:

[(fn [] (println "consume" (#(do (println "produce" 0) 0))) (Thread/sleep 2000)),
 (fn [] (println "consume" (#(do (println "produce" 1) 1))) (Thread/sleep 2000)),
 ...]

So now when something loops over that and reads each element, you can see that as the first element is realized, it first prints "produce 0", then prints "consume 0", then sleeps for 2 seconds, and only then moves on to realizing the next element, and so on.
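
The mental model above can be turned into something you can actually run. This is only a sketch of the idea, not how map and keep are implemented: the names produce-thunks and consume-thunks are invented for this example, and the sleep is left out so it finishes immediately.

```clojure
;; One zero-argument function (thunk) per element; nothing prints yet.
(def produce-thunks
  (mapv (fn [i] (fn [] (println "produce" i) i))
        (range 3)))

;; Each consumer thunk calls its producer thunk only when it is invoked.
(def consume-thunks
  (mapv (fn [produce]
          (fn []
            (let [x (produce)]      ; "produce" is printed here, on demand
              (println "consume" x)
              nil)))                ; like Thread/sleep, evaluates to nil
        produce-thunks))

;; Walking the consumer thunks interleaves the two kinds of output.
(doseq [t consume-thunks]
  (t))
```

Defining the two vectors prints nothing. The output only appears in the doseq, alternating "produce 0", "consume 0", "produce 1", "consume 1", and so on, all on a single thread.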

By definition, a lazy sequence generates/computes its elements on demand, as they are requested. With that in mind, it helps to think through the computing steps required when a single element - e.g. the first one - is requested:

  1. Within slow-consumer, keep needs an element from its input so that its argument function can print it
  2. That element is taken from the output of fast-producer, which has produced none yet
  3. Within fast-producer, the function passed to map is invoked on the next element, prints it and returns it
  4. Finally, keep's argument function has something to chomp on: it prints the value, then calls Thread/sleep
  5. Thread/sleep always returns nil, and therefore nil is returned from keep's argument function
  6. An element has been requested but none has been found so far (keep never includes nil values in its output, see the previous step), so keep invokes the argument function on its next input element (the next element output by fast-producer), and so on
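
The steps above can be traced with a stripped-down version of keep. This is a sketch under assumptions: my-keep, events and logging-producer are names made up for this illustration, my-keep ignores the chunked-seq handling of the real clojure.core/keep, and the println and sleep calls are replaced by an event log so the order is easy to inspect.

```clojure
;; Simplified, unchunked sketch of keep (hypothetical name).
(defn my-keep [f coll]
  (lazy-seq
    (when-let [s (seq coll)]            ; steps 1-3: pull one element from the input
      (let [v (f (first s))]            ; steps 4-5: run the argument function
        (if (nil? v)
          (my-keep f (rest s))          ; step 6: nil result, try the next element
          (cons v (my-keep f (rest s))))))))

(def events (atom []))

;; Same shape as fast-producer, but it records events instead of printing.
(defn logging-producer [n]
  (map (fn [x] (swap! events conj [:produce x]) x)
       (into () (range n))))

;; The consumer function always returns nil, like Thread/sleep does.
(first (my-keep (fn [x] (swap! events conj [:consume x]) nil)
                (logging-producer 3)))

@events
;; => [[:produce 2] [:consume 2] [:produce 1] [:consume 1] [:produce 0] [:consume 0]]
```

Because every result is nil, asking for just the first element walks the whole input, and each :produce is immediately followed by its :consume.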

So to answer your original question, the consumer and producer values alternate because the consumer takes its values from a lazy sequence (the producer), whose values are generated on demand. The first time slow-consumer requests an element from its input, only that one element is made available (since it is generated on demand by fast-producer). So there is no context switching as you suggested, just a single value passed from the first function (fast-producer) to the client function requesting it (slow-consumer).

However, there is a second issue in your specific example, which may have added to your initial confusion: when asked for an element, keep must continue through its input until its argument function returns a non-nil value or the input runs out (step 6 above), and here it always returns nil.

We can illustrate the above by playing around a bit with your example and separating the phases of the pipeline:

(def output (slow-consumer (fast-producer 5)))
;; #'user/output
(first output)
;; produce 4
;; consume 4
;; produce 3
;; …

(def from-producer-only (fast-producer 5))
;; #'user/from-producer-only
(first from-producer-only)
;; produce 4
;; 4
;;user>

(def from-consumer-only (slow-consumer (reverse (range 5))))
;; #'user/from-consumer-only
(first from-consumer-only)
;; consume 4
;; consume 3
;; consume 2
;; …
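
For contrast, here is a sketch of what happens when the consumer function returns a non-nil value. The name returning-consumer is hypothetical (it is not in the book), and the sleep is dropped to keep the example quick; fast-producer is repeated from the question so the snippet runs on its own.

```clojure
(defn fast-producer [n]
  (->> (into () (range n))
       (map #(do (println "produce" %) %))))

;; Hypothetical variant of slow-consumer: the function passed to keep
;; returns the element itself instead of nil.
(defn returning-consumer [xs]
  (keep #(do (println "consume" %) %) xs))

(first (returning-consumer (fast-producer 5)))
```

This prints only "produce 4" and "consume 4" and returns 4, because keep finds a non-nil value right away and has no reason to pull further elements from the producer.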

One more thing worth noting about lazy sequences, however: the side effects happen only once, i.e. when the computation occurs. All subsequent requests for an element are served from memory:

(first from-producer-only)
;; 4

And, as a direct consequence of that, all retained elements consume space, which can become a problem if the lazy sequence is large and is referenced for a long time (e.g. by a var).
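
The caching can also be checked without relying on printed output. In this sketch, calls and cached-xs are names made up for the illustration; a counter stands in for the println side effect.

```clojure
(def calls (atom 0))

;; Lazy sequence over a list (not chunked), counting each realization.
(def cached-xs
  (map (fn [x] (swap! calls inc) x)
       (into () (range 5))))

(realized? cached-xs)  ; false: nothing has been computed yet
(first cached-xs)      ; computes the first element and returns 4
(first cached-xs)      ; served from the cached value
@calls                 ; 1: the function ran once, not twice
(realized? cached-xs)  ; true: the head of the sequence is now cached
```

Since cached-xs is held by a var, every element realized through it stays reachable, which is the memory concern mentioned above.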