Reducers, what are they for?


I was looking at the docs for reducers, and realized that I never bothered to learn them.

We have Transducers now, which I’m a big fan of, and it seems they replace a lot of the use cases for reducers. On the other hand, reducers do seem to have some nice things of their own, like the ability to automatically parallelize.

So… does anyone still use reducers? When would you choose reducers over transducers?


I think Peter Schuck explains it well in his 2014 piece, Transducers: Clojure’s Next Big Idea.

Reducers introduced the idea to Clojure that functions operating over a collection could be combined into one function and then operate on the entire collection in one go. Reducers decoupled the implementation of inputs from the operation you wish to perform on the inputs. Reducers can only be evaluated eagerly, not lazily, and not over a core.async channel. As core.async becomes more and more popular with Clojure, reducers are left behind. Reducers’ eager evaluation means the work is all done at once; the output cannot be a lazy sequence anymore. Additionally, reducers use macros to perform their magic instead of using function composition, which means we have to repeat our logic to handle different abstractions.

Transducers, like Reducers, let us combine (compose) the fns operating over our collections. They go further though. Unlike Reducers, Transducers are not limited to operating eagerly. Also unlike Reducers, Transducers can (and do) operate not only on all the typical collection types like (maps, sets, vectors, lists, sequences) but also on core.async channels too. Clojure core and core.async have adopted Transducers internally so that now, for example, the code for “map” need only be implemented one time, as a Transducer. That logic applies to vectors, sets and core.async channels. And it can be used for both eager and lazy processing. And our processing steps can be composed from reusable parts (Transducers).
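To make the "implemented one time" point concrete, here is a minimal sketch using only `clojure.core`. The name `xf` and the sample input are made up for illustration; the same `xf` could also be handed to a core.async channel, which is left out here because it needs an extra dependency.

```clojure
;; One transformation, defined once, with no collection involved yet.
(def xf
  (comp (map inc)
        (filter even?)))

;; Eager, pouring the results into a vector.
(def as-vector (into [] xf (range 10)))        ;; => [2 4 6 8 10]

;; Incremental: a sequence computed as it is consumed.
(def as-seq (sequence xf (range 10)))          ;; => (2 4 6 8 10)

;; Eager reduction with no intermediate collection.
(def as-sum (transduce xf + 0 (range 10)))     ;; => 30
```

The transformation logic lives in `xf`; `into`, `sequence`, and `transduce` only decide how it is driven.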



All of this is true - reducers are an interesting pit stop on the journey to the reusable, composable transformations ultimately embodied in transducers. BUT it’s not why reducers were created or why they are (still) interesting.

Reducers are interesting because they were intended to address the problem that sequence transformation stacks are not (by themselves) suitable for parallelization. The driver here is that, in the longer term, the languages that will remain interesting (to managers and architects) are those that can automatically take advantage of parallelism when applying transformations.

Reducers replace nested sequence transformations with a functional representation of the stack of transformations (as do transducers) and provide a native way to do a parallel reduce on those transformation stacks (specifically for maps and vectors). The perfect intro to these ideas can be found in Guy Steele’s talk “How to Think About Parallel Programming: NOT!”, particularly the second half.
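As a rough illustration of that idea, the sketch below builds a transformation stack with `r/map` and `r/filter` and then runs it both sequentially and with `r/fold`. The names `v` and `stack` and the input size are arbitrary choices for the example.

```clojure
(require '[clojure.core.reducers :as r])

;; A vector, so that r/fold can split it into chunks.
(def v (vec (range 1000000)))

;; r/map and r/filter do not produce a sequence; they return a
;; description of the work that can later be reduced or folded.
(def stack (->> v
                (r/map inc)
                (r/filter even?)))

;; Sequential: an ordinary reduce over the stack.
(def sequential-sum (reduce + stack))

;; Potentially parallel: fold partitions the vector, reduces each
;; partition with +, and combines partial results with +.
;; (+) with no arguments supplies the identity value 0.
(def parallel-sum (r/fold + stack))
```

Both calls give the same answer because `+` is associative; `r/fold` relies on that property of the combining function.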

Transducers are MUCH easier to write, much more composable, and much easier to apply in more contexts. However, transducers have not (yet) fulfilled the parallelism goals laid down by reducers, and that is still seen as an important and achievable goal. It is possible to use transducers in the reduce stage of reducers, which is a partial win, but there is more that can be done.
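One way to read "transducers in the reduce stage" is to apply a transducer to the reducing function that `r/fold` uses for each chunk. This is a sketch under that assumption, with `xf` and `v` as illustrative names. It should only be trusted for stateless transducers such as `map` and `filter`, since a stateful one (`take`, `partition-all`, ...) would share state across chunks and its completion step is never called here.

```clojure
(require '[clojure.core.reducers :as r])

(def xf (comp (map inc) (filter even?)))
(def v  (vec (range 1000000)))

;; combinef: + merges the per-chunk results and, called with no
;;           arguments, supplies the initial value 0 for each chunk.
;; reducef:  (xf +) is + wrapped by the transducer, so each element
;;           is incremented and filtered before being added.
(def folded-sum (r/fold + (xf +) v))

;; Same computation done sequentially, for comparison.
(def transduced-sum (transduce xf + 0 v))
```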

To circle back to the original question (what are they for?): reducers are great at applying transformation stacks to large collections with fine-grained data parallelism, where you have many independent elements to transform. That is in contrast to coarse-grained task parallelism, where ExecutorService-like things are a pretty good answer. Anything described as “embarrassingly parallel” is going to be a good match, like applying the same fn to every pixel in an image. Since reducers were created, a lot has happened in the hardware/GPU space around parallel computation, so maybe that changes the implementation details.
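The pixel case might look something like the sketch below. Everything about the image is hypothetical: `pixels` is a flat vector of grayscale values, `brighten` is an invented per-pixel function, and the chunk size of 4096 is an arbitrary pick rather than a tuned number.

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical grayscale image: a flat vector of values in 0..255.
(def pixels (mapv #(mod % 256) (range 100000)))

;; Hypothetical per-pixel function; each pixel is independent of the rest.
(defn brighten [p]
  (min 255 (+ p 40)))

;; Fold in chunks of at most 4096 pixels. r/append! adds each
;; transformed pixel to a per-chunk buffer, and r/cat joins the
;; buffers in order. into [] turns the result back into a vector.
(def brightened
  (into [] (r/fold 4096 r/cat r/append! (r/map brighten pixels))))
```

Element order is preserved, so `brightened` lines up index for index with `pixels`.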