Some perspective on Clojure's primitives and the map function

Hi all,

Another thread brought up the topic of what is the concept behind Clojure’s map function and what are Clojure’s fundamental primitives? So I thought I’d offer some perspective.

By the way, some of this is intermediate to advanced knowledge, so expect that if you’re just starting out, most of it will go over your head for now. And I recommend reading this, but not worrying if you don’t get it. Go work on some problems, implement small Clojure programs, and then come re-read this again after, and it might start to make more sense.

First of all, to truly understand Clojure’s map function, and its primitive, you need to go read about the Untyped Lambda Calculus. Unlike Java, C and Python, which are modeled over the Turing Model of computation. Clojure is mostly modeled over the Lambda Calculus as its computational foundation. Basic understanding of it is thus required to understand some of the Clojure primitives.

What makes it confusing is that, there is no machine for the Lambda Calculus. So it is easy to confuse the abstract computational model of functional programming (FP), for the translation layer to encode it over a Turing machine.

Most algorithm and data-structures you learn are designed for practical computing, which can run over existing hardware, and the complexity is measured in term of the hardware itself. These will be imperative in nature, because the machines are imperative, and the Turing model is imperative.

All problems can also be computed with the Lambda Calculus (there’s a proof for it). The algorithms will be different, because the entire model is different. map being named map is actually very appropriate, even though most people think it is confusing. I’m going to get to that.

In Lambda Calculus, there are no loops, only term re-writing. Evaluation is the process of recursively rewriting terms in a giant nesting of anonymous functions. This process is called a reduction. There are no variables, no memory, no jumps, no instructions, there arn’t even any numbers or operators like addition and multiplication. In that sense, everything is immutable. And this is why in Clojure variables don’t get assigned, but bound. A let binds a value to a term, where the term will get re-written on evaluation for its value.

In that way, map is thus a function which defines a mapping between two sets of values, where the rule of the relationship is the mapping function. It is not a loop, and no mutation takes place (in the abstract, on real hardware, it could be translated into an imperative model, which will loop and make use of mutation). Map simply re-writes terms of terms of terms recursively, and in a mind blowing manner, this reduces to the same thing that a for-loop over the data would, without ever needing a single variable or jump.

You can think of map in terms of relational algebra, where it defines a one to one relation between inputs and outputs. Where as reduce defines a one to many or many to one relation. So it is often said to be declarative in that way. When you use map, you are saying: there exist a mapping between inputs of this domain to outputs of that domain, where for each input and output pair, mapping function is the relationship rule. For example: (map inc [1 2 3]). There exist a mapping between 1,2,3 and 2,3,4, where for each of 1,2 and 3; 2,3 and 4 are their increment. This relational view is how Pig, Hive, and the likes implement SQL over Map-Reduce and can scale to such high loads.

Yet, map is not a real primitive. The real primitives are anonymous functions and eval. As I said, map is simply defined as a big composition of the mapping function over a recursion over terms being re-written, which is in itself a reduction, and why map can be implemented using reduce.

Now, the Untyped Lambda Calculus says that with just eval and anonymous functions, you can implement all possible Turing complete programs. Which is kind of amazing and beautiful. It is also too clever, and operates at a level which is unnatural to people. Imagine if the dictionary only had two words, and to express any complex idea just required you to combine these two words in longer and longer combinations. It would be tedious, and inefficient.

People’s brain work at much higher levels of abstraction, and we prefer having many more words, then having fewer words but longer sentences. There’s probably a balance to be had here. We also wouldn’t want to have to memorize one word for all possible ideas. Thus English has in practice around 10 000 commonly used words, out of a total of ~171000. Similarly, Clojure adds many more words, which form common ideas, to reach a good balance between too many words and not enough. And just like the dictionary defines words using a longer combination of other words, the definition there of, Clojure defines its words in terms of combination of other words. Where as, in the start, you really just need eval and anonymous functions (lambda) as the words you have to magically know and understand to start defining all others.

Now, as I said before, there are no machine (as of yet) that can run eval and lambda directly on hardware. So for performance, Clojure words are often implemented using Java’s words, which are themselves defined in Turing Model words, not Lambda Calculus. But the Clojure words are inspired from the ones that make most sense to add to the Lambda Calculus, and not the ones that do for Turing Models. This is the fundamental difference between FP and imperative. They take their inspirations for their vocabulary from different roots, and end up being vastly different languages in that way. True as well of natural languages.

This idea of languages that have too few vocabulary and abstractions, yet are Turing complete is known as the Turing tarpit. Lambda calculus is a Turing tarpit the same way that machine code is one for Turing model languages. Even though once you understand them, you could implement any logic you want with just that, it won’t be practical.

So just like Python does not expose the true primitives of the Turing model, but some slightly higher level ones, Clojure does as well. Lets get into them.

For Lisps, normally the primitives are going to be:

  • first
  • rest
  • cons
  • read
  • print
  • apply
  • eval
  • fn
  • def
  • cond
  • quote
  • let

In Clojure, because the JVM lacks tail call optimization, I would also add:

  • loop/recur

It can be a fun exercise (if you are masochist :stuck_out_tongue: ) to try and implement things with just these. Recursion will be your best friend here.

Now, in proper Clojure, you can actually easily see Clojure’s true primitives here: Special forms are those things that are not implemented in term of other Clojure functions/macros, thus they are the true primitives. Everything that is not a special form is implemented in Clojure itself. Only special forms are implemented in Java. Now, there’s a slight cheat here which Clojure (as opposed to other Lisps) takes. One of the special form allows interop with Java code. That makes a lot of other things simpler to implement, because you can just leverage all of Java already for them. It also minimizes the translation required from Clojure to JVM byte-code for compilation. Making compilation quicker and the resulting code possibly faster. Another aspect of this is that, Clojure doesn’t give you its own words for things which Java already has almost the same ones, instead it tells you to just use interop to access them.

So again, in theory, you can implement every other function with just these special forms. In practice, you’ll want many more of the most common functions to already be implemented using the above. That way, you can start with a much larger vocabulary, enabling higher level of abstractions.

The most common ones you’ll want to learn are probably:

For a more complete set, I really really recommend going through the official Clojure Reference. In my opinion, it covers just the right amount of the most common functions/macros which together form the vocabulary that 95% of Clojure code uses.

Just like English tends to use the same common 10 000, even though it has ~171000, Clojure tends to use the same common ones found in the Reference. For the other ones, just like for English, you look them up only when you need them :stuck_out_tongue:

You’ll find that I made a lot of similarity with natural languages, and I believe Clojure has a lot of similarity to them. Learning Clojure is actually more similar to learning a natural language than other programming language in my opinion. This is because it has a simple grammar of words that compose together in sentences and paragraphs to create more complicated ideas:

(<function or macro> arg1
                     (<nested func or macro> argA

For functions, arguments are evaluated from left to right first, and then their result is passed to the outer function. They can be nested as an argument to other functions, and will evaluate bottom up. That is, the innermost function will evaluate first, and the outermost function will evaluate last. Functions always return a result. When evaluated, they are replaced by their result.

For macros, arguments are not evaluated, they are just passed to the macro directly, only the return of the macro is evaluated. They can be nested as arguments to other macros, and will evaluate top down. That is, the outermost macro will evaluate first, and the innermost macro will evaluate last. Macros always return a result. When evaluated, they are replaced by their result, but their result will further get evaluated. Macros are like language exceptions. Just like all natural languages have exceptions to their rules, well that is macros for ya. They can choose to evaluate their arguments in any order, if at all.

I hope this can help demystify some more parts of Clojure for yall.

Have fun everyone!


Oh, one more thing…

Once you begin to understand what I explained above, you’ll start to realize why Clojure has no standard forloop and why mutability is protected behind various interfaces like Atom. Basically, it takes an opinionated stance that modeling your problems closer to the way FP (and the Lambda Calculus) would have you do it is simpler, and will allow your programs to handle more complex problems, without additional defects and with keeping it simple to modify and further extend.

But, its opinion is different than that of say Haskell, which says, there is never a good reason to model things any other way then FP. Clojure says: “There is rarely a good reason to model things any other way”. Which is why other parts of Clojure will give you access to other modeling techniques:

  • You can do some form of OOP with mutli-methods, records, protocols and gen-classes/interfaces.
  • You can write tight loops by just implementing them in Java, and calling them through interop.
  • You can have imperative constructs like a for loop or a goto (there was a thread about this recently), using macros if you really need them.
  • There is a while loop, but please, don’t use it if you are a beginner (When in Clojure, do as the Clojurians :stuck_out_tongue:).
  • You can have side effects in your functions, making them into procedures, using the likes of: do, doseq, run!, dotimes, dorun, etc.
  • You can have mutation if you want: atoms, volatile, refs, transients, etc.
  • You don’t need to use lazyness: transducers, mapv, filterv, reduce, etc.
  • You can use logic programming instead: core.logic
  • Or write things using CSP: core.async
  • And so much more…

Honestly, it is hard to cover the surface of Clojure, and with the power of macros, there’s very little limits to constructs that can be added to allow you to model things using an almost infinite number of techniques. If you don’t want to be done learning, with Clojure, you can go on forever.

My recommendation, take it one at a time. Start with the most idiomatic ways, and only move to new things if you’re interested, or have good reasons to do so.

The idiomatic way being:

  • Model all your data using the persistent data structures: map, vector, set and queue.
  • Keep your functions pure and side effect free, with mutation only at the top/boundaries. Ideally, what you mutate is not in-memory, but a datastore somewhere, like a sql/noSql DB (or don’t ever mutate and use Datomic). Or mutate client side, in an in-memory DB like Datascript or a simple, single global atom wrapping a map.
  • Leverage the sequence abstractions and their functions for data transformation.
  • Use the REPL, always!
1 Like

Thanks for that write up, it’s a great explanation! Your explanation of why map is named the way it is definitely helps.

I see this as splitting hairs over terminology. There are 4 things at work with map, the input [1,2,3], the function inc, the result [2,3,4], and map tying them together. The question is the result. Where does that get created and how does it get populated? That’s part of what map does. From a purely theoretical view, maybe you could argue otherwise. I put that under the topic of “In theory, theory and practice are the same. In practice, they’re not.”

Recursion is a form of looping, so saying map recurses is saying that it loops. That’s what allows lambda calculus to implement Turing machines. Without the overhead of function calls, iterative loops wouldn’t be necessary in any language. The reason tail recursion is important in lambda languages is because it allows the compiler to convert the recursion to an iterative loop. Without that, lambda calculus based languages would be infeasible on standard computers.

They did have Lisp machines until the early 90s. The nail in their coffin was when PCs started running Lisp code faster then the Lisp machines did. I’m not sure how much of the lambda-ness was implemented in hardware, though.

That was one of the earliest exercises I did with Clojure. It was a fun little project. Highly recommended.

1 Like

Thanks @didibus for a great explanation and for providing so much context.

(Sorry–I decided to delete most the rest of the original post, which contained an unimportant, overly technical point of interest to almost no one.)

Thanks :slight_smile: ! I was hoping it would, even so a little.

I think what you see as splitting hair, I consider to be mastery. For most practical purpose, a lime and a lemon can be used interchangeably, yet to a cocktail craftsman, one would know when one is more appropriate than the other.

There’s a lot of things in CS that have very little subtleties, which can often be overlooked, until they matter tremendously. Being able to navigate these distinguishes one developer from another. A lot of innovation comes out of these subtleties. I was trying to get people to see that while map is very similar to a for-each loop, it is different in some fundamental ways, and this is why you can perform distributed MapReduce, but not distributed ForEach loops.

Map comes with more structure than your typical for-each loop. It is pure and immutable. And it maps each element independently of others. These properties can be leveraged to automatically parallelize the operation. Which is why in Clojure you have pmap. You can not take any arbitrary for-each loop and parallelize it.

Don’t get confused here, some languages use the name for-each for the map function. Such a language is C#, which offers a Parallel.ForEach loop. But this is really the map construct, not the imperative foreach construct. And it is subtly different to the standard C# foreach loop.

So foreach and map, similar, but not same.

When is it splitting hair to point out differences in things that are similar in a lot of ways? That will depend on the practicalities of their differences. If for what you need, they can both be used with equal operational characteristics, sure, then it might be splitting hair, but if they can’t, well now it matters to understand how they differ.

I can’t disagree with that, and I was trying to make it very clear that the FP constructs are translated into imperative under the hood, as the computer itself is programmed using instructions over memory.

That said, I think you are underestimating the power of abstractions. It’s really hard to grasp abstractions, and especially to understand their usefulness. It is especially hard for me to explain it. But, abstractions are at the root of Computer Science. In fact, most of our work is creating abstract models of the real world in order to achieve some practical benefit.

The lambda calculus and the Turing model are equivalent, yes. Recursion and imperative loops have equivalences, yes. Yet, they work at very different abstractions. And yes, that does matter, even in practice. It matters for our ability to understand and solve problems.

Clojure’s sales pitch is mostly that if offer tools to work at different abstractions then traditional imperative and OO. Why should you care to work at different abstractions, if they are all equivalent? This is really hard to explain and transfer to someone else. But it has to do with your ability to understand, reason, and solve problems. If you are satisfied with you current ability to do all that, then there’s very little value to learning Clojure.

So, here’s the deal… If you keep saying recursion is just a loop, that means you are translating from the recursive abstraction in your head, into a loop, and back. It means you cannot yet think in terms of recursion, but only in terms of loops. This is like someone learning English coming from say Spanish, for a while, you will understand English, but you can’t think in English, you just very quickly translate from English to Spanish and back.

Now, yes, on x86 hardware, recursion will be translated into some JUMP over memory. Everything is just assembly/machine-code under the hood. It is useful to know that as well, and understand that translation, specifically for performance reasons. Maybe this is all you meant to say, if so, I apologize. But for other readers, I want to make it very clear, abstractions are a means to think and reason about problems, so you can better solve them. A lot of abstractions have equivalences with others, or are even fully equivalent. Yet, some problems are easier to solve with recursion, while others are easier to solve iteratively. Even when its the same algorithm. I’m not sure why, but I’d guess its because they map more directly to the problem itself. Each abstraction is like looking at your problem from a different angle, and sometimes that’s enough to get a better idea for it, and find the solution.

From what I know, Lisp machines didn’t implement anything remotely like lambda calculus on hardware. They were just normal computers with instruction sets geared toward Lisp.

Something to keep in mind here is that Lisp != FP. Most Lisps are very imperative. I think they might have been initially used as the playground to explore Lambda Calculus, maybe with the rise of Scheme, but I’m not sure. Lisp is another kind of abstraction all to itself, which is useful to learn and understand, often known as meta-programming, and in regards to Lisp, specifically related to homoiconicity.

Here’s another fun one :stuck_out_tongue: :

(def lambda-map
  ((fn [f]
     (f f))
  (fn [ff]
    (fn [mf coll]
      (if coll
        (cons (mf (first coll))
              ((ff ff) mf (next coll)))

(lambda-map inc [1 2 3])
;; => (2 3 4)

I thought maybe it be good for me to break down that last example a bit:

(def lambda-map
  ((fn [f]
     (f f))
   (fn [ff]
     (fn [mf coll]
       (if coll
         (cons (mf (first coll))
               ((ff ff) mf (next coll)))

(lambda-map inc [1 2 3])
;; => (2 3 4)

This is map implemented in terms of anonymous functions and eval, similar to how it would be done in lambda calculus. You can think of this a bit like the assembly language of functional programming.

First, lets look at the recursive implementation of map:

(defn rmap
  [mf coll]
  (if coll
    (cons (mf (first coll))
          (rmap mf (next coll)))

(rmap inc [1 2 3])
;; => (2 3 4)

It works by:

  1. First unfolding every element of coll until there are no more. This is done by (rmap mf (next coll)).
  2. As things get unfolded, the mapping function is called with the element, and the result is captured. This is done by (mf (first coll))
  3. When the end is reached, (if coll triggers false, and an empty list is returned by '().

The evaluation thus becomes:

(if [1 2 3]
  (cons (inc (first [1 2 3]))
        (if [2 3]
          (cons (inc (first [2 3]))
                (if [3]
                  (cons (inc (first [3]))

Which becomes:

(cons (inc 1)
      (cons (inc 2)
            (cons (inc 3)

Which becomes:

(cons 2
      (cons 3
            (cons 4

Which becomes:

(2 3 4)

This illustrates the process of reduction (eval) I was talking about. In this case, eval is just replacing terms from the nesting of rmap.

Now that we understand the recursive approach. Let’s revisit the lamba calculus one. In that one, we are not allowed to use named functions. Thus we can’t self-reference. All we have are anonymous functions.

The trick to accomplish this is thus to use even more anonymous functions! We still need to call from inside our function that does the mapping, another function that is identical to it. But instead of using a reference, we first write a function that takes a function and returns our mapping function which calls the passed in function with the passed in function as argument:

(fn [ff]
    (fn [mf coll]
      (if coll
        (cons (mf (first coll))
              ((ff ff) mf (next coll)))

See how the outer function returns the inner function? And the inner function is the mapping function? What will happen is that ff will be the outer function. So the outer function will be passed to itself, and thus when the mapping function calls ff it will be passing the outer function to the outer function, and ff being the outer function, will return the inner mapping function. To do this bootstrapping, we use:

(fn [f]
  (f f))

By calling this function with the outer function from before, what happens is that this function will now call the outer function with the outer function as an argument to the outer function. The outer function will thus return the inner function (our mapping function) with ff bound the the outer function. And there we have it, ff is now the outer function, and the inner function when calling ff with ff as an argument basically get itself back, allowing it to expand into a series of call from mapping function to nested mapping functions.

I know, it’s really confusing, and hard to wrap your brain around, but it’s pretty clever!

Enjoy :stuck_out_tongue: