Would loop and recur exist if Clojure was build on top language with no StackOverflows?

Hi Clojurists,

I’m coming from Elixir that was built on top of Erlang where something like this is OK:

defmodule MyApp do
  def sum_numbers([head | tail], accumulator),
    do: sum_numbers(tail, head + accumulator)

  def sum_numbers([], accumulator),
    do: accumulator
end

IO.puts(MyApp.sum_numbers([1, 2, 3, 4, 5], 0))
# => 15

Explanation:
When the collection has at least a single element the first function splits the collection to first and rest. If the collection is empty then the second functions is executed instead and it simply returns the accumulated number (sum).

Learning Clojure I see that recur has two reasons to exist. First is that if you rename the function, you don’t need to rename the recursive call. Second is that it prevents StackOverflow since Java is not Erlang and stack has a limit.

And loop is to a) make it simple by not exposing (in case of the previous Elixir function) the accumulator and b) no need to write two function arities.

In Elixir loop would look like this:

defmodule MyApp2 do
  def sum_numbers(list),
    do: sum_numbers(list, 0)

  defp sum_numbers([head | tail], accumulator),
    do: sum_numbers(tail, head + accumulator)

  defp sum_numbers([], accumulator),
    do: accumulator
end

IO.puts(MyApp2.sum_numbers([1, 2, 3, 4, 5, 6]))
# => 21

and of course the Clojure implementation would be

(defn sum-numbers [seq]
  (loop [seq seq sum 0]
    (if (empty? seq)
      sum
      (recur (rest seq) (+ sum (first seq))))))

(sum-numbers [1 2 3 4 5])
;; => 15

or

(defn sum-numbers2 [seq]
  (loop [seq seq sum 0]
    (if-let [tail (next seq)]
      (recur tail (+ sum (first seq)))
      (+ sum (first seq)))))

(sum-numbers2 [1 2 3 4 5 6])
;; => 21

One of the reasons I’m asking is that based on watching Rich Hickey’s talks, it seems that some parts of Clojure were shaped to push a developer into right direction so he/she uses the language properly. And I’m thinking if there is some hidden meaning behind loop and recur to push people to the right direction or some hidden use case that I don’t know about.

Btw. thank you all for Clojuverse, really great tool and content and a lot of great reading and great people.

Leonardo

P.S. Something that can’t be done in Clojure is multi-function recursive call.

Instead of just calling itself

fn-abc ------┐
  ^          |
  └----------┘

in Elixir you can add another function into the game

fn-abc ------┐
  ^          v
  |        fn-def
  |          |
  |          v
  └----------┘

The example bellow is stupid, I couldn’t think of a good example, but I remember that I’ve seen good use cases for it.

defmodule MyApp3 do
  def sum_numbers(list),
    do: sum_numbers(list, 0)

  defp let_me_know(list, accumulator) do
    IO.puts("VERBOSE: sum: #{accumulator}")
    sum_numbers(list, accumulator)
  end

  defp sum_numbers([head | tail], accumulator),
    do: let_me_know(tail, head + accumulator)

  defp sum_numbers([], accumulator),
    do: accumulator
end

IO.puts(MyApp3.sum_numbers([1, 2, 3, 4, 5, 6, 7]))
# VERBOSE: sum: 1
# VERBOSE: sum: 3
# VERBOSE: sum: 6
# VERBOSE: sum: 10
# VERBOSE: sum: 15
# VERBOSE: sum: 21
# VERBOSE: sum: 28
# => 28

That’s trampoline in Clojure!

Rich Hickey commented that unintentionally recurring from a non-tail position is a potentially severe programming blunder. The explicit recur form (with or without loop) avoids the problem by giving the compiler an opportunity to say “No! This is not tail position!”

In both Erlang/Elixir and Clojure, if the recursive call is not in a tail position the stack will grow. The difference is that the JVM has an explicit bound, whereas in Erlang it will grow until it consumes all memory available to it. Your choice which is better - personally if given the choice between the two I prefer to fail early.

In languages that implement TCO, or as Joe Armstrong puts it, “Last Call Optimization”, automatically doing this can be a source of programmer error where they modify a function that previously was tail call into not a tail call, and their program explodes on large data. It turns the position of the recursive call into an implicit semantic of your program. An explicit recur call makes this explicit, which might make it one less thing you have to juggle in your head when you’re thinking about your problem. :slight_smile:

FWIW, for some cases allowing the stack to grow is fine; many real world problems have a reasonably small amount of iterations required to solve that it will fit in the JVM stack boundaries. It’s typically only in the general cases like when implementing seq operations that you need to account for large stacks.

1 Like

Some nice answers to a qestion I asked on Stackoverflow a while ago provide additional context: clojure - Why can't tail calls be optimized in JVM-based Lisps? - Stack Overflow

I think it probably started as a necessity, to implement an easy form of TCO which the JVM didn’t natively support, but I think it’s since grown into a nice explicit variant that guarantees you you’re doing a real O(1) memory loop with compiler guarantee. Otherwise it can be easy to think you’re doing O(1) memory and actually you’re not. I’ve also found it a pretty nice way to learn about the difference and understand the two different type of recursion.

As others have said, the JVM doesn’t actually have a real limit on stack overflow, it’s a configuration, you can set it to be the machines full memory if you want, but generally if your loop grows to reach the limit, you’ve done something wrong. To increase the default stack size limit you can set the Xss option when running your app.

1 Like

I didn’t know that one. Thank you.

Yeah, I missed the correct terminology for describing recursion. So we have TCO which is OK since languages are optimized for that and mutual recursion that might be bad.

But you need to use recur in Clojure to get the stack optimization. Not that recur is wrong, just saying that even if you have a call to itself on the last line of a function , you’ll get StackOverflow in Clojure which means stack optimization needs recur.

So the hidden meaning that I was looking for might be a) be explicit and b) recur does not have an option for mutual recursion since mutual recursion is no good

Thank you!

You can have TCO with mutual recursion, but it’s hard to implement on the JVM. Other functional languages do it automatically.

Also, what’s needed for TCO to be an option that a compiler can implement efficiently (if e.g.the JVM doesn’t make it difficult) is not that the recursion happen on the last line of a function. Roughly, what’s required for TCO is that there can’t be function calls waiting to process the result of the recursive call. The recursion has to be the only thing waiting to happen.

I don’t think mutual recursion is bad. Sometimes it’s exactly what you need. My understanding is that trampoline is slow, but if you don’t need TCO, nothing is neccessarily wrong with direct mutual recursion. You can even do it within a function with local function definitions using letfn. You just have to know that you’re not going to have to recurse very far; there are lots of situations like that.

Here’s a silly example to illustrate the difference between a tail recursive function and one that’s not. TCO won’t be performed on either, since Clojure quite reasonably doesn’t perform TCO. I want to illustrate the concept, though, and Clojure is familiar here. If we translated the code into another language TCO might be performed.

(defn oh-no-1 [] (println "Oh, no!  We found a zero!!") nil)

;; not tail-recursive
(defn classify-until-zero-1
  [xs]
  (let [x (first xs)]
    (cond (empty? xs) nil
          (pos? x)  (cons :p (classify-until-zero-1 (rest xs)))
          (neg? x)  (cons :n (classify-until-zero-1 (rest xs)))
          (zero? x) (oh-no-1))))

(defn oh-no-2 [results] (println "Oh, no!  We found a zero!!") results)

;; tail-recursive
(defn classify-until-zero-2
  [acc xs]
  (let [x (first xs)]
    (cond (empty? xs) nil
          (pos? x)  (classify-until-zero-2 (conj acc :p) (rest xs))
          (neg? x)  (classify-until-zero-2 (conj acc :n) (rest xs))
          (zero? x) (oh-no-2 acc))))

Here they are in action:

user=> (classify-until-zero-1 [1 -1 2 -2 0 3 -3])
Oh, no!  We found a zero!!
(:p :n :p :n)
user=> (classify-until-zero-2 [] [1 -1 2 -2 0 3 -3])
Oh, no!  We found a zero!!
[:p :n :p :n]

The tail recursive calls are in classify-until-zero-2, in the lines beginning with (pos? and (neg?. Notice that nothing has to happen after the recursive calls return, even though they are not at the end of the function definition. (classify-until-zero-2 has an extra argument, acc, for accumulator. You can avoid this by wrapping the function in another function, for example.)

By contrast, in classify-until-zero-1, a cons has to be performed after the recursive call returns. That’s why it’s not tail recursive: there is a function waiting to act. Those waiting function calls will build up on the stack. When TCO is possible, as in classify-until-zero-2, the stack can be manipulated by the compiler so that that doesn’t happen.

I included two recursive calls in each function to show that tail recursion doesn’t necessarily have to do with a single recursive call. I included the oh-no functions to show that there can be other branches that call other functions, without necessarily breaking tail recursion. (I didn’t illustrate mutual tail-recursion.)

I wouldn’t say mutual recursion is not good. Same with stack consuming recursion. Some alrogithms need a stack, and are easier to implement recursively, because you can take advantage of the call stack. Others don’t need a stack, and that’s when you want to use TCO, because why consume stack memory when the algorithm doesn’t need too, it’s just a waste.

In practice, very few things need mutual recursion, or need to recurse across function boundary. In fact I can’t come up with any example. Which is why it’s not a big deal that recur doesn’t work for those. It doesn’t mean it wouldn’t be nice if it could, but adding support for that is non trivial. And for when you really need too, trampoline is there, and in other case you can often just refactor things into a single function.

The lack of TCO for mutual recursion and across function boundary is a weak point of Clojure, it would be better with it, but it’s a rare and minor inconvenience in practice.

That said, the use of recur over directly calling the function by name to use TCO is an advantage in my opinion, because when you know your algorithm doesn’t need a stack, having the compiler tell you if TCO is in place or not is very convenient. It’s also pretty nice to not need to wrap a function in another for intermediate carry data, you can hide that in the loop/recur:

(defn classify-until-zero-2
  [xs]
  (loop [acc [] xs xs]
    (let [x (first xs)]
      (cond (empty? xs) nil
            (pos? x)  (recur (conj acc :p) (rest xs))
            (neg? x)  (recur (conj acc :n) (rest xs))
            (zero? x) (oh-no-2 acc)))))
1 Like

This is super explanation. Thank you.
It’s very easy to understand that you will end up with (cons .. (cons .. (cons .. and therefore it has to use stack even when you use Erlang that doesn’t need recur function.

Could you please tell me is it possible to verify that the first example without recur adds into stack since Clojure needs explicit recur (like in the second example)? Is there some cleaver way how to for example println stack?

You can do:

(-> (Thread/currentThread) (.getStackTrace))

To get the current thread stack. The first element is the top of the stack, which is the last invoked function, and the last element is the bottom of the stack, which is the first invoked function in the chain.

2 Likes

This is super cool @didibus thx :partying_face:

(require '[clojure.string :as str])

(defn sum-numbers-bad
  ([xs] (sum-numbers-bad xs 0))
  ([xs sum]
   (if (empty? xs)
     (do
       (println "#### DONE: " (str/join " ## " (.getStackTrace (Thread/currentThread))))
       sum)
     (do
       (println "#### loop: " (str/join " ## " (.getStackTrace (Thread/currentThread))))
       (sum-numbers-bad (rest xs) (+ sum (first xs)))))))

(sum-numbers-bad (range 1 50))

;; #### DONE:  java.base/java.lang.Thread.getStackTrace(Thread.java:1602) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:6) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:10) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$sum_numbers_bad.invokeStatic(NO_SOURCE_FILE:2) ## user$sum_numbers_bad.invoke(NO_SOURCE_FILE:1) ## user$eval139.invokeStatic(NO_SOURCE_FILE:1) ## user$eval139.invoke(NO_SOURCE_FILE:1) ## clojure.lang.Compiler.eval(Compiler.java:7181) ## clojure.lang.Compiler.eval(Compiler.java:7136) ## clojure.core$eval.invokeStatic(core.clj:3202) ## clojure.core$eval.invoke(core.clj:3198) ## clojure.main$repl$read_eval_print__9112$fn__9115.invoke(main.clj:437) ## clojure.main$repl$read_eval_print__9112.invoke(main.clj:437) ## clojure.main$repl$fn__9121.invoke(main.clj:458) ## clojure.main$repl.invokeStatic(main.clj:458) ## clojure.main$repl_opt.invokeStatic(main.clj:522) ## clojure.main$main.invokeStatic(main.clj:667) ## clojure.main$main.doInvoke(main.clj:616) ## clojure.lang.RestFn.invoke(RestFn.java:397) ## clojure.lang.AFn.applyToHelper(AFn.java:152) ## clojure.lang.RestFn.applyTo(RestFn.java:132) ## clojure.lang.Var.applyTo(Var.java:705) ## clojure.main.main(main.java:40)
;; => 1225

vs much less “text” (stack)

(defn sum-numbers-good [xs]
  (loop [xs xs sum 0]
    (if (empty? xs)
      (do
        (println "#### DONE: " (str/join " ## " (.getStackTrace (Thread/currentThread))))
        sum)
      (do
        (println "#### loop: " (str/join " ## " (.getStackTrace (Thread/currentThread))))
        (recur (rest xs) (+ sum (first xs)))))))

(sum-numbers-good (range 1 50))

;; #### DONE:  java.base/java.lang.Thread.getStackTrace(Thread.java:1602) ## user$sum_numbers_good.invokeStatic(NO_SOURCE_FILE:5) ## user$sum_numbers_good.invoke(NO_SOURCE_FILE:1) ## user$eval142.invokeStatic(NO_SOURCE_FILE:1) ## user$eval142.invoke(NO_SOURCE_FILE:1) ## clojure.lang.Compiler.eval(Compiler.java:7181) ## clojure.lang.Compiler.eval(Compiler.java:7136) ## clojure.core$eval.invokeStatic(core.clj:3202) ## clojure.core$eval.invoke(core.clj:3198) ## clojure.main$repl$read_eval_print__9112$fn__9115.invoke(main.clj:437) ## clojure.main$repl$read_eval_print__9112.invoke(main.clj:437) ## clojure.main$repl$fn__9121.invoke(main.clj:458) ## clojure.main$repl.invokeStatic(main.clj:458) ## clojure.main$repl_opt.invokeStatic(main.clj:522) ## clojure.main$main.invokeStatic(main.clj:667) ## clojure.main$main.doInvoke(main.clj:616) ## clojure.lang.RestFn.invoke(RestFn.java:397) ## clojure.lang.AFn.applyToHelper(AFn.java:152) ## clojure.lang.RestFn.applyTo(RestFn.java:132) ## clojure.lang.Var.applyTo(Var.java:705) ## clojure.main.main(main.java:40)
;; => 1225