What's the logic behind the core language macros?

A bit of a shower-thought - but I realized recently that I don’t really get the design behind a lot of the core language macros. They all feel kinda… Lisp-y and not Clojure-y and I’ve sorta been scratching my head about it.

Just as an example, looking at let and ns

Now I’m mostly just talking about the syntactic sugar here - since it’s a macro there can be all sorts of dark magic under the hood. But the syntax sorta doesn’t help one visually parse what’s going on and see what the expected input is

Starting with let:
You have something like

(let [var1 stuff
      var2 morestuff
      var3 evenmore]
  (somefunc
    var1
    var2))

why are the bindings being done in what looks like a vector? It seems a bit misleading syntax-wise.

I see a vector and my mental image is “okay this is going to be a list of stuff”.

Sometimes it’s something a bit trickier where the index/position has some implied meaning (like in hiccup). I think that’s a bit icky b/c there is an implicit contract you need to know a priori. I’d prefer a map in that case - like you see in cljfx - but it’s a bit more verbose I guess. Here the hidden contract is that the vector needs to have an even number of terms…
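
For example (hiccup and cljfx from memory here, so treat this as a rough sketch rather than gospel):

;; hiccup: position carries the meaning - tag first, optional attribute map, then children
[:div {:class "note"} "hello"]

;; cljfx-style: everything is named explicitly in a map
{:fx/type :label
 :text    "hello"}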

But we have stuff that comes in pairs - I’d totally expect a map

(let {var1 stuff
      var2 morestuff
      var3 evenmore}
  (somefunc
    var1
    var2))

My lizard brain goes:
curly brace → pairs of stuff

The only issue is you’d need to drop Clojure’s let being a Scheme/Lisp let* b/c the order currently matters … but I don’t think that’s a biggie (it’d probably be a performance win)
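
(For reference, a minimal example of the current sequential behavior - later bindings can see earlier ones, so reordering them can change the result:)

(let [x 1
      y (inc x)]
  [x y])
;; => [1 2]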

The ns macro goes a step further:

(ns mynamespace
  (:require
    [something :as thing1]
    [anotherthing]
    [lastthing])
  (:import 
    CrazyJava.Stuff
    CrazyJava.Thing))

Here we have a multiarity signature where it turns out the order actually doesn’t matter (except for the first element). And then each term is a weird list of stuff (not in a vector) but the list has a special reserved slot for a keyword in the first position. It almost seems to suggest there is some logic to having multiple lists with the same key, ex: (ns myns (:require ...stuff...) (:require ...things...)). (is there…?)

my lizard brain is completely flummoxed. I’m a bit embarrassed to say, but for the longest time I’d just not even bother trying to remember the quirky ns syntax and I’d copy/paste an ns block from another .clj file and then tweak it to do what I want.

There is a lot of implicit data-layout contract going on here that you just need to memorize and follow

Here I’d really expect something that looks a bit more like the deps.edn

(ns mynamespace
  {:require [[something :as thing1]
             [anotherthing]
             [lastthing]]
   :import [CrazyJava.Stuff
            CrazyJava.Thing]})

(this can be improved on)

To me it feels more consistent this way and you immediately know what structure to expect. You also know repeating keys isn’t meaningful and the order is meaningless

I feel I’m probably the one in the wrong here and I’m missing some logic behind it all, so before I embarrass myself too much :slight_smile: - can someone help me figure out what I’m missing?

1 Like

Your let defies ordering since maps are unordered. It could be argued that would be more akin to let from CL semantics, which treats each binding as independent (and enables the compiler to possibly emit parallel bindings), as opposed to let*. It is idiomatic in Clojure to allow let to refer to prior bindings, so vector bindings are common for fn/defn/loop/for etc. because order is preserved and later bindings can refer to earlier ones, providing something like let* as a default.

Here the hidden contract is that the vector needs to have an even number of terms…

The macro will enforce the contract and inform you at macroexpansion time though, so offending code won’t compile.
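
For example (recent Clojure versions catch this via the core specs; I’m describing the failure rather than quoting the exact message, which varies by version):

(let [a 1
      b])
;; fails at macroexpansion time with an error about the binding vector
;; needing an even number of forms - the body is never compiled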

I don’t really have any insight into why ns turned out the way it did. Your example would invalidate the parsing of metadata maps in the ns declaration though:

user=> (ns blah {:added "2022"} (:require [clojure.string :as s]))

It looks like there is a mapping of type->reference construct in the ns macro form, since strings are doc strings, a map is metadata, and sequences are import/require/etc.
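
A small example of that dispatch-by-type (string → docstring, map → attr-map, lists → reference clauses):

(ns my.app
  "Docstring for the namespace."
  {:added "2022"}
  (:require [clojure.string :as str]))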

Bindings allow duplicate keys, so a vector is the logical Clojure-y structure:

(let [user (get-user)
      user (update user :balance + new-balance)]
  ...)

Also, order is obviously very important here too.

Also, some of the older Lisps, at least Emacs Lisp, have bindings where order doesn’t matter by default, so I feel a binding form where order matters and bindings are applied sequentially is actually a Clojure-y thing, in that it’s a more modern version of let.

As for the ns macro, I agree, it could probably have been made a bit more consistent. I always forget the syntax for it a bit, especially the discrepancy between require and import.

1 Like

Note that the following is legal today so, again, a map doesn’t really make sense:

user=> (ns foo.bar
         (:require [clojure.string :as str])
         (:require [clojure.edn :as edn]))
nil

That’s not to say that the ns macro’s syntax isn’t one of the weirder things in Clojure… :smile:

These two examples kinda illustrate my point. The interfaces seem to allow for input that works, but I don’t see any good reason you’d want to do either of those things. (here is where I feel I’m maybe missing some key insight)

For fn/defn you don’t have ordered input pairs.
For loop and for the interface suggests you do, but I’ve never seen the ordering used in the wild.

I just tried it with for and it does seem to work.

(for [x ['a 'b 'c]
      y [1 2 x]]
  [x y])
;; ([a 1] [a 2] [a a] [b 1] [b 2] [b b] [c 1] [c 2] [c c])

It seems confusing to reason about but it could just be a lack of familiarity on my part

For let I personally would have rather liked unordered lists b/c I find them easier to look at. You immediately know all the bindings are independent. With let* you’re never sure if a binding depends on previous bindings or not. The downside is the unordered let leads to more nested lets. With let* you can sort of build a pipeline with named intermediary variables, which can make it easier to reason about the code
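
Something like this, with made-up helper names, is what I mean by the trade-off:

;; sequential let as a named pipeline
(let [raw     (fetch url)
      parsed  (parse raw)
      summary (summarize parsed)]
  summary)

;; the same pipeline with threading, when the intermediate names aren't needed
(-> (fetch url) parse summarize)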

Oh okay - then that probably explains it. There is some method to the madness. It does seem … unique. So you could for instance put the doc string at any point in the arguments. Again, a very flexible interface that allows for strange inputs. Maybe this has some good use cases though

The ns DSL supports some weird stuff which is not necessarily recommended, both because norms changed and because it developed organically. From Clojure Governance and How It Got That Way:

17 February 2012
Alessandra Sierra

Clojure Contrib: The Beginning

The growth of contrib eventually led to the need for some kind of library loading scheme more expressive than load-file. I wrote a primitive require function that took a file name argument and loaded it from the classpath. Steve Gilardi modified require to take a namespace symbol instead of a file. I suggested use as the shortcut for the common case of require followed by refer. This all happened fairly quickly, without a lot of consideration or planning, culminating in the ns macro. The peculiarities of the ns macro grew directly out of this work, so you can blame us for that.

I bet that in an alternate universe where Rich wanted to design Clojure 2.0, the ns form would get a major overhaul. There are only a few such things, so treasure it.

I don’t see let as a similar case. Rich specifically designed the let binding to use a vector as part of designing Clojure to be a standardized successor to Common Lisp. From Clojure for Lisp Programmers at 0:29:30:

[Rich Hickey] The bracket is part of the let syntax. You have seen some Schemes start to advocate using square brackets for lists that are not calls. And then, like PLT or whatever, they treat them as lists. That same convention of using square brackets when things are not calls or operators happens in Clojure. So these things, these are not calls, so people do not get confused when they see parens everywhere, they are like, “Is this a call? Is this data? I cannot get my head around it.”

So that convention was a good one, except in Clojure, I do not want to waste square brackets on another kind of list, so I have real vectors.
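
(For what it’s worth, that bracket convention shows up across the core forms, not just let - a quick sketch:)

(defn add [x y] (+ x y))                      ; parameter vector
(fn [x] (* x x))                              ; same for fn
(loop [i 0] (when (< i 3) (recur (inc i))))   ; loop bindings
(for [x (range 3)] (* x x))                   ; comprehension bindings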

4 Likes

You could at one point, but it looks like the introduction of spec and the clojure specs for the core macros ends up enforcing ordering constraints on ns forms (e.g. ns symbol ?doc string ?metadata *references).

Ah perfect, I did remember seeing that ns was one of those things that happened a bit accidentally.

No you don’t, that’s the problem. It would simply introduce a subtle bug in your code. There is no exhaustive compile-time guarantee that each statement is truly independent all the way through its execution, unless the Clojure compiler could also guarantee purity of each one and track all their interdependent data relations.

And personally, nesting things to that extent would be atrocious. This is more of a personal preference, but you can tell from my Emacs use at least: I don’t use let anymore, because it’s just not as ergonomic or intuitive and it causes headaches and subtle bugs like I mentioned, and I use let* instead (talking about Elisp here)

Humans are hardwired to see a sequential list of things and assume they are happening in sequence, that’s why async programming is so hard.

So it’s a bit of a question of what you want the special case to be, and what the default. Personally I think sequential is more often what you want. And when you want to express that order doesn’t matter, do this:

(let [a 10]
  (exec a))
(let [b 20]
  (other-exec b))
2 Likes

EDIT: Sorry… this is a bit off the original topic…
(on a second read through I feel I maybe missed what you were trying to say)

You mean if you have some buried side-effect during a binding? I mean first of all - doing side effects during a local binding is, I’m pretty sure, a bad idea to start with… I could be wrong here, but I haven’t come across a great use case…

I’m guessing you’re talking about doing it by accident - and maybe multiple times?.. and then I guess the lack of guarantee on the execution order could hypothetically make it non-repeatable? Hope I’m understanding the problem correctly.

This sounds a bit contrived and confusing/difficult to debug regardless. A (let {} ..) in that case would indeed make life more difficult

I feel a lot of the problems people have with async just wouldn’t come up in a let … the main selling point of Clojure (for me) is the immutable data structures. But it’s pragmatic and you can make your code blow up in weird ways and there aren’t really compile time guarantees of much of anything. So while it’s not a guarantee - it plays to the strengths of the rest of the language design.

I guess I really disagree. When I’m knee deep in some code I want it to be very intuitive what is the minimal amount of context I need to understand what’s going on. This always felt like something that was difficult/impossible to express in imperative/sequential languages. So in C you’d look at some bit deep in a function and you don’t know which bits higher up are relevant and which aren’t. The Scheme/Lisp let is fantastic b/c it explicitly expresses that all the terms will be independent. In combination with immutable data structures you can do that pretty safely now.

At the heart of it, it seems strange/arbitrary that the last term in the let is in some sort of limbo of “maybe it depends on all the previous terms but maybe not!”. And at least in the code I’ve written (which isn’t a huge ton - nor anything too crazy complex)… most of the time it does not. Most of the time I just need a bunch of reusable local bindings and the order I’ve written them in has no particular meaning. Just not always :confused:

When I see a let in Elisp and I look at a binding I know I don’t need to look at the other terms and I only need to maybe look higher in scope. If there is a lot of sequential interdependence then I’d probably use a let* to minimize the nesting. In Clojure most of the time I can get away with the threading macro in this case.
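
When plain -> / ->> doesn’t fit because the value lands in different argument positions, as-> still reads like a pipeline (helper names made up here):

(as-> (fetch url) x
  (parse x)
  (merge defaults x)
  (summarize x))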

I can envision a complex scenario where nested lets would be a mess and a threading macro too simplistic and you’d want a let* - but I feel this would be the exception. A (let {} ..) form seems like it’d be the best fit the majority of the time.

I don’t understand this example. Don’t you again have the issue of side effects here?

I want to write

(let [a ( ... lots of code ...)
      b ( ... lots of code ...)]
  (dosomestuff b)
  (dootherstuff a b))

When I look at how b is created I want to know I don’t need to look at anything that came before - like a. If b’s binding is a complex expression then that’s not immediately obvious at a glance. I currently don’t know of a way to express that concisely.

The only way I know to make interdependence more obvious is to refactor out all the code into private single-use top level functions. But this comes at a huge ergonomic cost, as you end up with a soup of minifunctions, you have to constantly jump around the code, and you can’t see everything relevant in one place.

Consider:

(defn foo [user]
  (let [user (f user)
        user (g user)]
    ...))

Order matters here. If you swap the two bindings, you still have code that compiles but you will most likely get a different result. Re-binding the same symbol multiple times is surprisingly common in real-world Clojure code – and we can rely on it because of the way Clojure was (very carefully, very deliberately) designed.
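
A toy illustration of the same order-sensitivity (contrived numbers, nothing from real code):

(let [x 2
      x (* x 10)
      x (+ x 1)]
  x)
;; => 21, but swap the last two bindings and you get 30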

1 Like

@geokon-gh You brought it up with for in your second post, but otherwise I don’t see these patterns below in the thread:

I think the following example doesn’t leverage binding order to the full extent. Personally in my style I very rarely rebind the same symbol in a single block. In almost every case like the examples from @seancorfield @didibus I would just use argument threading.

(let [a 1
      a (f a)]
  ...)

However, I rely heavily on order in let blocks, like this:

(let [{:keys [a b]} some-map
      c (+ b a)
      d (fun a)]
  ...)

Start with destructuring to keep it out of function arguments, but then keep compounding data in sequence. The argument could be made for (:a some-map) (:b some-map) throughout, but destructuring ends up clearer in my eyes.

Or, impromptu functions

(let [fun #(make some (transformations %))
      d (->> c (map fun) (then this))]
  ...)

Where it’s not code for reuse, but nesting lambda definitions would get unintelligible quickly, and a defn outside the binding would lead to confusion through an unnecessary amount of fragmentation.

I’ve had the same thought about bindings being pairs, but it now seems logical that a regular hash map, or even sorted-maps, which I guess would be pseudo-indexed here, would go against some of my most common use cases. Plus, instances like a destructuring map in the left position seem to go against the idea of what a key-value pair even is.

1 Like

There are side effect cases, but also immutable updates need to be done in order in many cases to be correct, like in @seancorfield’s example. You’re right, you could use a threading macro instead, but sometimes you want to name intermediate operations for readability.

I feel I have many more use cases where I need to use things from before, so I would end up with a lot of rightward drift due to nesting.

I do think it’s a bit of a personal preference. I find the current behavior more useful more often, and also safer, since I have seen some subtle bugs when using let in Elisp.

May I ask, what benefit do you get from knowing each binding is independent if you’re not planning on having them run concurrently or in parallel?

Edit: Having said all that, if you did want that behavior, I do think a map would be nice, and I would even extend let to accept a map or a vector, and if the map is used, it would be order independent, and if the vector is used it would be order dependent. That would be nicer than having a let and let* both taking a list.
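
Just to sketch what that could look like as a user-level macro (rough and untested, not a real library; it simply delegates to let, so with more than a handful of bindings the map’s iteration order - and therefore any accidental dependency - is not guaranteed):

(defmacro letm
  "Like let, but takes a map of bindings that are meant to be independent."
  [bindings & body]
  (assert (map? bindings) "letm expects a map literal of bindings")
  ;; splice the key/value pairs into an ordinary binding vector;
  ;; destructuring forms as keys would need more thought
  `(let [~@(mapcat identity bindings)] ~@body))

(letm {a 1
       b 2}
  (+ a b))
;; => 3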

The benefit is making reading code simpler. You’re minimizing the amount of context that you need to look at to understand a piece of code.

I understand you can reuse bindings, but as I tried to say before, that just seems like a recipe for disaster. It’s very easy to glance at your code, only notice the first binding, and not realize you clobber it later. To me it kinda looks like a cutesy way of introducing some local mutability to write more C-like code…

To me it really reminds me of variable masking in C/C++. It’s generally a good way to confuse yourself. I thought the science on this was settled and we all agreed it’s better to use some intermediary variable names :slight_smile:

Yeah, I’ve got lots of code like that. Where you have a series of bindings that are representing steps. If the steps are kinda complicated, where multiple parameters need to be assembled and values reused multiple places, then threading isn’t sufficient and I can kinda see the value in the current setup.

But what inevitably happens is I also need to bind some other unrelated things e and f and I end up with a soup of bindings - some dependent some not dependent.

Since it’s just a plain unstructured list, this means when I come back to the code a month later I kinda need to linearly read and re-understand the whole thing to figure out the piece I want.

Semi-related: Even if we accept the [] notation is superior there is still the outstanding problem that there is no clear visual distinction between bindings that are used in the let body and “private” bindings that are purely used to make other bindings - which I think worsens the visual soup. I don’t have a good solution here… maybe adding a - to temporary binding names? I’d be curious if anyone has a mnemonic for this.
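
(Purely as a sketch of the kind of mnemonic I mean - text and report are made up - a trailing dash on the “internal” binding:)

(let [words- (re-seq #"\w+" text)   ; only exists to build the next binding
      counts (frequencies words-)]
  (report counts))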

Call me crazy, but I think this example would be improved if it were

(let {{:keys [a b]} some-map}
  (let {c (+ b a)
        d (fun a)}
    ...))

:slight_smile: :slight_smile:
You’d immediately know c and d are independent of each other. In fact you wouldn’t be able to make them dependent.

I’m sure you could cook up a pathological example where the whole thing would drift right a lot though - but I think that’s an exceptional situation.

B/c the key is now a destructuring thingy? I guess it does extend the “key-value pair” a bit. I always felt that the way destructuring is represented with its magical :keys key was a bit jarring - but that’s a separate conversation :laughing:

Yeah, that’d be nice. It wouldn’t hurt any existing code. I originally was just trying to get some context for the design decision - but I can see now that a lot of people are leveraging stuff I would consider kinda no-nos or confusing. And the more points are brought up, the more I’m starting to like the {} idea haha

Here’s an example that relies on let forms being run from top to bottom:

I found it weird at first, and tried to find better solutions. I couldn’t! I either got lots of nesting or long, confusing thread-first/thread-last chains.

It’s OK to pick a pragmatic solution!

Some of the formatting is a little weird (I think to squeeze to a column limit?). In the parallel universe with curly-brace lets a direct translation of this one isn’t so bad. It’d just need an extra internal let block

(defn opts->specified-deps
  "Returns all :deps and :alias :extra-deps for the deps.edn indicated by `opts`."
  [opts]
  (let {lib                    (some-> opts :lib symbol)
        alias                  (some-> opts :alias)
        no-aliases?            (:no-aliases opts)
        {:keys [deps aliases]} (-> (edn-string opts) edn/read-string)}
    (let {current-deps (->> deps (map (fn [[lib current]]
                                        {:lib lib :current current})))
          alias-deps   (if no-aliases? []
                           (->> aliases (mapcat (fn [[alias def]]
                                                  (->> (:extra-deps def)
                                                       (map (fn [[lib current]]
                                                              {:alias   alias
                                                               :lib     lib
                                                               :current current})))))))}
      (->> (concat current-deps alias-deps)
           (filter (fn [dep] (if alias (= alias (:alias dep)) true)))
           (filter (fn [dep] (if lib (= lib (:lib dep)) true)))))))

The benefit here would be that it’d be immediately clear that the four bindings in the outer block are independent of each other, and that the two bindings in the inner block are derived from them and are also independent of each other.

I’m guessing the rightward drift is off-putting?

On personal projects I actually generally arrange code in a more vertical open style… (this goes 100% against the style guide)
I’ll just show an example (I won’t proselytize this - b/c it’s a bit half chewed) but it makes rightward drift a lot more digestible and everything aligns vertically in a way that’s super quick and easy-to-scan visually.

(defn opts->specified-deps
  "Returns all :deps and :alias :extra-deps for the deps.edn indicated by `opts`."
  [opts]
  (let {lib               (some->
                            opts
                            :lib
                            symbol)
        alias             (some->
                            opts
                            :alias)
        no-aliases?       (->
                            opts
                            :no-aliases)
        {:keys [deps
                aliases]} (->
                            opts
                            edn-string
                            edn/read-string)}
    (let {current-deps (->>
                         deps
                         (map
                           (fn [[lib
                                 current]]
                             {:lib     lib
                              :current current})))
          alias-deps   (if no-aliases?
                         []
                         (->>
                           aliases
                           (mapcat
                             (fn [[alias
                                   def]]
                               (->>
                                 def
                                 :extra-deps
                                 (map
                                   (fn [[lib
                                         current]]
                                     {:alias   alias
                                      :lib     lib
                                      :current current})))))))}
      (->>
        alias-deps
        (concat
          current-deps)
        (filter
          (fn [dep]
            (if alias
              (=
                alias
                (->
                  dep
                  :alias))
              true)))
        (filter
          (fn [dep]
            (if lib
              (=
                lib
                (->
                  dep
                  :lib))
              true)))))))

I do admit you need a more vertical monitor :slight_smile: But the flip side is you can have more vertical panes open side by side :wink:

If your function doesn’t fit on one screen, then it’s too long :stuck_out_tongue:

2 Likes

I think you’re making a good point! Seeing dependencies as indentation is actually quite nice.

Yeah, I’d guess so. And the more dependencies you have, the more indentation you get.

I think we just disagree on the cognitive load trade offs.

To me, it’s more cognitive load to always have to ask myself whether I need order or not, and whether I might need to use prior bindings in later ones.

This is why I dislike Elisp’s let: every time I use it, it takes me additional cognitive load to think through this, and make sure my operations are truly order independent from a side-effect point of view, and then make sure that I won’t need prior bindings in later ones.

Whereas personally, when I read code in a Clojure let, I never really care if the bindings are order dependent or not, or if they use prior bindings. Basically I very rarely feel like I need to worry about that to understand some code.

This is very different from C/C++, where you have cognitive load due to mutation of variables.

The cognitive load in C/C++ is that reusing a prior binding might be a bug, because you don’t know if it’s been mutated in-between.

I could maybe agree that shadowing inside the same let maybe shouldn’t have been allowed, even though I feel it can be convenient sometimes, like in Sean’s example, because coming up with names like user, user1, user2 also has its problems, like being confusing as to whether we’re really dealing with 3 separate users or the same user being updated along the way.

Basically, I don’t see the usefulness of knowing the order doesn’t matter, whereas when doing side-effects or a series of updates, the ordered let is super useful. So why not always use that one? That’s exactly what I do in Elisp, I’ve adopted let* and I’ve stopped asking myself which one I should use, I just always use let*, which is much less cognitive load again, than always having to wonder which one to use.

4 Likes

To me the highest cognitive load by far is trying to figure out and refactor what I wrote 3 months ago :slight_smile:

If you have good binding names then maybe reading is fine (you just gloss over the details). But for refactoring you end up having to understand all the bindings right? Maybe we just have different problem spaces.

I guess I agree you do have to think about it a tiny bit more ahead of time. But I’d always trade some upfront cognitive load for easier readability/maintainability later.

Reusing binding names is sort of a similar situation. It’s quick and dirty and more dangerous to refactor later. If you make a binding and then immediately mask it in the next line then it’s a bit more benign (but arguably pointless). The reader still has to read/notice/verify this each time it happens - more cognitive load. I think it’s used to make “inside-out” code easier to read, but I’d argue you generally should use threading for that (doesn’t work 100% of the time :slight_smile: )

So this keeps coming up - are you really doing a lot of side effects in your bindings? I mean maybe we’re just in very different problem spaces but this just doesn’t come up for me and it looks like a major code smell… but I could just be ignorant. I plainly don’t ever expect to find state changes in let bindings. I’d find that highly confusing…

In the end, sure there are scenarios where you’d want a let* b/c you need side effects, or the problem is complex with a ton of interdependence that’d need 4 layers of let blocks (this happens to me occasionally) - but that’s something that’s kinda exceptional and indicates a complicated bit of code. In that situation, yeah, you gotta sit down and understand all the bindings and figure it all out then. Sometimes problems are just hard. When you see a let* you immediately know you gotta be a bit more careful and understand the block top to bottom.

It’s sort of analogous to if you see a loop macro. Something complex happened, map reduce etc. were not enough, you gotta be ready for some black magic.

Check out Compiler.java.

let is not doing anything special. clojure/Compiler.java at master · clojure/clojure · GitHub

(let [a 1, b 2, a (+ a b)])

literally creates opcodes similar to:

var a = 1
var b = 2
a = a + b

It gets the job done pretty well. plus it’s really fast.
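
You can see the first half of that at the REPL - let is a thin macro over the let* special form, which is what the compiler turns into plain local slots:

user=> (macroexpand '(let [a 1 b 2] (+ a b)))
(let* [a 1 b 2] (+ a b))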

1 Like

Yeah, I don’t think order-independence would have any bearing on emitted code. It’d just constrain the scope of valid input into the ‘let’…

Like … in theory order-independence can generate different code… with performance gains - b/c the compiler can optimize the CPU instruction pipeline if it knows operations are truly independent… but I don’t think that’s something you’d be able to express in Java (I could be wrong).

It’s even tricky to do consistently in C/C++ … you typically need to massage the compiler with compiler directives to make it happen