What is really Clojure syntax?


#1

We often say that Clojure as a LISP dialect is homoiconic: The syntax of the code is the same as the syntax of the data.
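
A minimal sketch of what homoiconicity looks like in practice, assuming a standard Clojure REPL. The reader turns program text into ordinary lists, symbols and numbers, which can be inspected and rewritten like any other data before being evaluated. The names `form` and `product-form` are only illustrative.

```clojure
;; Read program text into a data structure (nothing is evaluated yet).
(def form (read-string "(+ 2 3 4)"))

(list? form)   ; the code is a plain list
(first form)   ; its first element is the symbol +
(count form)   ; four elements: + 2 3 4

;; Because the form is data, ordinary sequence functions can rewrite it.
(def product-form (cons '* (rest form)))

(eval form)          ; evaluates (+ 2 3 4)
(eval product-form)  ; evaluates (* 2 3 4)
```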

I am asking myself which parts of Clojure we should consider part of the Clojure syntax.

I am not sure the question is well formulated. Anyway, the purpose of this question is not academic. It is to help newcomers learn Clojure effectively.

The way I see it, the whole Clojure language could be more or less layered into the following components:

  1. S-expressions as in the original LISP
  2. EDN to support collection literals, regular expressions …
  3. Special forms: if, def, let, do …
  4. Macros defined in clojure.core: defn, defmacro, when, if-not, and, cond …
  5. Functions defined in clojure.core: map, filter, reduce …
  6. Functions defined in other Clojure namespaces: clojure.set, clojure.walk …
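
As a rough sketch, assuming JVM Clojure, layers 3 to 5 can be told apart at the REPL with functions from clojure.core. It also hints at how blurry the boundary is: `let` is documented as a special form but is implemented as a macro over `let*`.

```clojure
;; Layer 3: special forms are symbols the compiler itself knows about.
(special-symbol? 'if)    ; true
(special-symbol? 'def)   ; true

;; Layer 4: macros are Vars whose metadata carries :macro true.
(:macro (meta #'when))   ; true
(:macro (meta #'defn))   ; true

;; Layer 5: functions are ordinary values.
(fn? map)                ; true
(:macro (meta #'map))    ; nil

;; The line between layers 3 and 4 is not sharp.
(special-symbol? 'let)   ; false
(special-symbol? 'let*)  ; true
(:macro (meta #'let))    ; true
```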

What is the limit between the syntax and the semantics?
Is there such a limit?
Is it important to draw such a limit?


#2

In my mind, the core Clojure syntax for this purpose should be defined by drawing parallels with how other languages commonly handle things such as code structure, function creation, conditionals, loops and so on, as well as data structures and their common operations.

It’s true that there is no syntax for this, but in my mind (esp. as a non-native English speaker), some common Clojure functions such as assoc and conj are syntax.
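
For readers who have not met them yet, a short sketch of the two functions mentioned above. Where another language would use dedicated syntax for appending or updating, Clojure uses plain function calls that return a new collection.

```clojure
(conj [1 2] 3)         ; vectors grow at the end    => [1 2 3]
(conj '(1 2) 3)        ; lists grow at the front    => (3 1 2)
(conj #{1 2} 3)        ; sets just gain the element => #{1 2 3}
(assoc {:a 1} :b 2)    ; add or replace a map entry => {:a 1, :b 2}
(assoc [10 20] 0 99)   ; vectors are keyed by index => [99 20]
```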


#3

Intuitively, I would actually answer no to the last two questions, because as application programmers there is no limit to the layers of compilation or interpretation we add to our programs, and we especially do this in Clojure, what with data-based APIs and macros.

So the only distinction that doesn’t feel too arbitrary to me is that between text and data - which means there’s very little to Clojure’s syntax.

Maybe a more constructive suggestion: don’t teach beginners about syntax, teach them the notions of compilation and interpretation!


#4

What do you mean by distinction between text and data?

Why do you think it is important for a Clojure beginner to understand the notions of compilation and interpretation?


#5

Syntax is data. Semantics is functionality. Languages without “first class” notions of actions (in the @ericnormand’ian sense) and syntactical transformations over functionality, will have a harder time bending the functionality of their data to their will.

Eight bits is a byte. Both the eight bits and the byte are syntactical representations of the same data. Between them is a semantic transformation. Each layer of the computational stack is a transformation from one syntax into another, usually over the same data.

So, IMO, you can’t say “this part of Clojure is syntax and this part is semantics.” It’s thousands of layers of syntax, each producing different layers of semantics. In some languages, where syntax isn’t a first class thing that can be manipulated, the syntax/semantics distinction can seem more important. In Clojure, rather, what is semantics and what is syntax depends entirely on whether you’re on the bottom side or the top side of a particular abstraction.


#6

This is a really great question. I’m not sure the distinction between syntax and semantics is the distinction you are trying to find, though. For instance, the fact that let-bound variables are immutable is technically part of the semantics, while the fact that the bindings are wrapped in a single vector (not a list of lists as in other lisps) is syntax.

The nice thing in Clojure is that we have two data points. There’s the JVM Clojure and there’s JavaScript Clojure. In theory, there can be more. What needs to be preserved when porting it to a new host in order for it to still be Clojure? In a conversation with David Nolen a few years ago, he gave a short list of things that made Clojure Clojure. I don’t remember them exactly, but here’s my best try at reproduction in the same spirit:

  • The particular surface syntax
  • Functions
  • Immutable local variables
  • Immutable data structures (with their same semantics)
  • The core library
  • Atoms
  • Protocols for type-based dispatch
  • The protocols defined for the basic access patterns of the data structures

The rest is up to the host system.
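
A few of the items above, sketched for JVM Clojure. All names here (`v`, `v2`, `shadow-demo`, `counter`, `Describe`, `describe`) are hypothetical and only for illustration. `String` and `Long` are host types, so the protocol part would look slightly different on another host.

```clojure
;; Immutable data structures: conj returns a new vector, the old one is untouched.
(def v [1 2 3])
(def v2 (conj v 4))

;; Immutable locals: a later binding shadows x, it does not mutate it.
(defn shadow-demo []
  (let [x 1
        f (fn [] x)   ; closes over the first x
        x (inc x)]    ; a new binding named x
    [(f) x]))

;; Atoms: state changes are explicit and managed.
(def counter (atom 0))
(swap! counter inc)

;; Protocols: type-based dispatch.
(defprotocol Describe
  (describe [x]))

(extend-protocol Describe
  String (describe [s] (str "string of length " (count s)))
  Long   (describe [n] (str "number " n)))

(describe "abc")
(describe 5)
```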

Eric


#7

Clojure just adds four persistent collections and some core functions to the JVM, and expresses the code with four persistent collections. It has no syntax. Forcing various distinctions does not help to understand Clojure.

;Conventional thinking, chaotic logic.
(if (and (> x1 x2)
         (or (< x3 x4) (not= x5 x6))
         (keyword? x7)) 
  :t
  :f)

;Unrestricted expression, just read it in order.
;Closer to the order in which the machine executes it.
(->  (< x3 x4)
     (or  , (not= x5 x6))
     (and , (> x1 x2))
     (and , (keyword? x7))       
     (if  , :t :f))
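
A small check of the two versions above, assuming the `x` values are plain values with no side effects. `->` is a macro, so the pipeline is rewritten into a nested form before it is evaluated. Both shapes yield the same result, though the order in which the tests short-circuit differs. The function names `nested` and `threaded` are made up for this sketch.

```clojure
;; The threading macro rewrites the pipeline into nested calls.
(macroexpand-1 '(-> (< x3 x4)
                    (or (not= x5 x6))
                    (and (> x1 x2))
                    (and (keyword? x7))
                    (if :t :f)))
;; expands to:
;; (if (and (and (or (< x3 x4) (not= x5 x6)) (> x1 x2)) (keyword? x7)) :t :f)

(defn nested [x1 x2 x3 x4 x5 x6 x7]
  (if (and (> x1 x2)
           (or (< x3 x4) (not= x5 x6))
           (keyword? x7))
    :t
    :f))

(defn threaded [x1 x2 x3 x4 x5 x6 x7]
  (-> (< x3 x4)
      (or (not= x5 x6))
      (and (> x1 x2))
      (and (keyword? x7))
      (if :t :f)))

(nested 2 1 1 2 0 0 :k)    ; :t
(threaded 2 1 1 2 0 0 :k)  ; :t
```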

According to Taoism, flowing water is the perfect substance. It can always assume any shape as needed, progressing in sequence until it reaches its end point. A pure function pipeline data flow is like flowing water, close to the Tao. Think PurefunctionPipeline&Dataflow First


#8

https://clojure.org/reference/reader


#9

I think when talking about syntax or teaching others about Clojure syntax, we don’t need to care about all built-in functions, e.g. clojure.walk/postwalk.

But it is essential to know all the syntactic sugar, since it is very common when reading Clojure code. #"", #_, @, ^, ~@: all of these will confuse a beginner.
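
One way to demystify these sugars, sketched here for JVM Clojure, is to ask the reader what each one turns into. Most of them are shorthand for an ordinary list or value.

```clojure
(read-string "'x")                  ; (quote x)
(read-string "@a")                  ; (clojure.core/deref a)
(read-string "#'map")               ; (var map)
(read-string "~@xs")                ; (clojure.core/unquote-splicing xs)
(read-string "[1 #_2 3]")           ; #_ discards the next form: [1 3]
(meta (read-string "^:private x"))  ; ^ attaches metadata: {:private true}
(class (read-string "#\"a+b\""))    ; java.util.regex.Pattern
(first (read-string "#(+ % 1)"))    ; fn*, the form behind anonymous functions
```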

So for syntax, there should be a list of everything you need to know; once you know it, you don’t need to go back to tutorials or books to look something up.


#10

For me, syntax is all code the reader can parse plus its evaluation order.

So all literals would come first, like:

[]
()
{}
#()
#{}
'
""
symbol
:keyword
etc.

And then evaluation order rules:

(<fn> <arg1> <arg2> <...>)

Args are evaluated from left to right, then the fn is called.
Symbols evaluate to the value of the Var they resolve to.
Keywords evaluate to themselves.
A map literal evaluates to a persistent map.
Etc.
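
These rules can be observed directly. A small sketch follows, where `order`, `note` and `answer` are hypothetical names used only to record what happens.

```clojure
;; Record the order in which arguments are evaluated.
(def order (atom []))
(defn note [x] (swap! order conj x) x)

(+ (note 1) (note 2) (note 3))
@order            ; [1 2 3]: arguments were evaluated left to right

;; A symbol evaluates to the value of the Var it resolves to.
(def answer 42)
answer            ; 42

;; A keyword evaluates to itself.
:k                ; :k

;; A map literal evaluates its keys and values and yields a persistent map.
{:sum (+ 1 2)}    ; {:sum 3}
```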

And because Macros and Special forms are exceptions to the evaluation rules, they’re part of the syntax too.
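
To make the "exceptions" concrete, here is a sketch for JVM Clojure with two hypothetical helpers, `boom` and `if-fn`. A special form or macro may leave an argument unevaluated, whereas a function with the same shape always evaluates all of its arguments first.

```clojure
(defn boom [] (throw (ex-info "should never be evaluated" {})))

;; `if` is a special form: only the selected branch is evaluated.
(if true :ok (boom))      ; :ok

;; `and` / `or` are macros: they stop at the first deciding value.
(and false (boom))        ; false
(or :found (boom))        ; :found

;; A function evaluates every argument before it is called.
(defn if-fn [test then else] (if test then else))

(try
  (if-fn true :ok (boom))
  (catch clojure.lang.ExceptionInfo _ :boom-was-evaluated))
;; :boom-was-evaluated
```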

Functions and Java objects and classes wouldn’t be syntax. But interop special forms and macros and literals are.


#11
  • text: represents information as a raw sequence of characters, practical for humans
  • data: represents information as data structures, practical for programs

So in the case of compilation, source code is text whereas CSTs / ASTs are data - the advantage of Lisps being that they make the translation from one to the other evident. In this line of thought, I would describe syntax as the ‘textual’ rules for writing programs (what is a valid textual notation for data) - and semantics as the rules for interpreting that data.
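
A toy illustration of that split, not a description of how Clojure itself is implemented. The reader handles the textual rules (text to data), and a hypothetical `interpret` function with a made-up `operators` table supplies the semantics (data to value).

```clojure
;; Hypothetical operator table: the meaning assigned to three symbols.
(def operators {'+ + '- - '* *})

(defn interpret
  "Interpret a form made of numbers and nested (op arg ...) lists."
  [form]
  (cond
    (number? form) form
    (seq? form)    (let [[op & args] form
                         f (get operators op)]
                     (when-not f
                       (throw (ex-info "unknown operator" {:op op})))
                     (apply f (map interpret args)))
    :else          (throw (ex-info "unsupported form" {:form form}))))

;; Text -> data is syntax; data -> value is semantics.
(def text "(+ 1 (* 2 3))")
(def data (read-string text))

(interpret data)  ; 7
```

Swapping a different function into `operators` changes what the same text means without touching the reader, which is one way to see where the line between the two falls.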

Not necessarily important for Clojure beginners - but then I’m not sure the distinction between syntax and semantics is important for Clojure beginners either. It depends on where you draw the line of beginnerness. I do feel it is important, though, for the learning programmer, because a ton of what we do boils down to crafting compilers and interpreters. SICP comes to mind as a better introduction than anything I could articulate here.


#12

You ask: “What is really Clojure syntax?”

I think a fair answer is “a core syntax plus syntax sugar”. There is some core syntax, on top of which all other features are implemented.

I don’t really know what exactly the bare minimum Clojure syntax is; it would be interesting to specify it. Here’s a blog post that talks about how some people did exactly that for JavaScript. Since Clojure is in the LISP family, that analysis should probably be easier to perform…

Interesting related article: Desugaring scheme.

For instance, in Scheme, this let binding:

(let ((v 30)) (+ v v))

is desugared to nothing more than:

((lambda (v) (+ v v)) 30)

The article shows how other more complicated forms are desugared. At the bottom of the article, there’s the question: “How far can we go?”. The answer: If we wanted to, we could desugar all the way to the simple lambda calculus.
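
The same idea can be sketched in Clojure with a hypothetical macro, `let-via-fn`, that handles a single binding. This is only an illustration of desugaring: the real `let` in clojure.core expands to the `let*` special form, not to a function call.

```clojure
;; Hypothetical macro: a one-binding let written as an immediately applied fn.
(defmacro let-via-fn [[sym init] & body]
  `((fn [~sym] ~@body) ~init))

(macroexpand-1 '(let-via-fn [v 30] (+ v v)))
;; expands to ((clojure.core/fn [v] (+ v v)) 30)

(let-via-fn [v 30] (+ v v))  ; 60
(let [v 30] (+ v v))         ; 60
```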

In case someone is interested in the core language + syntax sugar approach to defining languages, I first read about it in PLAI, a book about writing interpreters (using Racket).