Feedback wanted on new clj-kondo macroexpansion feature

Clj-kondo is a linter based on static analysis. This means it doesn’t eval any code. Therefore clj-kondo can be very fast, but it also comes with some limitations with regards to macros that have irregular syntax compared to core macros.

So far the solution to this has been to provide configuration for :lint-as or the :unresolved-symbol linter.

This works great for the majority of cases but especially with the :unresolved-symbol config you lose a bit of linting feedback, i.e. false negatives.

What if we could have a DSL to inform clj-kondo about the syntax of a macro? And what if this DSL was just (a subset of) Clojure itself? So far clj-kondo did not eval any code, but what if we could stretch the limits of static analysis a bit?

I’ve come up with a way to integrate the Small Clojure Interpreter inside clj-kondo so you, as a user, can provide a function that expands a macro invocation into constructs that clj-kondo knows about.

A demo. What if we had a macro that is similar to next.jdbc/with-connection:

(defmacro weird-macro [[_sym _val _opts] & _body]
  ::TODO)

An invocation looks like:

(foo/weird-macro
 [x :foo {:weird-macro/setting true}] ;; x is unresolved
 (inc x))

but alas, clj-kondo doesn’t recognize x as a binding.

With the new feature a user can provide a transformation function that receives a map with :sexpr, the invocation and returns a map with :sexpr, the expanded invocation:

(ns bar
  {:clj-kondo/config '{:macroexpand
                       {foo/weird-macro
                        "
(fn weird-macro [{:keys [:sexpr]}]
  (let [[[sym val opts] & body] (rest sexpr)]
    (when-not (and sym val)
      (throw (ex-info \"No sym and val provided\" {})))
    {:sexpr `(let [~sym ~val] ~@(cons opts body))}))
"}}}
  (:require [foo]))

With this configuration clj-kondo understands the x is a binding:

(foo/weird-macro
 [x :foo {:weird-macro/setting true}]
 (inc x)) ;; type error

It now even understands that x is bound to a keyword, hence you get a type error when calling inc on it.

This feature is not yet released, but you can experiment with it in the branch for this issue. I’d like your feedback on this.

See the issue: https://github.com/borkdude/clj-kondo/issues/811

3 Likes

There are a number of macros that clojure-lsp users have wanted to handle, it would be useful to know if this system will be able to handle them:


https://clojure.github.io/clojure-contrib/branch-master/cond-api.html

slingshot.slingshot/try+
schema.core/defn
midje.sweet/fact
midje.sweet/facts
net.cgrand.enlive-html/deftemplate
schema.macros/try-catchall

Did a test for better-cond:

(ns foo
  {:clj-kondo/config '{:macroexpand {better.cond/cond
                                     "
(defn process-pairs [pairs]
  (loop [[[lhs rhs :as pair] & pairs] pairs
         new-body ['cond]]
    (if pair
      (cond
        (= 1 (count pair)) (seq (conj new-body lhs))
        (not (keyword? lhs))
        (recur pairs
               (conj new-body lhs rhs))
        (= :let lhs)
        (seq (conj new-body :else (list 'let rhs
                                       (process-pairs pairs)))))
      (seq new-body))))

(def f
  (fn [{:keys [:sexpr]}]
    (let [expr (let [args (rest sexpr)
                     pairs (partition-all 2 args)]
                 (process-pairs pairs))]
      {:sexpr (with-meta expr
                (meta sexpr))})))"}}}
  (:require [better.cond :as b]))

(let [x 10]
  (b/cond
    (= x 1) true
    :let [y (inc x)]      ;; binding is recognized
    (= 11 y) (subs y 0))) ;; yay, type error because y is not a string

Screenshot 2020-06-01 20.40.29

2 Likes

Does it work to quote the function? Instead of making it a string? Quoting would be nicer, because editors would still give you structural editing, highlighting, auto-complete, etc.

Also, what’s the idea behind having it take and return a map? Just to leave the door open in the future to pass it more keys or let it return more keys? Over say having it just take the sexpr as arg?

And one more thing, Spec is pretty powerful when it comes to performing the syntax validation of macros. Would it be possible to leverage it here? If a macro already defined a spec in the code for it? Could the spec be used to validate the call site?

Instead of making it a string?

Right now I’ve chosen the solution that when the code is a single line that doesn’t start with a paren, it’s interpreted as a file path relative to the .clj-kondo directory:

So "macroexpand/weird_macro.clj" resolves to .clj-kondo/macroexpand/weird_macro.clj where you then can provide the source for weird-macro.

The reason quoting doesn’t work is that this is part of the .clj-kondo/config.edn file and EDN doesn’t support arbitrary code.

Also, what’s the idea behind having it take and return a map? Just to leave the door open in the future to pass it more keys or let it return more keys? Over say having it just take the sexpr as arg?

Yes. Maybe the input map may also get access to locals. Maybe the result map can also expose :finding, a list of things that get reported by clj-kondo as warnings and errors.

Spec could be useful for validating macro syntax, but I’d like to keep that out of scope for now. The main problem that’s being addressed here is macroexpansion so clj-kondo understands custom macros.

Ya, those all make sense to me. For spec, I just brought it up since your examples were adding custom validation errors as well. So I’d say it looks pretty good to me.

Here is a demo for Rum. Note: the code is in a string only for the demo. In a real config it’s going to be placed inside its own file.

(ns foo
    {:clj-kondo/config '{:macroexpand {rum/defc "
(def f (fn [{:keys [:sexpr]}]
         (let [args (rest sexpr)
               component-name (first args)
               args (next args)
               body
               (loop [args* args
                      mixins []]
                 (if (seq args*)
                   (let [a (first args*)]
                     (if (vector? a)
                       (cons a (concat mixins (rest args*)))
                       (recur (rest args*)
                              (conj mixins a))))
                   args))]
           {:sexpr (with-meta (list* 'defn component-name body)
                     (meta sexpr))})))
"}}}
    (:require [rum]))

(rum/defc with-mixin
  < rum/static
  [context]
  [:div
   [:h1 "result"]
   [:pre (pr-str context)]])

(with-mixin 1) ;; no unresolved symbol
(with-mixin a a a a) ;; unresolved symbol and arity error for with-mixin

Screenshot 2020-06-02 16.03.21

One more thing I’m thinking.

Say you are a library author, what’s the way in which you would distribute those macroexpand? If you wanted to make your lib play nice with clj-kondo?

Would there be a systematic way? Or does it have to be a description in the readme?

Maybe macro writers can expose their expansion code in their source directory:
src/clj-kondo/macroexpand/mylibrary/namespace/weird_macro.clj
Same for configs:
src/clj-kondo/config.edn
Then clj-kondo can --collect these things into the users .clj-kondo directory maybe by appending the library configs to the user config and copying the macroexpansion code.

5 Likes

Perhaps I’m missing something, but isn’t this basically just providing another defmacro for the same form? Wouldn’t it make more sense just to run the actual defmacro code via SCI, and reject (or rather, don’t expand, which means you’re no worse off than currently) macros which can’t be run in that way? I imagine that most macros will be pure and should work via SCI, and those that won’t are presumably for reasons that mean you’d rather not expand them.

Regardless of this, I think that just expanding macros is not always useful since you can’t relate the expanded code back to the original source, which for Cursive’s use case at least is very important. I talked about this exact problem here. The TL;DR is that Racket can do this much more intelligently since its macroexpander is much more sophisticated. This may not be as important for clj-kondo though, I’m not sure.

I also feel pretty strongly that we should be moving away from parsing macro forms by hand and we should be using grammars to parse them - I spoke about that over here.

4 Likes

@colinfleming Thanks for taking the time to respond.

The reason I want people to provide a “stand-in” transformation function (which is similar to the macro, but can also be a way more simplified version of the macro):

  • clj-kondo has to be able to run with partial information. so even if it hasn’t analyzed the source of the original macro, it can still provide good feedback
  • performance: people can provide a simplified macro which will be more performant to expand than the original one
  • side effects: macros can have side effects (although not recommended), e.g. it can create vars at analysis time, etc. The stand-in macro won’t.
  • many times macros will refer to other namespaces in the codebase, or functions not supported in sci. clj-kondo doesn’t scan your filesystem for other files when linting a file, it just analyzes one file at a time.

Regardless of this, I think that just expanding macros is not always useful since you can’t relate the expanded code back to the original source

I’ve also run into that problem when experimenting more. E.g. when converting source nodes to a sexpr, numbers, strings and keywords lose their original location metadata. Instead of a macro-expanding transformation (from rewrite-clj source nodes to a sexpr to a transformed sexpr back into source nodes) I might move away from sexprs altogether and let people deal with the source nodes directly. Unfortunately it’s not as ergonomic as dealing with sexprs, but at least all the location information will be preserved correctly.

I’ll watch your talk links later in the weekend.

Preliminary docs:

Now released as clj-kondo v2020.06.21
See the release notes here.

Blogged about it here: https://blog.michielborkent.nl/2020/06/21/clj-kondo-hooks/