Macro calling a user-provided function at expansion time

emlyn · December 16, 2019, 11:06am

What’s the best way to pass a function into a macro, so that it can be called at macro-expansion time?

For context, I’m trying to develop the idea posted by @plexus here. Basically it’s a macro that takes a Java class and generates Clojure functions for all the public methods. I’d like to provide the ability to filter the methods during macro expansion to reduce the number of generated functions, in cases when not all of them are needed.

I’ve distilled the problem down, and basically I want to do something like:

(defn things [_] ["abc" "bcd" "cde" "xyz"]) ;; get things to define
(defn thing-name [thing] (symbol thing)) ;; given a thing, what should the name be?
(defn thing-body [thing] thing) ;; given a thing, what should the definition be?

(defmacro defthings [klazz filter-fn]
  `(do ~@(->> (things klazz) ;; get the list of things
              (filter (resolve filter-fn)) ;; only keep the ones we actually need
              (map (fn [thing] ;; generate definitions for them
                     `(def ~(thing-name thing) ~(thing-body thing)))))))

This works when filter-fn is a var:

(defn my-filter [thing] (re-find #"b" thing))

(macroexpand-1 '(defthings nil my-filter))
;; (do (def abc "abc") (def bcd "bcd"))

But not if I try to use any more complex expression, such as a function literal:

(macroexpand-1 '(defthings nil #(re-find #"b" %)))
;; Unexpected error (ClassCastException) macroexpanding defthings at (REPL:1:1).
;; clojure.lang.PersistentList cannot be cast to clojure.lang.Symbol

Obviously, filter-fn is no longer a symbol, so it can’t be resolved. If I replace (resolve filter-fn) with (if (symbol? filter-fn) (resolve filter-fn) filter-fn) so that it’s only resolved when it is a symbol, that still fails, but with a different error:

(defmacro defthings [klazz filter-fn]
  `(do ~@(->> (things klazz)
              (filter (if (symbol? filter-fn) (resolve filter-fn) filter-fn))
              (map (fn [thing]
                     `(def ~(thing-name thing) ~(thing-body thing)))))))

(macroexpand-1 '(defthings nil #(re-find #"b" %)))
;; Error printing return value (ClassCastException) at clojure.core/filter$fn (core.clj:2817).
;; clojure.lang.PersistentList cannot be cast to clojure.lang.IFn

Now the function definition is being treated as its unevaluated form. So maybe I have to eval it?

(defmacro defthings [klazz filter-fn]
  `(do ~@(->> (things klazz)
              (filter (eval filter-fn))
              (map (fn [thing]
                     `(def ~(thing-name thing) ~(thing-body thing)))))))

(macroexpand-1 '(defthings nil #(re-find #"b" %)))
;; (do (def abc "abc") (def bcd "bcd"))
(macroexpand-1 '(defthings nil my-filter))
;; (do (def abc "abc") (def bcd "bcd"))

That looks much better, but because eval uses an empty lexical environment, this still fails if the expression uses anything from a surrounding let binding:

(defn make-filter [pattern]
  (fn [thing] (re-find (re-pattern pattern) thing)))

(let [pat "b"]
  (macroexpand-1 '(defthings nil (make-filter pat))))
;; Syntax error compiling at (REPL:2:34).
;; Unable to resolve symbol: pat in this context

Although this almost works, it doesn’t feel right. Is there a better way to achieve what I want?

The other option would be to use a dynamic var. It would mean any calls to the macro that require a filter-fn would have to be wrapped in a binding form, but by its nature that would mean that no matter how the function is defined, it will have been evaluated by the time the macro sees it:

(def ^:dynamic *filter-fn* (constantly true))

(defmacro defthings [klazz]
  `(do ~@(->> (things klazz)
              (filter *filter-fn*)
              (map (fn [thing]
                     `(def ~(thing-name thing) ~(thing-body thing)))))))

(let [pat "b"]
  (binding [*filter-fn* (make-filter pat)]
    (macroexpand-1 '(defthings nil))))
;; (do (def abc "abc") (def bcd "bcd"))

So, in describing my problem, I think I’ve convinced myself that using a dynamic var is the best way forward. But if anyone has any thoughts on this, or can think of a better way, I’d be interested to hear them.

plexus · December 16, 2019, 11:19am

I don’t think there’s a reliable way to do what you are trying to do. Your macro gets expanded before the function is ever evaluated, so you can’t access runtime values that only exist at evaluation time. It’s a chicken and egg problem.

Your last example works because you are calling macroexpand-1 explicitly at runtime. Does it still work without the macroexpand? I don’t think so.

Consider this:

(defn foo [filter]
  (binding [*filter-fn* filter]
    (defthings nil)))

This gets expanded well before foo ever gets called, so it can’t yet know what value filter will hold.
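A minimal sketch to illustrate the timing. The `expansion-log` atom, the `note-expansion` macro and `timing-demo` are hypothetical names made up for this example, not part of the code above. The macro body records a label when it is expanded, which happens while the enclosing `defn` is being compiled:

```clojure
;; Hypothetical log of macro expansions, for illustration only.
(def expansion-log (atom []))

(defmacro note-expansion [label]
  ;; This side effect runs at macro-expansion time, not when the
  ;; surrounding function is called.
  (swap! expansion-log conj label)
  nil)

(defn timing-demo []
  (note-expansion :inside-timing-demo)
  :called)

;; The log already contains :inside-timing-demo here, although
;; timing-demo has not been called yet.
(def log-before-call @expansion-log)

(timing-demo)

;; Calling the function does not expand the macro again, so the
;; log is unchanged.
(def log-after-call @expansion-log)
```

The same applies to `*filter-fn*`: the macro reads whatever value the var has while `foo` is being compiled, not the value bound when `foo` runs.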

I think if you really want to do this you will have to limit it to a subset of use cases, where you don’t rely on runtime values.

emlyn · December 16, 2019, 11:39am

Thanks, you’re right. Even without it being inside a function, it still ignores the filter:

(let [pat "b"]
  (binding [*filter-fn* (make-filter pat)]
    (defthings nil)))
;; #'user/xyz

All of the things have been defined (including the last one, xyz, which is returned to the REPL).

I’m trying to use this to wrap the Apache Spark API, where many methods have a Java-specific and a Scala-specific version (e.g. one taking a java.util.List and another taking a scala.collection.Seq), so I want to be able to filter out the Scala-specific ones to reduce clutter.
For that I don’t actually need any runtime values, so I guess I can just require that the filter-fn is a symbol resolving to a var, and fail if anything else is passed in.

plexus · December 16, 2019, 12:12pm

Even then you’ll have to be careful, as the function needs to be fully defined by the time your macro is expanded. This would fail:

(declare filter-fn)

(defthings ... filter-fn)

(defn filter-fn [...] ...)

If it’s just for your own use case then that’s probably fine, but be aware that you’re doing things that will violate the principle of least surprise.
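To see why the `declare` case is a problem, here is a small sketch using a hypothetical `later-filter` in place of the elided `filter-fn` above. `declare` creates the var, so the symbol resolves, but the var stays unbound until the `defn` is evaluated:

```clojure
(declare later-filter)

;; The var exists, so resolve succeeds, but it holds no value yet.
(def bound-before? (bound? (resolve 'later-filter)))

(defn later-filter [thing] (re-find #"b" thing))

;; Once the defn has been evaluated, the var holds the function.
(def bound-after? (bound? (resolve 'later-filter)))

[bound-before? bound-after?]
;; [false true]
```

A macro expanded between the `declare` and the `defn` would find the var but have nothing to call.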

emlyn · December 16, 2019, 1:01pm

Thanks, I accept that care has to be taken here, but I do think that having some way to filter the methods would be useful in general, and accepting a predicate function is the only reasonable way I can think of to achieve that.

I think as long as it’s clearly documented, it’s not so surprising that the function must be fully defined before being used, as it is affecting the compile-time generation of functions, not the run-time behaviour of them. I can always check that the var is bound to a fn at expansion time, and throw a meaningful error if not.
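A minimal sketch of what that check could look like, assuming the filter is always passed as a symbol. `resolve-filter-fn` is a hypothetical helper name and the error messages are only illustrative; `things`, `thing-name` and `thing-body` are the placeholder functions from my first post:

```clojure
;; Placeholder helpers from the first post.
(defn things [_] ["abc" "bcd" "cde" "xyz"])
(defn thing-name [thing] (symbol thing))
(defn thing-body [thing] thing)

(defn resolve-filter-fn
  "Hypothetical helper: turns the unevaluated macro argument into a
  function at expansion time, or throws an ex-info explaining why not."
  [form]
  (let [v (when (symbol? form) (resolve form))]
    (cond
      (not (symbol? form))
      (throw (ex-info "defthings: filter must be a symbol naming a var"
                      {:form form}))

      (not (var? v))
      (throw (ex-info "defthings: filter symbol does not resolve to a var"
                      {:symbol form}))

      (not (bound? v))
      (throw (ex-info "defthings: filter var has no value yet"
                      {:var v}))

      (not (fn? @v))
      (throw (ex-info "defthings: filter var does not hold a function"
                      {:var v}))

      :else @v)))

(defmacro defthings [klazz filter-sym]
  ;; Validate and dereference the filter before generating any defs.
  (let [keep? (resolve-filter-fn filter-sym)]
    `(do ~@(->> (things klazz)
                (filter keep?)
                (map (fn [thing]
                       `(def ~(thing-name thing) ~(thing-body thing))))))))

(defn my-filter [thing] (re-find #"b" thing))

(macroexpand-1 '(defthings nil my-filter))
;; (do (def abc "abc") (def bcd "bcd"))
```

With this version a function literal, an unresolvable symbol, or a var that has only been declared all fail at expansion time with a message naming the problem, instead of a ClassCastException from inside filter.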
