Our codebase has hundreds of derived Vars - How to make it REPL friendly?

Hi! Recently I’ve been learning more about efficient REPL Driven Development, and I’d like to change our codebase to be more REPL friendly.

But there’s one thing I’m struggling with: derived Vars.

Our codebase has tons of places where a map or collection refers to functions, things like:

(defn handle-one-fn ,,, )
(defn handle-other-fn ,,, )

(def handlers
  {:handle-one-thing handle-one-fn
   :handle-other-thing handle-other-fn})

(def routes
  [{:method :get
    :path "/path-to-endpoint"
    :handler endpoint-handler}
   ,,, ])

…and so on.

Until now this hasn’t been a problem, because we’ve been using the reloaded.repl workflow, but as our codebase has grown, reloading the namespaces has started to take quite some time.

I’m asking for advice: how should I deal with derived vars? Should I manually go through all the places where derived vars are used and add the var reader macro #' to them? Or is there another way to handle this?

I’m also wondering: does adding the #' reader macro have a performance effect? And one more question: how do I prevent someone from accidentally adding a new derived var in the future? Is there a clj-kondo linter for these? :)
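To make the problem concrete, here is a minimal sketch of how a derived var goes stale at the REPL. The function bodies are made up purely for illustration:

```clojure
(defn handle-one-fn [] :old)

;; The map captures the function object that handle-one-fn holds right now.
(def handlers
  {:handle-one-thing handle-one-fn})

;; Re-evaluating the defn at the REPL rebinds the Var to a new function...
(defn handle-one-fn [] :new)

;; ...but the map still holds the old function object.
((:handle-one-thing handlers)) ;; => :old
(handle-one-fn)                ;; => :new
```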

One way to deal with this that isn’t too noisy is wrapping the functions in anonymous functions.

(def routes
  [{:method :get
    :path "/path-to-endpoint"
    :handler endpoint-handler}
   ,,, ])

instead becomes

(def routes
  [{:method :get
    :path "/path-to-endpoint"
    :handler #(endpoint-handler %)}
   ,,, ])

;; or

(def routes
  [{:method :get
    :path "/path-to-endpoint"
    :handler (fn [x] (endpoint-handler x))}
   ,,, ])

It might get a bit tedious with many arguments, but it works reliably, since the wrapper always calls the latest fn in the var instead of putting the fn itself into the map.
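A quick way to check this at the REPL, as a sketch with a made-up handler body and without direct linking enabled:

```clojure
(defn endpoint-handler [req] {:status 200 :body "v1"})

(def routes
  [{:method :get
    :path "/path-to-endpoint"
    :handler #(endpoint-handler %)}])

;; Redefine the handler without touching routes.
(defn endpoint-handler [req] {:status 200 :body "v2"})

;; The wrapper looks up endpoint-handler on every call, so the new version runs.
(:body ((:handler (first routes)) {})) ;; => "v2"
```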


Yes, a small overhead, because you’re introducing an extra level of indirection – but that’s also true of @thheller’s suggestion to wrap functions in anonymous functions.

I personally prefer the #' style, although I don’t know if that’s portable to ClojureScript. The function wrapping approach is portable across both.
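For comparison, a sketch of the #' style on the same routes example (JVM Clojure; the handler body is invented for illustration). A Var can be called like a function, and it dereferences itself on each call:

```clojure
(defn endpoint-handler [req] {:status 200 :body "v1"})

(def routes
  [{:method :get
    :path "/path-to-endpoint"
    :handler #'endpoint-handler}])

;; Redefine the handler; routes is left alone.
(defn endpoint-handler [req] {:status 200 :body "v2"})

;; The map holds the Var, not the fn, so the call sees the new definition.
(let [handler (:handler (first routes))]
  (:body (handler {:uri "/path-to-endpoint"}))) ;; => "v2"
```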

At work, we use the #' approach when we are building routes and middleware, but to make middleware fully REPL-friendly, the function it returns needs to delegate to a named function rather than doing all the work in an anonymous one:

;; normal middleware pattern
(defn wrap-foo [handler]
  (fn [req]
    ;; ... code to manipulate req, call handler, and manipulate the response ...
    ))

;; needs to become:
(defn- handle-foo-request [handler req]
  ;; ... code to manipulate req, call handler, and manipulate the response ...
  )

(defn wrap-foo [handler]
  (fn [req]
    (handle-foo-request handler req)))

That ensures that you can still redefine handle-foo-request via the REPL and have that change take effect while the system is running. (A change to wrap-foo will not take effect, because once the middleware stack is assembled, it’s full of closures that each accept req.)
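A small sketch of that behaviour, with a made-up body for handle-foo-request and a stand-in inner handler:

```clojure
(defn- handle-foo-request [handler req]
  (assoc (handler req) :foo "v1"))

(defn wrap-foo [handler]
  (fn [req]
    (handle-foo-request handler req)))

;; Assemble the "stack" once, as a running system would.
(def app (wrap-foo (fn [req] {:status 200})))

(app {}) ;; => {:status 200, :foo "v1"}

;; Redefine only the named function; app is not rebuilt.
(defn- handle-foo-request [handler req]
  (assoc (handler req) :foo "v2"))

(app {}) ;; => {:status 200, :foo "v2"}
```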


#' should definitely be avoided in CLJS. It creates a “temporary” var with lots of “garbage” metadata that cannot be removed by :advanced compilation. If you have many of those they’ll definitely have an impact on build size. Performance not so much though.


It depends on the workflow and how you’re using the namespaces. For JVM codebases, a quick and dirty way is:

(require '<working-ns> :reload)

This will reload the given namespace. If you’re only working in one namespace (preferably at the top level), it will reinitialise that namespace. Derived vars may also be created with custom macros, so reloading the namespace will solve that problem as well.

If you are working on a namespace that is being depended on by another top-level namespace, you can do:

(do (require '<working-ns> :reload)
    (require '<toplevel-ns> :reload))

If you need to do it often enough and most of the derived vars have been isolated into say 3 or 4 files, then it’s easy to extract the above code into another function that you use in your workflow.
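As a rough sketch, such a helper could look like this. The name reload-in-order and the namespaces in the usage comment are placeholders, not anything from a real project:

```clojure
(defn reload-in-order
  "Reloads each namespace symbol in the order given, so list
  dependencies before the namespaces that depend on them."
  [& ns-syms]
  (doseq [ns-sym ns-syms]
    (require ns-sym :reload)))

(comment
  ;; hypothetical namespaces: handlers first, then the routes that use them
  (reload-in-order 'my.app.handlers 'my.app.routes))
```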

If you want something fancier and more generic, you can use a library to get all namespace dependents/dependencies of a given namespace and call require with :reload on them, in the order they were initially loaded, to refresh all the vars.


Yes for functions, because #' only works transparently for Vars pointing to functions: only a Var in function position gets recursively auto-dereferenced when it is called. If you use a Var anywhere else, you get the Var itself and need to dereference it yourself:

(def a 10)
(def b #'a)

(+ 1 a) ;; 11
(+ 1 b) ;; Error
(+ 1 @b) ;; 11

(defn a [] 10)
(def b #'a)

(+ 1 (a)) ;; 11
(+ 1 (b)) ;; 11
(+ 1 (@b)) ;; 11

It also gets weird for macros. If you invoke them through the Var, you’re using them like functions, and you need to pass the implicit &form and &env arguments as well:

(defmacro a [e] `(+ 1 ~e))

(a 2) ;; 3
(#'a 2) ;; Wrong arity
(#'a nil nil 2) ;; (clojure.core/+ 1 2)

And as you see, when used as a function it returns you the expanded code, instead of replacing the code by the returned code and returning the result of evaluating that.

That’s why for derived Vars that derive from a Var containing a value it is better to make it into a function:

(def a 10)
(defn b [] a)

(+ 1 a) ;; 11
(+ 1 (b)) ;; 11

(def a 11)

(+ 1 a) ;; 12
(+ 1 (b)) ;; 12

In that sense, wrapping in a function always works: it works for values, for functions, and even for macros. And as @thheller said, it seems to be the better convention in ClojureScript as well. Beyond that it comes down to personal preference. For functions, adding #' is definitely shorter visually and maybe clearer, but you need to remember to wrap in functions for values and macros.

It does, it’s one additional lookup each time, and I don’t believe it gets removed even if you use direct-linking.

I’d be curious to benchmark against wrapping in functions with direct linking turned on. The JIT does some function inlining, so I wonder if it could get rid of the indirection when optimizing for functions, but couldn’t for Vars.

I don’t think there is one, at least not that I know of.

Personally, I use #' for functions and I just reload namespaces for everything else, and I never really expect people to pass in macros like that so never had to care. But now that you mentioned it, maybe I should consider changing my style to defaulting to wrapping in functions.

Also, for reloaded workflows, a shout-out to aroemers/redelay on GitHub, a Clojure library for first-class lifecycle-managed state, as a good solution for derived state at the REPL.

Edit: And I guess if you wanted to be fancy, you could have a macro that wraps in functions at dev time, but that just passes the thing as-is at prod time.
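A minimal sketch of what such a macro might look like. The macro name reloadable and the "app.dev" system property are assumptions made up for this example; any compile-time flag would do:

```clojure
;; Assumption: a JVM system property named "app.dev" marks a dev build.
(def dev? (= "true" (System/getProperty "app.dev")))

(defmacro reloadable
  "In dev, expands to a wrapper fn that looks up f on every call.
  Otherwise expands to f unchanged, so there is no extra indirection."
  [f]
  (if dev?
    `(fn [& args#] (apply ~f args#))
    f))

(defn endpoint-handler [req] {:status 200})

(def routes
  [{:method :get
    :path "/path-to-endpoint"
    :handler (reloadable endpoint-handler)}])
```

The flag is read when the macro expands, so the choice between the two forms is made at compile time, not per call.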


The lookups are not that big a deal, adding minimal overhead.

(defn add [a b] (+ a b))

(with-out-str
  (time (dotimes [i 10000000]
          (add 1 1))))
=> "\"Elapsed time: 261.616819 msecs\"\n"

(with-out-str
  (time (dotimes [i 10000000]
          (#'add 1 1))))
=> "\"Elapsed time: 291.486518 msecs\"\n"

Custom macros need special care because they operate at compile time. If the macros change (especially if they do any kind of code rewriting), then any vars that reference the macro need to be reinitialised in order for the change to propagate.


Thanks for all the answers!

I think I’ll go with #'. I think it’s a bit less noisy syntax and works with any function arity. I don’t really have to worry about portability to CLJS. And I think the issue of vars becoming obsolete has mainly happened with functions, not with other values. So #' it is.

It seems so! I did a quick test and compiled a jar with direct linking turned on, and indeed, the #' version was still slower than a plain function call. I wonder why that is…

Yeah, this is a nice idea. It would eliminate the performance impact, but on the other hand, if the impact is only minimal, maybe it’s not worth it :)

Yeah, that’s a shame but also something I kinda expected. I was hoping for something that could solve this issue once and for all. It’s pretty easy to accidentally add a derived var without even noticing it. And then you only notice it when you change a function, evaluate it, try to run some higher-level code that uses that function, see the old behavior, and think “why the heck didn’t this change?” :)

A clj-kondo linter for derived vars could be neat, but I’m not even sure if that is possible/feasible.

I’m not an expert in linters, but I feel it could catch some of these cases, for example by flagging a reference to a non-function var inside a def, or a reference to a function var inside a map {} literal.
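To illustrate the second heuristic, here is a rough runtime sketch. It is not a clj-kondo linter and does not use clj-kondo’s API; it just walks a quoted form at the REPL. The helper names are made up, and it relies on resolve, so it only sees vars already loaded in the current namespace:

```clojure
(defn- fn-var-symbol?
  "True if sym resolves to a Var that currently holds a non-macro function."
  [sym]
  (let [v (resolve sym)]
    (and (var? v)
         (not (:macro (meta v)))
         (fn? @v))))

(defn bare-fn-values
  "Returns symbols used directly as map values in form that resolve to
  function Vars, i.e. candidates for #' or a wrapper fn."
  [form]
  (->> (tree-seq coll? seq form)
       (filter map?)
       (mapcat vals)
       (filter symbol?)
       (filter fn-var-symbol?)))

(bare-fn-values
 '(def routes [{:handler inc} {:handler #'inc} {:handler #(inc %)}]))
;; => (inc)
```

Only the bare inc is reported; the #' and wrapped versions read as lists, not symbols, so they pass.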
