Low hanging fruit
A recent exploration into performance tuning illuminated some things that seem approachable.
The Clojure compiler could be doing a lot more to help tweak code, with something similar to Common Lisp’s declarations for optimization, hinting, etc.
Hinted destructuring
One area that leads to “death by a thousand cuts” is map destructuring (and destructuring in general). These are idiomatic choices that the language encourages, but they bring overhead because the emitted code defaults to polymorphic get/nth accesses. You end up having to unroll your previously idiomatic, nicely destructured code into something bulkier and arguably harder to maintain. Recent performance tweaking in response to Nikita’s post on Rust seemingly trouncing Clojure shows that we can do better with a macro that lets the user inject type hints into destructuring binds, so that the compiler (in this case the macro) can emit faster direct method calls (e.g. direct record/type field access, .nth for indexed types, and .get for maps). This is a naive implementation that requires more user intervention, but it yielded significant results in practice.
Given how prevalent destructuring is inside small functions that end up on hot paths (say inside a reduce or map or whatever), even something as simple as allowing a hint that an arg supports Indexed operations is likely a simple but useful win. In my naive macro, I also have the option of emitting warnings for users when slow calls to get/nth are emitted (along with suggestions for hinting).
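To make the overhead concrete, here is a minimal sketch of the kind of hand-unrolling described above. It is not the macro itself; the names Point, shift-idiomatic, and shift-unrolled are made up for illustration. The first function relies on ordinary destructuring (which goes through get and nth), while the second uses type hints so that field access and .nth calls are resolved directly.

```clojure
(defrecord Point [x y])

;; Idiomatic: {:keys [x y]} expands into polymorphic clojure.core/get calls,
;; and [dx dy] expands into polymorphic clojure.core/nth calls.
(defn shift-idiomatic [{:keys [x y]} [dx dy]]
  (+ x dx y dy))

;; Hand-unrolled: the hints let the compiler emit direct field reads on the
;; record and direct Indexed.nth calls on the vector, with no reflection.
(defn shift-unrolled [^Point p ^clojure.lang.Indexed v]
  (let [x  (.-x p)
        y  (.-y p)
        dx (.nth v 0)
        dy (.nth v 1)]
    (+ x dx y dy)))

(shift-idiomatic (->Point 1 2) [3 4])
(shift-unrolled  (->Point 1 2) [3 4])
```

Both calls compute the same value; the second is roughly what a hinted destructuring macro would be expected to generate on the user’s behalf.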
Structural Hashing of Records
This ended up being a drag on performance, since there was a lookup table built on “point” records (x,y coords). The original implementation spent a ton of time just hashing points and comparing them. Switching to a custom point type with its own hashcode worked out. No idea if this can be improved in general, but it was a non-obvious weak spot that emerged in profiling. I could maybe see some smarter defrecord logic that detects if the static fields are numeric (e.g. a cartesian point or other representation) and emits a default “fast” structural hash based on the numeric fields (assuming the non-static entry map is empty).
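As a rough illustration of the “custom point type with its own hashcode” idea, the sketch below defines a deftype with primitive long fields and a cheap arithmetic hash. The name FastPoint and the 31*x + y mixing function are assumptions chosen for brevity, not the type used in the original tuning work; a real implementation would want a mixing function suited to its key distribution.

```clojure
(deftype FastPoint [^long x ^long y]
  Object
  ;; Cheap arithmetic hash over the two primitive fields (illustrative only).
  (hashCode [_]
    (unchecked-int (unchecked-add (unchecked-multiply 31 x) y)))
  (equals [_ other]
    (and (instance? FastPoint other)
         (== x (.-x ^FastPoint other))
         (== y (.-y ^FastPoint other))))
  ;; Clojure's hash maps and sets call hasheq when a key implements IHashEq.
  clojure.lang.IHashEq
  (hasheq [_]
    (unchecked-int (unchecked-add (unchecked-multiply 31 x) y))))

;; Usable as a key in an ordinary persistent map.
(def lookup (hash-map (FastPoint. 1 2) :hit))
(get lookup (FastPoint. 1 2))
```

The trade-off is that a deftype gives up the map-like behavior of a record, so this only makes sense where the point is used purely as a key or value.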
Field Access for Records and valAt
The defrecord implementation emits a case form that dispatches on the key provided to quickly determine whether a static field is being looked up, returning the val in constant time if so. This is slower than just invoking direct field access on the record, but faster than paying the price of hashing the key (since keywords can be checked via identity fast). Plumbing this further… for “small” records, where “small” implies a set of static fields <= 8, it actually appears faster to use condp identical? in lieu of the case implementation, since the constants appear to favor a small linear sequence of identical? calls rather than the identity-based hash lookup that case seems to emit.
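A minimal sketch of the condp identical? variant, written as a standalone deftype rather than a patch to defrecord. The name SmallRec is hypothetical, and the sketch omits the extension-map fallback that a real record would need for non-static keys.

```clojure
(deftype SmallRec [a b c]
  clojure.lang.ILookup
  (valAt [this k] (.valAt this k nil))
  (valAt [_ k not-found]
    ;; Keywords are interned, so identical? is a plain reference comparison.
    ;; For a handful of fields this is a short linear chain of checks.
    (condp identical? k
      :a a
      :b b
      :c c
      not-found)))

(def r (SmallRec. 1 2 3))
(:b r)
(get r :missing :none)
```

Whether this actually beats case for a given record size is something to measure on the target JVM rather than assume.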
Less Lower Hanging Fruit
Interpreter for use with Graal for Unhobbled Native Image Stuff
Eval is currently the bane of AOT native compilation. There is no infrastructure in place to opt in to a low-performance interpreter that can be used with Graal while retaining access to all of Clojure’s features.
There was a master’s thesis on Truffle Clojure published a while back that provides a non-trivial implementation of Clojure in the Truffle language framework (e.g. Java-based AST definitions). No repo was ever published, but there is code inline in the thesis. This seems like a compelling option for bootstrapping something compatible with Graal’s native-image. I think if we had a Truffle implementation, you’d get an interpreter (plus the JIT optimizing compiler) all in one. No idea how much pain/incompatibility this would introduce, but it seems like a way to broaden the scope of applicable native-image apps.
Babashka kind of does this, but it intentionally provides a very limited set of interpretable Clojure.
Smaller, more portable implementation core
Thanks to the work done on ClojureScript’s largely protocol-based clojure.core, tools.reader, and tools.analyzer, we have pretty much everything needed to bootstrap a Clojure implementation in a new host fairly quickly. Assuming you can get the pre-reqs for reading and evaluating these libraries into place in your proto-clojure, you should be able to bootstrap Clojure relatively easily (for various definitions of “easy”). I’ve been experimenting off-and-on with doing that with Common Lisp over the years as time permits and learning allows (also as the aforementioned resources came into being), and it’s looking like a viable way to port Clojure to new hosts - some of which, like CL, address problems like native-image out of the box (and allow for more advanced compilation features, like SBCL’s type inference engine and fine-grained performance options).
One thing that’s obvious during the porting process (I think this was learned during the CLJS port) is that there’s likely a very simple lower-level lisp that we could reduce Clojure into. cljs.core provides most of Clojure (assuming you have functions, protocols, types, etc.). It’d be interesting as a research project to distill Clojure into an implementation based on one or more of these simpler lisps and provide a minimal substrate for the host to implement for bootstrapping. tools.reader and the analyzer could similarly be distilled, to provide a fast way to get environments set up. I could envision similar ports to hosts with really nice ecosystems like Julia and any of the Schemes.
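To give a feel for what “assuming you have functions, protocols, types” buys you, here is a toy sketch in the protocol-based style that cljs.core uses. Everything here (ISeqLike, Cell, my-reduce) is a hypothetical stand-in rather than code from cljs.core: the host only has to supply the protocol and type machinery, and sequence functions can then be written on top of it in the language itself.

```clojure
;; The "substrate": one protocol and one concrete type.
(defprotocol ISeqLike
  (-first [coll])
  (-rest [coll]))

(deftype Cell [head tail]
  ISeqLike
  (-first [_] head)
  (-rest [_] tail))

;; Everything below needs only functions plus the protocol above.
(defn my-reduce [f init coll]
  (loop [acc init, c coll]
    (if (nil? c)
      acc
      (recur (f acc (-first c)) (-rest c)))))

(defn my-count [coll]
  (my-reduce (fn [n _] (inc n)) 0 coll))

(def cells (Cell. 1 (Cell. 2 (Cell. 3 nil))))
(my-reduce + 0 cells)
(my-count cells)
```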
So, perhaps spending some brain power making Clojure simpler to implement would be a “good thing” for extending reach (just as CLJS did) to new hosts where it makes sense.