Top-Down, Imperative Clojure Architectures

thomascothran · July 23, 2024, 2:26am

I’ve written an article on the problem with imperative-style architectures in Clojure.

A preview:

When I first became interested in functional programming, a more experienced engineer told me: “you know functional programming doesn’t really amount to much more than procedural programming.” As I insisted on the benefits of map, filter and reduce, he simply shook his head. “You’re thinking in the small. Go look at a large real-world application.”

It took some time for me to see what he meant. My preferred language, Clojure, is a functional language. But too often it is used to build top-down, imperative applications. This negates the value proposition of functional programming: isolating side effects, local reasoning, and system composition.

Here’s the sort of application structure I have in mind:

Pure functions are indicated in green. Red indicates side effects.

Let me know what you think!

rtb · July 23, 2024, 6:55am

I’ve not worked with Clojure professionally but this looks like every web application I’ve ever worked on. Lots of tight coupling and jumping through hoops to test things.

I can see why it would be tempting to do the same thing in Clojure!

My team and I are constantly pulling in frameworks/libraries to get things done quickly. The downside is the concretion of imperative patterns in the codebase.

I look at our codebases after a while and struggle to reproduce state to see issues. There’s usually an expert in all the quirks of the system and the team are screwed when that person leaves.

From the outside I probably idealize Clojure but at least there’s an attempt to acknowledge the problems with imperative code.

maxweber · July 23, 2024, 7:03pm

Thanks a lot for sharing. How would you build a system that avoids becoming an imperative-style architecture?

thomascothran · July 23, 2024, 9:49pm

Probably a lot of different ways to do it. Things I’ve used in the past and like include a hexagonal style and event-sourcing. The former is more general purpose than the latter, and I’d be cautious about using event sourcing outside of certain specific domains.

Clojure Applied has some interesting examples that use queues (core.async channels) to decouple components.

maxweber · July 24, 2024, 8:20am

Thanks a lot for the book recommendation. I bought Clojure Applied today to read the chapters “Creating Components”, “Compose Your Application” and a few more.

I used core.async a lot, and I really like the concept. A few times we caused downtime on our production system by overlooking synchronous/blocking calls in a go block; we even found such an issue in the Datomic Client library. If enough requests hit your server, this starts to cause a global deadlock, since the complete thread pool of core.async is saturated with those blocking operations. The book states on page 102:

Threads are scarce and expensive resources. They consume stack space and other resources, and they’re comparatively slow to start. When these threads block for I/O, we waste those system resources.

While this is true for platform threads, it’s not for Java’s new virtual threads. Nowadays we prefer virtual threads over go blocks, since a virtual thread can free its underlying platform thread when a blocking operation is invoked in it. The JVM can also warn you about virtual thread pinning if, for example, a library is not yet prepared for virtual threads.
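
A minimal sketch of the virtual-thread approach described above, assuming JDK 21 or later. The names `slow-lookup` and `on-virtual-thread` are hypothetical, and `Thread/sleep` stands in for a real blocking I/O call:

```clojure
;; Assumes JDK 21+ (virtual threads). All names here are hypothetical.

(defn slow-lookup
  "Stand-in for a blocking call, e.g. an HTTP request or a database query."
  [id]
  (Thread/sleep 50)
  {:id id :status :ok})

(defn on-virtual-thread
  "Runs the zero-argument function f on a new virtual thread and returns a
  promise holding either its result or the Throwable it threw."
  [f]
  (let [result (promise)]
    (Thread/startVirtualThread
     (fn []
       (deliver result (try (f) (catch Throwable t t)))))
    result))

;; Start many blocking calls at once. While a virtual thread sleeps, the JVM
;; can unmount it, so the underlying platform thread is free to run others.
(def results
  (->> (range 100)
       (mapv (fn [id] (on-virtual-thread #(slow-lookup id))))
       (mapv deref)))
```

The same blocking call inside a go block would instead occupy one of the few threads in the core.async pool for the whole duration of the sleep.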

I also used event sourcing multiple times in the past, but somehow it always ended in a mess. In particular, it is less forgiving of domain modeling mistakes. In a database you might just migrate the current state to one that is compatible with your code. With event sourcing you either need to modify your “immutable” events, or your code always needs to know how to handle legacy events whenever you replay the events to calculate the current state of an aggregate. We prefer Datomic since it kind of provides the best of both worlds.
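
To make the legacy-event burden concrete, here is a small sketch of one common workaround: upgrading old events on read during replay. The event shapes, the `:version` key, and all function names are invented for illustration; nothing here comes from a specific event-sourcing library:

```clojure
;; Hypothetical events for an account aggregate. Version 1 events stored the
;; amount as whole dollars under :amount; version 2 uses :amount-cents.

(defn upcast
  "Upgrades a stored event to the newest schema without touching the log.
  Events without a :version are treated as version 1."
  [event]
  (if (= 1 (:version event 1))
    (-> event
        (assoc :version 2
               :amount-cents (* 100 (:amount event)))
        (dissoc :amount))
    event))

(defmulti apply-event
  "Folds one up-to-date event into the aggregate state."
  (fn [_state event] (:type event)))

(defmethod apply-event :deposited [state event]
  (update state :balance-cents + (:amount-cents event)))

(defmethod apply-event :withdrawn [state event]
  (update state :balance-cents - (:amount-cents event)))

(defn replay
  "Recomputes the current state from the full event history."
  [events]
  (reduce apply-event {:balance-cents 0} (map upcast events)))

(replay [{:type :deposited :version 1 :amount 10}
         {:type :withdrawn :version 2 :amount-cents 250}])
;; => {:balance-cents 750}
```

Every modeling change adds another branch to something like `upcast`, and that code has to be kept for as long as the old events exist.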

The biggest leap forward for me in recent years was this talk by David Nolen:

Here is a mini example of mine showing how one might split the interaction with an API into many small steps. While the world library is in part already obsolete compared to what we use for our SaaS, its README still describes my main modification to David’s approach. Each of our step functions always takes a map as input and returns it with additional entries. Subsequent functions should not modify existing map entries, to avoid creating downsides similar to those of global state.

Many people might think keeping the intermediate results is a waste of memory, but for us it’s super valuable to log this complete data (as nippy files) in the case of an exception. If you can take a look at the intermediate results, it becomes less challenging to understand and fix a bug that happened in production. This week I tried to solve a bug in a place where we do not yet capture the intermediate results, so I didn’t have the data returned by the third-party API. To get this data I needed to carefully assemble many things via a production REPL, which took quite some time and was a bit dangerous. In other cases you might never again have the chance to observe the relevant data; then you add a few more log statements and hope that the next time the bug occurs you have captured everything relevant. For that reason we just try to capture all the data with the described approach.

For people who lean towards typed programming languages, it might feel uncomfortable if your function gets passed a gigantic map. But for Clojure I think it’s a superpower, especially if you have tools like Portal to conveniently inspect larger data structures. And of course in Clojure data is king, so you can also use the REPL to inspect bigger maps with ease.
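
A rough sketch of the step-function shape described above. It is not the world library's API; `run-steps`, the step names, the `:http-fn` key, and the example URL are all hypothetical:

```clojure
;; Hypothetical steps; only the shape matters: context map in, larger map out.

(defn build-request [ctx]
  (assoc ctx :request {:method :get
                       :url    (str "https://api.example.com/users/"
                                    (:user-id ctx))}))

(defn call-api
  "The only side-effecting step. The HTTP function is looked up in the
  context (:http-fn is an assumed key), so a test can supply a stub."
  [ctx]
  (assoc ctx :response ((:http-fn ctx) (:request ctx))))

(defn parse-user [ctx]
  (assoc ctx :user (select-keys (get-in ctx [:response :body]) [:id :name])))

(defn run-steps
  "Threads ctx through steps. Rejects a step that changes or removes an
  existing entry, and attaches the last good context to any exception so
  the intermediate results can be logged."
  [ctx steps]
  (reduce
   (fn [ctx step]
     (let [ctx' (try
                  (step ctx)
                  (catch Exception e
                    (throw (ex-info "Step failed" {:step step :ctx ctx} e))))]
       (when-not (= ctx (select-keys ctx' (keys ctx)))
         (throw (ex-info "Step modified existing entries"
                         {:step step :ctx ctx})))
       ctx'))
   ctx
   steps))

;; Usage with a stubbed HTTP function:
(:user (run-steps {:user-id 42
                   :http-fn (fn [_request]
                              {:status 200
                               :body   {:id 42 :name "Ada" :role "admin"}})}
                  [build-request call-api parse-user]))
;; => {:id 42, :name "Ada"}
```

When a step throws, `(:ctx (ex-data e))` holds everything gathered up to that point, which is the data one would serialize for later inspection.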

We also use step functions to assemble our system like ring handlers, routes, etc.

However, while you can move all pure steps into a prepare phase, I still have not found a good way to make the overall system less imperative.

didibus · July 29, 2024, 6:39pm

You will probably like this thread: How are clojurians handling control flow on their projects? - #9 by didibus

I also recommend this paper: https://www.cse.chalmers.se/~rjmh/Papers/whyfp.pdf

In general, there are two approaches:

Keep the side-effects/imperative at the top layer
Inject the side-effects/imperative bits, so you can inject a pure alternative when needed (like in tests)
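
Both approaches can be sketched in a few lines. The reminder-email domain, the atoms standing in for a database and a mail server, and every function name below are made up for illustration:

```clojure
;; Hypothetical domain: deciding whether to send a reminder email.

(def thirty-days-ms (* 30 24 60 60 1000))

(defn reminder-decision
  "Pure core: given data, returns a description of what should happen."
  [user now-ms]
  (if (> (- now-ms (:last-login-ms user)) thirty-days-ms)
    {:action :send-email :to (:email user) :template :come-back}
    {:action :none}))

;; Stand-ins for a database and a mail server.
(def users  (atom {1 {:email "a@example.com" :last-login-ms 0}}))
(def outbox (atom []))

;; Approach 1: side effects stay in the top layer. The shell reads, calls
;; the pure core, then writes. Only reminder-decision needs unit tests.
(defn handle-reminder! [user-id]
  (let [user     (get @users user-id)
        decision (reminder-decision user (System/currentTimeMillis))]
    (when (= :send-email (:action decision))
      (swap! outbox conj decision))
    decision))

;; Approach 2: the effects are injected, so a test can pass pure
;; alternatives instead of real I/O.
(defn remind-with
  [{:keys [fetch-user send-email! now-ms]} user-id]
  (let [decision (reminder-decision (fetch-user user-id) (now-ms))]
    (when (= :send-email (:action decision))
      (send-email! decision))
    decision))

;; In a test, every injected function is deterministic:
(remind-with {:fetch-user  {7 {:email "b@example.com" :last-login-ms 0}}
              :send-email! (constantly nil)
              :now-ms      (constantly (inc thirty-days-ms))}
             7)
;; => {:action :send-email, :to "b@example.com", :template :come-back}
```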

Linus_Ericsson · August 1, 2024, 1:15pm

Socratic question: Where is your application logic placed in the application structure you outline above?

It is easy to end up building an application/infrastructure that essentially becomes a Strangler fig pattern around an unsuspecting SQL database, maybe with more exposed reified transactions. Is it a good idea to spend your time and effort re-inventing a transaction engine system with versioning? Well, if you do, Clojure is at least an exceptionally good tool to do so!

To really solve the problem of reading, updating and caching over maybe some transactional relational database and some key-value store for speed for your particular application, you will sooner or later have to build some transaction manager. This is a compiler, of sorts, that schedules the updates in a way which is correct enough for your use case. This is what transactional database systems do, and do great, but they are not always that easy to jack into external data sources like a cache layer.

Sooner or later you will want some kind of reified transactions and versioning (like Datomic) with some kind of transaction report queue which makes selective caching conceptually possible. Again, this is how transactional databases usually manage transactions internally.

In a Datomic-like model the “gets” will be versioned and can be cached in the application. The update logic for the transactional parts will be quite different and probably more low level than in SQL. Update logic for other data sources will still have to be programmed with care (a transaction compiler/side effect manager might not be a crazy idea, especially for the combinatorial explosion that is error handling).

In some sense it all boils down to which parts of the system have access to the transaction coordination mechanisms. The parts of the system that don’t have that information will have to put a lot of effort into implicitly making sure that the transaction log is shaped according to the incoming requests (or put that cognitive load on the users of the system).
