Organizing Clojure code - A real problem?

Interesting, Shantanu, thank you.
I find parallels between your post and the Clean Architecture as well. It looks like in the discipline of software engineering and architecture the same ideas keep getting hit on from different directions, similar to the blind men and an elephant parable.
This makes me wonder, too, if we’re still missing an essential truth which we have not reached yet, something which definitively “solves” this problem.
The fact that we have four or five, if not N+1, dependency injection and state / application management frameworks hints at this problem.
As of now (and I don’t think this has been properly solved in other languages, either), there is no way or model to enforce what we “know” is the correct layered structure of:

  • domain model
  • business logic
  • scaffolding / conveyance (actually moving data from A to B)

Where every layer uses only the one above it.
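
For a rough sense of how this layering maps onto a Clojure codebase, think of it as namespace dependencies (all names invented); nothing in the language enforces the arrows:

;; app.domain    - the data model; depends on nothing
;; app.logic     - pure functions over app.domain
;; app.transport - HTTP handlers, queues, schedulers; uses app.logic
;;
;; Nothing stops app.transport from requiring app.domain directly,
;; or app.logic from reaching into app.transport.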

If we look at Component, for example, it looks like the “correct” way to construct and model a system is with zero knowledge of business logic, i.e. the components are completely generic and are parameterized with their behavior.
This leaves us, however, with very funny-looking and obtuse components, which should also be very small:

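;; Lifecycle here is com.stuartsierra.component/Lifecycle;
;; start-server and stop-server stand in for an HTTP server library’s functions.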
(defrecord Server [server handler options]
  Lifecycle
  (start [this]
    (assoc this :server (start-server handler options)))
  (stop [this]
    (stop-server server)
    this))

This gets interesting when the handler itself depends on some state and dependencies. It’s cleaner to separate it out into its own component:

(defrecord Handler [handler make-handler options]
  Lifecycle
  (start [this]
    (assoc this :handler (make-handler this options)))
  (stop [this] this)
  clojure.lang.IFn
  (invoke [this req]
    (handler req))
  (invoke [this req resp raise]
    (handler req resp raise)))

Where make-handler knows how to extract the dependencies injected into this and construct the handler correctly.
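
For instance, a hypothetical make-handler along these lines (db and handle are invented names) pulls an injected dependency off the started component and closes over it:

(defn make-handler
  [{:keys [db]} options]
  ;; handle is a hypothetical request-handling fn that needs the db
  (fn [req]
    (handle db options req)))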
We can take it even further and look at how Aero and Component can work together, which leads to the question: why does every component need to take options as an argument? Since records are open maps, we can merge the configuration into each component in the system map and strip the components down like so:

(defrecord Server [server make-server]
  Lifecycle
  (start [this]
    (assoc this :server (make-server this)))
  (stop [this]
    (stop-server server)
    this))

(defrecord Handler [handler make-handler]
  Lifecycle
  (start [this]
    (assoc this :handler (make-handler this)))
  (stop [this] this)
  clojure.lang.IFn
  (invoke [this req]
    (handler req))
  (invoke [this req resp raise]
    (handler req resp raise)))
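
To make the Aero angle concrete, here is a minimal sketch using the stock Component and Aero APIs; the config shape, the configured-system name, and the config.edn path are all assumptions, and make-server / make-handler are the constructor functions shown further below:

(require '[aero.core :as aero]
         '[com.stuartsierra.component :as component])

;; assumes a config.edn shaped something like
;; {:server {:port 8080} :handler {:connection ,,,}}
(defn configured-system
  []
  (let [config (aero/read-config "config.edn")]
    (component/system-map
     :server (-> (map->Server {:make-server make-server})
                 (merge (:server config)) ; records are open, so config merges in
                 (component/using [:handler]))
     :handler (-> (map->Handler {:make-handler make-handler})
                  (merge (:handler config))))))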

This creates components which are completely and intentionally devoid of business logic, and frankly, of almost all implementation details. What makes the Server record special now? Or the Handler? Nothing:

(defrecord State [state]
  Lifecycle
  (start [this]
    (let [make-state (:make-state this)]
      (cond-> this
        make-state (assoc :state (make-state this)))))
  (stop [this]
    (let [stop-state (:stop-state this)]
      (cond-> this
        stop-state (assoc :state (stop-state this))))))

(defrecord Callback [state]
  Lifecycle
  (start [this]
    (let [make-state (:make-state this)]
      (cond-> this
        make-state (assoc :state (make-state this)))))
  (stop [this]
    (let [stop-state (:stop-state this)]
      (cond-> this
        stop-state (assoc :state (stop-state this)))))
  clojure.lang.IFn
  (invoke [this a]
    (state a))
  (invoke [this a b]
    (state a b))
  (invoke [this a b c]
    (state a b c)))

Then they have meaning only by way of how the system is organized, while components are defined solely in terms of behaviors they implement:

(configure
 (make-system
  :server (using (State. nil) {:make-state make-server
                               :stop-state stop-server
                               :handler :handler})
  :handler (using (Callback. nil) {:make-state make-handler
                                   :connection ,,,})))

(defn make-server
  [{:keys [handler] :as options}]
  (start-server handler options))

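;; server-stop is presumably the underlying server library’s stop fn;
;; stop-server here wraps it at the component level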
(defn stop-server
  [{:keys [state timeout]}]
  (server-stop state timeout))

(defn make-handler
  [{:keys [connection]}]
  (ring/handler ,,,))

But this seems like over-engineering to the umpteenth degree.

On top of that, you still have to make sure your functions are pure (read: don’t pass any component to a function which is not a “constructor”), and there is still no enforcement of the correct order of dependencies.
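
For illustration (all names invented), that discipline amounts to unpacking plain data at the impure edge, so the pure core never receives a component:

(defn price-with-tax
  [rate price]
  ;; pure: plain data in, plain data out
  (* price (inc rate)))

(defn handle-request
  [{:keys [config]} req]
  ;; the edge unpacks the component; the pure fn never sees it
  (price-with-tax (:tax-rate config) (:price req)))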

Polylith tries to enforce this correct organization by other means, at the cost of forgoing any approach to, or enforcement of, state management.

The more I think about this subject, the more questions I end up with.
Would love to hear everyone’s thoughts on this.


This is my issue with this library having been called “Component”. It is not meant to design a “component” in any way; it doesn’t represent a way to build abstractions like that. It simply allows you to work with a reloaded workflow. As far as I understand everything from Stuart Sierra, this was the goal of Component and the sole driver for it.

I say that because this whole idea of “Component”, I think, confused people: “component” is an established term in software architecture, and now we have this confusion of things. A component in an application should serve to solve a part of the domain problem so that the combination of components can drive user features. Thus, in software architecture, a component is not generic to the domain, though you can be in the domain of providing generic features.

The domain model is also confusing, because in DDD it is the combination of data + business logic.

I think in Clojure it is best to use a different mental model. First off, you have data, and you need to define how you will model data relevant to your domain and what your application should capture of it for what it intends to do.

I think here we’re all mostly using domain model to refer to data model, and maybe the latter would be more clear.

The data model will be the hardest thing to change, and it couples everything that uses it strongly together. Data dependencies are the most important, and I highly recommend data flow diagrams in that sense.

What data do you need to capture, how should you represent it, and where will it come from and go? That’s crucial to figure out. And because you can fail at this at first and can’t predict all future needs, it’s critical that your data modeling tool is flexible and can evolve, which is absolutely Clojure’s best strength.

Now it’s possible to have an implicit data model, in that you have no explicit definition of what data you have and how it is structured, no record of where it comes from or goes, etc. That’s especially easy in Clojure, unless you use Spec, Records, Schema, Malli, etc. Even then, it doesn’t mean you shouldn’t think about it.
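
As one small illustration, making such a model explicit with Spec could look like the following (the disbursement shape is invented; a Malli or Schema version would be analogous):

(require '[clojure.spec.alpha :as s])

(s/def ::amount number?)
(s/def ::currency #{:usd :eur})
(s/def ::disbursement (s/keys :req-un [::amount ::currency]))

(s/valid? ::disbursement {:amount 100.0 :currency :usd}) ;; => true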

Now that you have data, you should focus on “pure business logic”, which is purely the world of deriving more data from existing data and transforming data. Given some data, I restructure it into another shape. Given some data, I derive more data. For example, given a balance and a disbursement, I add the amount of the disbursement to the balance.
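
In code, that last example is nothing more than a pure function over plain data (names illustrative):

(defn apply-disbursement
  [balance disbursement]
  (+ balance (:amount disbursement)))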

Finally you come to the “impure business logic”. This is the world of moving data around from one place to another, and having the machine do things based on data. All the complexity lies here, though your problems start earlier if you have failed to separate this impure business logic from its pure part, or failed to model the data it will leverage properly.

Assuming you’ve succeeded at this split, the impure logic is best modeled as workflows or state machines (the two are sides of the same coin). Now this part is non-trivial, and I think maybe in Clojure we spend so much time teaching people to separate their pure business logic and to model their domain data that we forget to teach anyone how to build these impure workflows/pipelines/state machines.
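
As a sketch of such a workflow, with fetch-balance and store-balance! standing in for real effectful operations, and apply-disbursement being the pure step from above:

(defn disburse!
  [db account-id disbursement]
  (let [balance  (fetch-balance db account-id)             ; impure: read
        balance' (apply-disbursement balance disbursement) ; pure: derive
        _        (store-balance! db account-id balance')]  ; impure: write
    balance'))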

It is only in this latter part that dependency injection becomes a tool, or that libraries like Component come into play.

That is, apart from cross-cutting concerns, which are another aspect we maybe don’t educate enough about in the Clojure community. How do you log from within your pure business logic? How do you monitor its behavior at runtime? Etc.

Now in my opinion, the impure business logic implementation emerges automatically if you successfully modeled your data and separated the pure logic out of it.

But doing so is hard, and what tends to happen is that interleaving pure/impure causes people to break the split, which means future recombination of behavior becomes difficult, coupling appears, code becomes rigid and inflexible, and changes to one part break others in unexpected ways.

Reuse and parameterization are the other challenge. If you have 10 different user commands, but they all share 50% of the same process, do you create a sub-workflow or a sub-state-machine for that “shared” piece, and then use it inside the 10 others?

Or maybe, instead of having 10 workflows, one for each, you have one workflow with branching behavior based on the parameters passed?
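
Sketching both options with made-up step functions:

;; Option 1: factor the shared piece into a sub-workflow.
(defn shared-steps [ctx] (-> ctx validate enrich))

(defn command-a [ctx] (-> ctx shared-steps persist-a!))
(defn command-b [ctx] (-> ctx shared-steps persist-b!))

;; Option 2: one workflow that branches on a parameter.
(defn command [kind ctx]
  (let [ctx (shared-steps ctx)]
    (case kind
      :a (persist-a! ctx)
      :b (persist-b! ctx))))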

I don’t think there’s an easy way out here; that’s the challenge, and you need to make judgement calls, each option having its pros and cons. And oftentimes the complexity comes simply from accruing more and more user features.

That said, I’d love to see more discussion around code organization for these. How do people implement those workflows? Pipelines? State machines? In fact, do people prefer workflows over state machines, or vice versa? How do you organize them in the code? How do you pass dependencies between them? How do you interleave pure/impure behavior, how do you handle cross-cutting concerns, and how do you handle errors, retries, rollbacks, transactions, etc.?


@bsless You have a pertinent question. In my understanding, we are trying to figure out the parts and their composition, but do not have a definitive answer. Let me share my mental model on this:

Pure functions have this property of referential transparency, which makes it possible to compose them and to reason about them. Purity is kind of better understood than side effects, because the latter lack referential transparency and (arguably) cause greater uncertainty. We handle the uncertainty arising from side effects by anticipating failures and handling those cases in code. The bigger challenge we are talking about here is how to compose all those pure and impure parts to prepare the runtime: the composition tools we currently have are heavy (concept- and syntax-wise), brittle, invasive, etc. The only thing I would like to point out here is that, since the tools are not all created equal, it may be useful to evaluate them as such for the use cases at hand.

I think @didibus has addressed several aspects of this problem type in nuanced detail. He hits it home where he asks us to take decisions based on the situation and context. The open questions he raises are something I agree we should discuss more often. As a community, we can learn from our attempts to solve such problems.

In the Functional core, Imperative shell model (w.r.t. my blog post), the side effects come into play only in the imperative shell, which is made up of the Context initialization and Runtime phases. This is an opportunity to divide and conquer the problem: each phase can use focused approaches that suit it, e.g. the context initialization phase can probably afford a fail-fast approach, which leads to less composition overhead. The runtime phase, of course, needs better abstractions and mechanisms for complex cases. If we can offload more things to the context initialization phase, it automatically unburdens the runtime. I think Software Design for Flexibility and APL (the language) are good places to borrow composition ideas from when we need them.
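
For instance, a fail-fast context-initialization phase can be as blunt as the following sketch (connect-db, compile-routes, and the config keys are invented):

(defn init-context
  [config]
  ;; any failure here aborts startup immediately,
  ;; before the runtime phase ever begins
  {:db     (connect-db (:db config))
   :routes (compile-routes (:routes config))})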
