Organizing Clojure code - A real problem?

ericnormand · April 22, 2021, 2:52pm

I have a question that you may be able to help me with. I will give a bit of background, ask the question, then give some of my thoughts on the answer. I hope it starts a discussion as I think it is an important topic.

Background

I am often asked, “How do you organize Clojure code?” They say they’re writing some code and it’s starting to get hard to manage. I don’t think I have a good answer. I want to say:

“I start with one file. As it gets too big, I try to break it up into cohesive namespaces.”

Somehow that doesn’t satisfy the asker. But I don’t really have any better answer. It works for me.

But I get asked this question often enough that it seems to be a real problem for people. There might be a helpful course in it.

The question

So, the question is: What is the answer to organizing code in Clojure?

I would love your answers, either arm-chair or from direct experience.

Exploration

I’ve thought about this question a lot. I’d like to share some of the ideas I’ve explored. None of them, to me, seem to be the real solution. I am very open to new potential solutions and critiques of these ideas.

1. Many people are used to lots of structure

A lot of people learned to build applications within a large framework such as Rails. In a Rails app, you use Model-View-Controller (MVC) to build your application. What MVC means in practice is that you characterize parts of your code as either M, V, or C, and then you always know where to put it.

In contrast, when coming to Clojure, there is no such given structure. In fact, you just write a -main function and write whatever you want. Are they uncomfortable with the amount of freedom? Are we, unbeknownst to us, using some structure that has just not been made explicit? I get the impression that learning MVC was an epiphany and they want a similar epiphany in Clojure.

2. They’re writing bad Clojure

After being asked how to structure Clojure code, I have on occasion asked to look at their project. Invariably, when I open the file, I can immediately see 2x-5x code size improvements. They’re missing some very fundamental ideas, like using map, filter, and reduce chains. They’re not using the standard library.

Another common problem is “over modeling”. What could be a simple “move data from this table format to that table format” is over-engineered. Entities are defined, with corresponding operations, that are totally unnecessary to the task.

In short, if I had written it, it might be 100 lines, in one file. There wouldn’t be an organizational problem. I’m not trying to brag, but it explains why I can’t see the problem. They see it as “well, my code is a mess, I should structure it better” and I see it as “use the standard library, powerful idioms, and only model when it’s useful”.

How can we get them to see how to improve their code instead of trying to “structure” it? Does this also suggest a Clojure rule of thumb: If you are worried about structure, check that you’ve made it as concise as possible?

This one seems the most promising to me as a real solution, yet it basically means I would have to tell them “get better and it will not be a problem”. Not very satisfying? Perhaps better: “I often rewrite my code several times in the initial phases. Re-write it again, but look for opportunities to use map, filter, reduce, and other standard functions.”
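To make that kind of rewrite concrete, here is a made-up illustration, not code from a real project. The `orders` data and both function names are hypothetical. It shows the same computation written first as a manual loop and then as a filter/map/reduce chain.

```clojure
;; Hypothetical data, invented for this illustration.
(def orders
  [{:id 1 :status :paid    :total 30}
   {:id 2 :status :pending :total 15}
   {:id 3 :status :paid    :total 12}])

;; Before: a manual loop that carries its own accumulator.
(defn paid-total-loop [orders]
  (loop [remaining orders
         sum       0]
    (if (empty? remaining)
      sum
      (let [order (first remaining)]
        (recur (rest remaining)
               (if (= :paid (:status order))
                 (+ sum (:total order))
                 sum))))))

;; After: the same logic as a filter/map/reduce chain.
(defn paid-total [orders]
  (->> orders
       (filter #(= :paid (:status %)))
       (map :total)
       (reduce + 0)))

(paid-total orders)
;;=> 42
```

Both versions return the same number; the second one is mostly shorter because the standard library already knows how to iterate, select, and accumulate.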

3. They’re beginners at design

In my “natural” answer, I said, “break the code up into cohesive namespaces.” Cohesive is an advanced topic. Maybe finding where to break it up is where they’re having trouble. I have spent a lot of time breaking up code in different places. I’ve built an intuitive sense of places that seem to work for me. They haven’t gone through that, so the word cohesive means very little to them.

Also, if they’re coming from an OO background, they’re probably used to cohesion around a set of data fields, not around purpose, as we might do in Clojure. For instance, I tend to put all the code dealing with the WordPress API in one namespace. But they’re used to breaking up the Author and Article classes into different files, and spreading the WordPress stuff to each of those “entities”.

Could this be the topic of a tutorial? How to break up a 500-line program into namespaces? What does cohesive mean to a Clojurer?

4. Code location is a security blanket

Java programmers are very constrained in where a method can live. Even if Java is super verbose, you know that the getAuthor() method must be in the Article class’s file. In Clojure, namespaces are just bags of functions that could operate on any kind of data. In fact, you could arbitrarily divide a namespace into two namespaces and things would still work. Are Java programmers so used to this constraint that they cling to it, even if it doesn’t really confer benefits?

5. I’ve never worked in a large system

It could be that the problem is me: I’ve never worked on a large system where the structure was a dominant concern. I’ve worked mostly in startups, with small codebases, and on my own small projects. Could it be that Clojurists who do work on large systems do have this problem? Do they have answers?

6. Anticipating the future

When a codebase is small, it doesn’t need much structure. But as it grows, it does. As we start to add this structure, how do we know that the structure will scale? Is it the correct structure? Perhaps the askers of the question are anxious about whether the structure will last. Will they have to re-structure it soon? Will that be hard?

I, on the other hand, think of scaling as a kind of phase shift. Moving to a higher scale necessitates a reorganization. Clojure lends itself to easy reorganization. But maybe they don’t know that. Maybe they’re worried about structuring themselves into a corner when that’s not really a problem.

Your ideas

I hope this starts a big discussion. I’ve thought long enough about it. I realized that I probably should have started this discussion sooner, to get more varied input. What do you think? Have you been asked this question?

teodorlu · April 22, 2021, 3:04pm

Great question. I hope you get some great answers.

I’m still struck by this. I prefer writing Clojure to other languages. Yet, I have a harder time structuring my Clojure code. It feels harder to consider both implementation (how to do it) and interface (how to provide it) at the same time.

Zach Tellman’s Elements of Clojure helped, but I still feel like a kid when I try to decide what to do after creating deps.edn or project.clj.

PEZ · April 22, 2021, 3:45pm

It is indeed an important question. I guess it depends a lot on whether you are writing some scripts, building a back-end API, a mobile app, and so on. (Not sure how much it depends, but like for you, structuring my code has never been a particularly tricky part. Not in Clojure, nor in most other languages. I just stick code where it makes sense.)

However. With large systems of course it gets important. And with Clojure you want to be able to use the same REPL all over. There is a design, and a tool, that helps with that a lot. Yes, Polylith:

https://polylith.gitbook.io/polylith/

There is a #polylith channel over at the Clojurians Slack, which I follow to try to understand and learn more about this exciting idea.

seancorfield · April 22, 2021, 6:14pm

I was trying to answer this for a new Clojurian recently on Slack and, like others here, I struggled to give a good answer. There really isn’t much guidance in this area – I suspect Clojure Applied and the handful of Domain-Driven Design (for Clojure) talks out there are probably about as detailed as it gets (which isn’t very).

We started with Clojure over a decade ago and, at first, we were using Clojure purely as a “library” language, replacing small, low-level parts of our legacy apps with calls into new small, low-level Clojure functions and namespaces. We divided the code up very roughly into “i18n”, “environment”, “data”, and a handful of other library-like things that were “obvious” but as the Clojure code continued to grow (as we rewrote more and more of our legacy code) we realized we had a pretty unstructured “blob” of namespaces.

Over the years, we’ve made several attempts to come up with “better” organizations and at one point we wrote a document describing all the “business entities” in our domain and got consensus across the business teams on the terminology and structure, and then we worked to align at least some of our namespace structure around that business domain document. It helped a bit, but we still had a lot of “legacy” Clojure code that used old terminology (for example, over time the business has used “wink”, “flirt”, and “like” to mean similar things – an expression of interest without associated text – that all had slightly different semantics in terms of implementation so we have all three terms in the database schema and the codebase at this point!).

We’ve also experimented with different application structures over the years, for the non-business sections of things (software components with lifecycles, Ring handlers, routing, logging subsystems, etc) and that has led to some diversity across subprojects in our monorepo without any real consensus on what is “good” or “better”. Some apps follow a fairly traditional MVC pattern, some are organized much more along domain concepts. Given that namespaces have no inherent semantic “hierarchy”, it’s hard to say which approach is easier to work with, in the long term.

We are looking at Polylith. As I noted in my recent blog post about our monorepo structure, I used the Polylith approach for a new, small daily cron job process that we needed and I like that it encourages a broad, shallow organization, and forces you to think harder about naming components (“bricks”) and about keeping things simple and focused. We’re planning to tease apart several of our “blobby” subprojects into a series of Polylith components to (hopefully) simplify our dependencies and make the code structure easier to work with. It’s too early to say whether it will be “better”.

So that’s a very long-winded way of saying “Great question! I don’t know the answer!”. I’m looking forward to reading what others have to say on this topic.

didibus · April 22, 2021, 7:47pm

I do think #1, #2, #3 and #4 are all a part of it.

On Slack, I recently walked someone through this. They had this example:

 (ns fruit.protocols)
 
 (defprotocol FruitConvert
   (make-juice [fruit])
   (make-tart [fruit]))

 (ns fruit.juice)
 
 (defrecord Juice [color])

 (ns fruit.apple
   (:require [fruit.protocols :as p]
             [fruit.juice :as j]))
 
 (defrecord Apple [name]
   p/FruitConvert
   (make-juice [_]
     (j/->Juice "red")))

And didn’t know how to structure things. So this was my advice:

Hum… What I normally do is define my entire domain in a single namespace I call domain.clj. What I put in there is only the data schemas, so it would be data specs or records. Then the rest of my application would be defining operations over this domain data. If I want conversions between my entities for example, I could define a conversion.clj namespace.

Now honestly, I wouldn’t use protocols for this, I’d just have normal functions called X->Y, X->Z, Y->Z, etc. It’s much clearer to me that way.

Do you have a reason to want a protocol? Like, do you need a generic conversion where you don’t know which of multiple possible types you are getting and need it converted to a Z?

That would be the question I think that can answer your question. If you want “one of many possible types”->Z then I’d just create a protocol with a ->Z definition or a make-z-from kind of thing. I’d put this protocol in the conversion.clj namespace and I’d extend all the types to implement it in that namespace as well, but the records would all be in the same domain.clj namespace.

So what happens is the namespaces just group conceptually similar kinds of functionality. They don’t group all functionality related to a particular record. The latter would be what OO does: OO will say, ok, group all functions that operate on a given type together. This just seems wrong to me in Clojure. So I don’t put conversions from/to Z in the Z namespace. Instead I put all conversions from any entity to any other in a conversion namespace.
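A minimal sketch of the plain-function option from that advice (no protocol at all), just to make the X->Y naming concrete. The `Tart` record and its `filling` field are assumptions invented for illustration; in a real project the records would sit in domain.clj and the functions in conversion.clj.

```clojure
;; Domain data only (this part would live in domain.clj).
(defrecord Apple [name])
(defrecord Juice [color])
(defrecord Tart [filling])   ; hypothetical record, not in the original example

;; Plain conversion functions (this part would live in conversion.clj).
(defn apple->juice [_apple]
  (->Juice "red"))

(defn apple->tart [apple]
  (->Tart (:name apple)))

(apple->juice (->Apple "Gala"))
;; returns a Juice record whose :color is "red"
```

A protocol only becomes worth it once the caller genuinely does not know which type it is holding.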

And here was my restructure of their example:

 (ns domain)
 
 (defrecord Juice [color])
 (defrecord Apple [name])

 (ns conversions
   (:require [domain :as d]))
 
 (defprotocol Juiceable
   (make-juice [juiceable]))
 
 (defprotocol Tarteable
   (make-tart [tarteable]))
 
 (extend-protocol Juiceable
   domain.Apple
   (make-juice [_apple] (d/->Juice "red")))

 (ns app
   (:require [domain :as d]
             [conversions :as c]))
 
 (c/make-juice (d/->Apple "Hello"))
 ;;=> #domain.Juice{:color "red"}

Their answer was:

Thanks! I think I’ve seen the “Namespaces group functionality, not abstractions” advice elsewhere, but this made it finally click

So hopefully this helps others as well. And my thought is, we probably shouldn’t have an answer in words only; we need to show examples of how to structure things, and maybe even better, how to go from what they’d do to what they should have done.

PEZ · April 22, 2021, 7:55pm

I’ll take this with me. (Not that I am disregarding the rest, which is great as well, just that this is a bit of an eye opener to me.) Thanks!

ACiep · April 22, 2021, 8:46pm

It was (and I think it still is) a problem for me from my very beginning with Clojure, so I’ll try to describe how my adventure has gone.

A little background. I’m a full-stack developer, and for years I programmed functionally on the front end, either with React + Redux + Ramda under strict functional rules, or with Elm, where I didn’t worry about project structure because it’s very opinionated. When I migrated to ClojureScript with Reagent + Re-frame I felt at home. I knew where to put all my stuff because the framework and documentation told me. But when it came to writing the back end, it went up and down like a sine wave. I rewrote my hobby projects multiple times, starting over each time to understand where the problem was.

One of the first talks about Clojure I watched was “Demonstration of simplicity that is production ready” by Nir Rubinstein, and he said something like “a lot of new Clojure developers start their project with structure from Java projects writing their controllers, models etc. and it’s the worst structure you can have with Clojure”. I took that to mean we shouldn’t have too many layers (as in Java projects, where it’s acceptable and unfortunately sometimes highly recommended). Instead I limited myself to composable HoneySQL functions, utility functions for common problems, and routes with handlers that take care of fetching, processing, and validating data with private functions. Then I split the routes namespace by entity. At first it felt like the “good old PHP days, but it works”, but I quickly realized it had a lot of problems. Queries were tangled up with each other, and I had problems naming functions since all my business logic lived in a single namespace. Thinking about it now, it looked a little like some of my Node.js projects where I use plain Knex (a query builder) instead of an ORM, but not as good.

Then I came upon Polylith and liked the idea. I went through the documentation and refactored my existing project into something modeled on Polylith, without the tools, interfaces, symlinks to separate projects, etc., so as not to over-complicate things. I made component namespaces separated by entity, each with a store and a core namespace inside, plus database, file-uploads, etc. Again, at first it worked fine, but after some time I felt like I was doing OOP with functions instead of classes. That caused many problems, like dependency injection and breaking the purity of my functions. I looked for a solution but couldn’t find one. Even in the real-world example repo I found tests that check a whole impure flow. That wasn’t why I moved to functional programming. On the front end with Redux and Redux Saga, for example, I can check my whole business logic without making a single API call or mocking data, and I wanted the same on the back end.

Afterwards, with all the experience I had gained, I rewrote my project as a mix of these two approaches. I made a single store namespace where I only fetch data; a domain namespace, which is something like the M from MVC but more about actions than data, where most of my business logic lives; and routes with handlers that are as simple as possible: just fetch things with functions from store, delegate work to domain functions, and build an HTTP response based on the result. They are so simple I don’t write handlers as functions, just as Compojure routes. Other things like database, middleware, auth, pub-sub, specs, etc. live in their own namespaces. With this I feel like I found a solution to my previous problems. I don’t have name conflicts, I can test my core logic without touching the database, and I only need to ensure my simple routes fetch the correct data. But…
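A rough sketch of that store / domain / routes split, squeezed into one snippet for readability. Every name here (`fake-db`, `fetch-user`, `can-log-in?`, `login-handler`) is hypothetical, an in-memory map stands in for the real database, and the handler is a plain function rather than a Compojure route so that the snippet has no dependencies.

```clojure
;; store: only fetches data. A map stands in for the database (assumption).
(def fake-db
  {1 {:id 1 :name "Ada" :active? true}
   2 {:id 2 :name "Bob" :active? false}})

(defn fetch-user [db user-id]
  (get db user-id))

;; domain: pure business logic, no I/O, testable without a database.
(defn can-log-in? [user]
  (boolean (and user (:active? user))))

;; routes: fetch from store, delegate to domain, build a response map.
(defn login-handler [db user-id]
  (let [user (fetch-user db user-id)]
    (if (can-log-in? user)
      {:status 200 :body (str "Welcome, " (:name user))}
      {:status 403 :body "Forbidden"})))
```

The point of the split is that `can-log-in?` can be tested with plain maps, while the handler stays too thin to need much testing at all.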

Some time ago there was a discussion about Polylith on this forum and I wanted to take another look at it with a fresh head. I read the documentation again, that topic, and the code of the real-world app. Polylith has really improved since I first worked with it. Moving to cli(j)-tools was a very good decision imo. It no longer feels like some projects glued together that will work in one specific scenario. I have also improved my skills, learned a lot, finally understood some things, and realized I can apply all that in a somewhat opinionated but still flexible framework, which Polylith surely is: treat components as abstract sets of functionality rather than mapping each component one-to-one to a database table.

Conclusion

I don’t think I have an answer to your question, but I wanted to show an example of an adventure with Clojure from a “newbie”, and the problems I faced even coming in with some FP background. Maybe it’ll help the more experienced folks here understand the problem.
I think this (and maybe the difficulty of setting up a correct environment with the REPL, Docker, and all that stuff) is one of the biggest barriers for people who want to try Clojure. Elixir has Phoenix; Node has not only Express but also Mongoose and other ORMs that take care of the DB stuff, so that’s one (big) thing less to worry about; even Haskell and Rust have frameworks that let you learn the language while also learning the framework and building a web app. And Clojure is so high-level that you’re responsible for almost everything. It’s so simple to write the things you would use a framework for in other languages that nobody shares their work. But “simple is not easy”, and it’s really challenging even for experienced devs to set everything up to work well on the first try in a new environment without knowing the ecosystem. This matters in some situations, like trying to convince a team to use Clojure. Hearing “forget about good practices you know” doesn’t help, because which ones should we forget?
For over a year Clojure was a nice tool for me to play with, but recently I started to discover its true power, and now I feel pain working with JS or Rust again. But it definitely took me too long. It’s a problem with all Lisps, I think. I hear “Lisps are so powerful”, “after working with Lisp for some time you’ll see how superior the language is”, and things like this. OK, I see it now, but at what cost? Months spent learning an editor, architecture, a completely new workflow. Don’t get me wrong. I really like Clojure, its ecosystem is my favorite, and the community is great, but the marketing(?) behind it isn’t the best, to say the least. I was learning Rust when it wasn’t mainstream yet and the community was small, and it was an easy ride compared to Clojure. Maybe that’s how people want it to be (we have a better ecosystem because of that, don’t we?), but Clojure isn’t beginner-friendly at all, and the question from this topic is just the tip of the iceberg.

Phill · April 22, 2021, 9:40pm

A program tells a story. In Clojure, you have freedom to tell the story in whatever way makes it clear.

I got started in C, which is similarly unstructured. And BASIC: a lot of the DEC VAX userspace was written in a large-scale compiled BASIC! C and BASIC are not a great help at quelling complexity, but at least you can write the program in a way that keeps things in perspective. That takes practice. But I don’t think there’s much special to Clojure about it.

Anthony_Leonard · April 22, 2021, 11:40pm

Some (ironically) unstructured thoughts from me…

I agree this is a big deal. Having no designated places for code to live means (literally) no design. I have worked on large codebases, and seen the loss of control of code that happens without structure. Logic is added here and there so you have to follow it all line by line to see what comes out. Seeing this some are quick to blame Clojure itself, concluding its (perceived) immaturity compared to their favourite language inevitably leads to an unstructured mess.

Effective code organisation will always be in the eye of the beholder in parts, but objective goals might include supporting:

Modularity - in the sense of having code modules that do not “require” each other, or layers where inner ones do not “require” outer ones. Breaking up code like this is - I think - always doable e.g. using “clean architecture” style dependency inversion with protocols/multimethods etc, and dependency rejection to keep stateful components at the edge. Making this effort will not guarantee good modules - they may be poor abstractions, horribly interlinked, causing more harm than good - but done well the self imposed structure gives natural testing boundaries and clarity of purpose to each part.

Code Accretion - in the sense that new functionality involves adding code and leaving existing code untouched. There is a strong urge among many to refactor at every turn, which (potentially) breaks code everywhere, is tough to test unless you have very thorough (e.g. generative) testing, and is tough to take if your team-mates did it and you don’t like it. If you can get into a design (again - what goes where) that allows you to implement new stuff by adding new code and even new files only - or implement “change” the same way via “expand and contract” - then you know the old stuff is still there doing what it always did and testing will be far easier. Similarly, if some “new-thing” namespace has implementations of 3 existing (inner layer) protocols all in the same new file, that to me is nicer than spreading those implementations across different existing namespaces arranged around the protocols or the layers themselves.
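As a small sketch of the dependency-inversion idea behind both points: the inner layer declares what it needs as a protocol, and an outer “new-thing” unit adds an implementation without touching the inner code. All names here (`UserStore`, `find-user`, `greeting`, `InMemoryUserStore`) are hypothetical, and the two layers are shown in one snippet only for brevity; in practice they would be separate namespaces, with only the outer one requiring the inner one.

```clojure
;; Inner layer: declares the dependency it needs and knows nothing about storage.
(defprotocol UserStore
  (find-user [store id]))

(defn greeting [store id]
  (if-let [user (find-user store id)]
    (str "Hello, " (:name user))
    "Hello, stranger"))

;; Outer layer, added later: a new implementation in new code only.
(defrecord InMemoryUserStore [users]
  UserStore
  (find-user [_ id] (get users id)))

(greeting (->InMemoryUserStore {1 {:name "Ada"}}) 1)
;;=> "Hello, Ada"
```

Swapping in a database-backed store would mean adding another record that implements `UserStore`, while `greeting` stays untouched and stays testable with the in-memory version.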

Java folks are using ArchUnit to enforce code organisation and modularity design decisions. I haven’t found an equivalent for Clojure, but I’d love to know about any if they’re out there

simongray · April 23, 2021, 7:04am

I am wondering whether this isn’t basically a reformulation of the age-old “Clojure doesn’t have frameworks, it has libraries”. Most Clojure projects tend to be developed as a library or as a collection of libraries. Libraries are a free-form design exercise while frameworks have little bins where the stuff goes into.

didibus · April 23, 2021, 5:17pm

I think the lack of frameworks might mean there is a lack of teaching. Oftentimes people use a framework, or want to know how to organize their code, but they don’t know why it is organized as such, and why the way the framework organizes things is good/bad. That serves as a learning point: they first learn that way of organizing things, and eventually learn of its issues as well. That starts to give them an understanding of the pros/cons of that framework. Then they can try another framework, and another, at which point they begin to have their own intuition into code organization. So I agree the lack of frameworks means we don’t have a curriculum to follow for learning about these things; people are thrown into the deep end directly.

I think we lack examples around this area. Honestly, I’m not finding a lot of blog posts or articles about it. And some of the ones that try are too theoretical.

The simplest way to start organizing Clojure code is to have everything in one giant namespace. But I get so much resistance to this from beginners to Clojure. I keep saying, yes, put it all in one big namespace, what’s the problem? And they don’t know the problem, but it feels wrong to them. I say, editors now can easily handle a relatively large file, and it’s much easier to search within one file than across many. Clojure will enforce organization within a namespace, because everything after depends only on what comes before. Put everything in one namespace and don’t use declare.

This will take you really far already. Now if you feel things have gotten simply too big that your editor lags, ok, break it up, how you ask? Just take the half point and move it to another namespace.

Ok, but there’s a few things you need to not do that other languages will have taught you to do. Don’t put all your defs at the top of the file. Put them right above the thing that first uses them. Don’t group functions by anything, simply make sure that functions that use others are relatively close to each other as much as possible.

As you do that, groups of defs and defns will naturally form in your namespace, you’ll start seeing sections forming where the defs and defns are only used within that section, when that happens, put a big header comment around the section and give it a name.

;;;; Permission handling

Now you can just search for ;;;; and you find all sections in the namespace which lets you navigate your code pretty easily.
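Here is a tiny made-up example of what such a namespace can look like, with each def sitting right above its first use and a `;;;;` header once a section has formed. All the names and the section contents are hypothetical, invented for this sketch.

```clojure
;;;; Permission handling

;; Defined right above the first function that uses it, not at the top of the file.
(def admin-roles #{:admin :owner})

(defn admin? [user]
  (contains? admin-roles (:role user)))

(defn can-delete? [user doc]
  (or (admin? user)
      (= (:id user) (:owner-id doc))))

;;;; Title formatting

(def max-title-length 20)

(defn short-title [doc]
  (let [title (:title doc)]
    (if (> (count title) max-title-length)
      (str (subs title 0 max-title-length) "...")
      title)))
```

Nothing in the second section uses anything from the first, which is exactly the signal that either section could move to its own namespace later if the file ever gets too big.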

I think this is a good starting place for most Clojure code bases. Yet it has proven really hard to convince people of this advice and get them to actually try it.

huahaiy · April 23, 2021, 5:38pm

I second this advice. I agree fully that a Clojure project should start with a single file, and only split when it becomes obvious.

The most annoying thing for me as a project owner is that some programmers bring their Java habit into my Clojure code base, and create tons of tiny files. It drove me nuts. It is counter-productive.

seancorfield · April 23, 2021, 5:50pm

I think most of the push back on this comes from #1 in @ericnormand’s original post: folks are used to structure from other languages/frameworks and have a deep-seated urge to approach Clojure the same way, from the get-go.

I generally develop with Clojure pretty much the exact same way you’ve described here – but I think the “breaking things apart” aspect is too organic for it to resonate with a lot of folks (I guess that’s an aspect of #3 – they don’t have a good sense, upfront, of how they might want to divide the code up?).

Since I’m very REPL-focused, pretty much any new project work starts out in a single file with a minimal ns form and a (comment ..) form. I work inside the RCF (Rich Comment Form), writing and eval’ing code, and lifting pieces out into functions above the RCF as patterns emerge. If an obvious grouping of functions emerges, I’ll shuffle those off to a new namespace once they seem close to “done”. Rinse and repeat. As long as you can choose good names for functions – and for namespaces – future readers (including your future self) should be able to start at the -main function and figure out the flow through the code fairly easily (Zach Tellman’s Elements of Clojure is a great read for advice on naming!).
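For anyone who hasn’t seen that workflow, a starting file might look roughly like this sketch. The namespace name and the `normalize-name` function are made up for illustration; the only real pieces are the `ns` form and the `comment` form.

```clojure
(ns scratch.report                      ; hypothetical namespace name
  (:require [clojure.string :as str]))

;; Lifted out of the RCF below once it had proven itself in the REPL.
(defn normalize-name [s]
  (-> s str/trim str/lower-case))

(comment
  ;; Evaluate these forms one at a time from the editor.
  ;; Nothing inside `comment` runs when the file is loaded.
  (normalize-name "  Alice ")
  ;;=> "alice"
  (map normalize-name ["Bob " " CAROL"])
  )
```

The RCF doubles as a record of how each function was explored, so a reader can re-run the same experiments later.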

For context, at work we have about 113K lines of Clojure, 89K is source code and the rest is test code. That source code is split into about 360 files, so that’s an average of about 250 lines per namespace. We have just over 3,500 functions in that source code, so that’s an average of about 25 lines per function. All numbers are inclusive of whitespace, comments, etc. We have just two namespaces that have 100+ functions. We have just two namespaces that have 2,000+ lines and only a dozen that have 1,000+ lines.

ericnormand · April 23, 2021, 7:31pm

Yep, this sounds like what I do. Perhaps a video walking people through this would be good.

didibus · April 23, 2021, 7:41pm

I can’t run stats like that on all our code bases, but I know our biggest namespace clocks in at 2500 LOC.

Most apps tend to have this structure [this is for backend enterprise apps]:

APP
| deps.edn
| config
| test
| src
| > company
| | > app
| | | > utils.clj ;; Random pure utility functions that are useful in this app, think things missing from clojure.core, once they've proven themselves in an app, we might promote them to our shared util library, nothing tied to our model lives here.
| | | > data_model.clj ;; This contains our data specs (using Clojure Spec), can contain records (using defrecord)
| | | > model_helpers.clj ;; Pure functions that operate on our data model, which are useful to have for reuse. The most common things in here are constructor functions like make-entity, mapping functions from one entity to another, serialization functions to/from for our model (like to JSON and back), validation functions for our model, and things like that.
| | | > globals.clj ;; Global shared state goes here, mostly top level defs, think your loaded config map for the current environment, your instantiated AWS clients, your DB connection pool, your logger/metric publishing service, and if you need to, your APP state, like a datascript instance, or an H2, or just a shared atom map, etc. Functions to initialize and destroy all these also go here. Since all constructs to hold state in Clojure are already managed, you don't need any special "encapsulation" for them, so there won't be any `set-player` or `update-gold` kind of thing in here, only the `defs` and functions to support their initialization/destruction.
| | | > service_foo.clj ;; Business logic goes here, those are the operations the user/business wants the APP to support, thus operations over the data model go here. This will contain a lot of impure functions and is an entry point. An app will have these only if `app.clj` has become way too big, normally all operations live in `app.clj`, and are only broken down for very very large apps. In this case foo would be the operation, and service_foo.clj is where it will be handled end to end.
| | | > service_bar.clj ;; Same as service_foo.clj, but for the bar user operation. Keep in mind operations here are processes and activities needed to manage the data and domain under this app's context in order to fulfill the top level business functions required by your users. This is a service in the DDD sense. Best to think of `foo` and `bar` as the APIs you expose to your clients. Whenever you have operations that are pure over the model, they can go in model_helpers and services would use the helper to help them implement the user operation, that's how you reuse things that multiple operations might need to do to your model.
| | | > service_helpers.clj ;; When multiple services start to show chunks of logic which are the same between them, you can move it here and have them all depend on this shared logic. Make sure you inject dependencies into those helpers, though they are allowed to perform side effects.
| | | > app.clj ;; This is a giant namespace, all apps start out with only this and all the namespaces above are extracted out from this one only when the need arises, otherwise they'd just all live as sections in this one namespace.

Most apps aren’t large enough to have all these, so they vary from only an app.clj to the full set of the above namespaces. I use service oriented architecture normally, so each app would be scoped to a given bounded context of the overall business domain, and so the data model would be within that context, as well as all operations would be scoped to that same context. And so for end-to-end top level business functionality, often you’d have one app call another to fulfill the full business need, as many business needs require collaboration between multiple domain contexts.

If you’re curious where I put the code to fetch data from a DB/externalService or store it, well, that goes in app.clj at first, moves to service_foo.clj if things grow large enough, and finally moves to service_helpers.clj when service_bar.clj has the same data fetch/storage needs. But it’s often mostly all in app.clj like so:

(def db
  {:dbtype ...
   :dbname ...
   :host ...
   :port ...
   :user ...
   :password ...})

(def default-query-options
  {...})

(defn get-user [db query-options user-id]
  (-> (jdbc/query db ["..." user-id] query-options)
      ;; code that converts the DB result into a valid user from my data_model
      (model-helpers/validate-user)))

(defn show-user
  [request-map]
  (wrap-metrics
    (try
      (validate-show-user-request request-map)
      (->> (:user-id request-map)
           (get-user db default-query-options)
           ;; Code to convert it to a valid response
           )
      (catch Exception ex
        ;; log
        ;; handle
        ;; return valid error response
        ))))

Things would start out this simple and evolve from there.
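As a rough sketch of that evolution, here is what the fetch might look like once a second service needs it and it moves into a shared helper that takes its dependencies as arguments. Everything here is hypothetical and only meant to be runnable without a JDBC driver: `fetch-user`, the `:query-fn` key, `rename-user`, and the in-memory map standing in for the database are not from the code above.

```clojure
(ns example.service-helpers)

;; Hypothetical stand-in for a real database: a map from user-id to row.
(def fake-rows
  {1 {:id 1 :name "Ada"}})

(defn fetch-user
  "Shared helper. It would be impure in a real app, but every dependency
   is passed in, so each caller decides which db and query function to use."
  [{:keys [db query-fn]} user-id]
  (query-fn db user-id))

;; Two services reuse the same helper with injected dependencies.
(defn show-user [deps request-map]
  (if-let [user (fetch-user deps (:user-id request-map))]
    {:status 200 :body user}
    {:status 404 :body "user not found"}))

(defn rename-user [deps request-map]
  (when-let [user (fetch-user deps (:user-id request-map))]
    (assoc user :name (:new-name request-map))))

;; Assumption: a plain `get` on the map plays the role of the query.
(def deps {:db fake-rows :query-fn get})
```

Swapping `deps` for a map holding a real connection and query function would leave both services unchanged, which is the point of injecting the dependencies.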

nbardiuk · April 24, 2021, 9:32am

I also thought about this topic and I cannot find anything true and universal in how I organize the code. It just comes from experience of working in different projects and noticing when code feels comfortable and uncomfortable.
I’ve also noticed that people I pair with have a different sense of comfort when they work with a codebase. And even when we agree that it is uncomfortable, we often do not agree on how to make it better.
I am curious how to learn/teach design. I’ve picked up some maxims from other people, but I do not necessarily understand why they work and in what context they don’t work until I try. Is there a better way to know than just trying?

maxweber · April 24, 2021, 12:24pm

We tend to write very small namespaces:

(ns app.video-upload.check
  "Checks if an uploaded video can be processed by
   ffmpeg. Thereby non-supported videos can be
   rejected in the client app."
  (:require ...))

;; I've omitted the rest of the code for this example.

We use the doc-string of the ns form to describe the concept of the namespace, meaning “what problem does it solve” and “why does it solve it” (in the context of the overall system). The real namespace used for this example only contains 2 functions.

The comments in the code then describe how it is done. Rule of thumb: If anything is not covered by the concept description, it should be probably moved into a separate namespace with dedicated concept description.

We also do not use any deeper folder structures than this one. Instead of using a product-specific top-level folder we use app.

mars0i · April 24, 2021, 5:17pm

This discussion has been very interesting to me. I want to add one thought into the mix for consideration (plus a little side comment), but I can’t tell anyone in this discussion how to organize Clojure code: I don’t work with large Clojure codebases, I’ve never worked with Clojure on a team, I’ve probably written less Clojure code than the experienced people here, and my projects have different goals than most Clojure projects. I know enough to understand that I’m not dealing with the same challenges that others are. Here’s one thought, though.

I usually start with no declare, but sometimes, after adding a lot of definitions, I come back to a file and think, “Wait, why is this function here?”, or “What is all this code doing?” I know what the file (or section of the file; it doesn’t matter) is supposed to do, but I don’t know how the pieces of my code serve that purpose. I scroll down and ultimately find a crucial, primary function (or functions) that need(s) the other functions, and from there I can trace back to find the rationale for each of the other functions. At that point, I often add a declare at the top, and move the central organizing function that was at the bottom of the file up to the top of the file or the section: if I start reading from that function, it gives me an overview of what the code below it is doing, and then I can scroll down to see how different pieces of that operation work, and I know why they’re there. So putting a function that gives an overall purpose near the top (or near the top of a section of code) is a kind of documentation for me. Sometimes that tells the story better.

Sometimes I reorganize functions in other ways within a file for the same purpose, and this can require declare as well.
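A minimal sketch of that layout, with made-up names (`report`, `parse-line`, and `total` are only for illustration): the forward declaration lets the organizing function sit at the top, with the helpers it depends on below it.

```clojure
(ns example.report)

;; Forward declarations: without this line the compiler cannot resolve
;; parse-line and total when it reaches report.
(declare parse-line total)

(defn report
  "The overview. Reading this first shows what the helpers below are for."
  [lines]
  (str "Total: " (total (map parse-line lines))))

(defn parse-line
  "Extracts the first run of digits in a line as a number."
  [line]
  (Long/parseLong (re-find #"\d+" line)))

(defn total
  "Sums the parsed amounts."
  [amounts]
  (reduce + 0 amounts))
```

For example, `(report ["apples 5" "pears 7"])` returns `"Total: 12"`.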

A side comment:

I like this point (though I don’t have any thoughts about what to do). It reminds me of something else in my work: I have students, and I assign papers. The assignments give students detailed though somewhat flexible advice about how to structure their papers. Not all of the students need the advice, but some do. Those who are already excellent writers and know what to do with the subject matter can ignore my guidelines and get a top grade. The guidelines are there for students who don’t have any idea about how to go about writing a paper of the type I’m assigning. It gives them a starting point for thinking about what’s needed in that kind of paper, and adds to their toolbox of writing strategies that they can use after the class is over.

jan · April 24, 2021, 7:12pm

Eric,

Thanks for bringing this up. I’ve been mulling over the topic of organising Clojure applications on and off for a couple of years now. I tried various things but most valuable insights came from watching other people’s work.

A gem that I keep coming back to is Rafal Dittwald’s “Solving Problems the Clojure Way.” While not strictly about architecture, the journey Rafal takes us on provides valuable insight into how Clojure applications can be structured. The best thing is, he doesn’t show a single line of Clojure. It’s JavaScript all along. It does wonders for the reach of his talk.

An experience report that helped me think about structuring applications is Jarppe Länsiö’s discussion on long-lived projects from ClojuTRE 2018: “First 6 years of a life of Clojure project.” The way their architecture evolved was instructive. Modelling commands as data and pure functions reminds me of domain-specific languages, recently discussed by Hanson and Sussman. I loved the gag about mocks.

Domain-driven design that Sean mentioned is a box packed with valuable tools. The building blocks, such as value objects or entities, translate without much friction into the Clojure thinking about value, identity, and state. Aggregates help find consistency boundaries. Paying attention to the language used in our domain helps identify bounded contexts and draw namespaces around them. Thinking domain-first can help with many design hurdles.

“Domain Modeling Made Functional” is a good book that introduced the domain-driven approach using F♯. And while we’re at other functional programming languages, “Designing for Scalability with Erlang/OTP” is a treasure trove, in particular when it comes to handling failure.

I’ve been experimenting with those and other ideas, building applications out of them, sometimes giving talks to discuss the outcomes. The majority of resulting slide decks aged poorly. I suppose I could give a whole new talk that’d be all about pointing out weaknesses in my earlier takes.

Not long ago I contributed to Spacy, my workmate’s web application for planning remote open space events. The main value proposition of that app is its focus on being responsible; an achievement for which I can take no credit. As the main author, Joy Heron, worked on the front-end architecture, I experimented with the back-end.

The result is a loose combination of an onion architecture with a dash of functional core and imperative shell. Inside you’ll find a “domain” namespace that defines the schema of our open space events and exposes pure functions transforming entities from one valid state to another. It focusses on the essential domain complexity; technical jargon like “SQL”, “HTTP”, or “JSON” does not belong there.

The neighbouring namespaces require the domain one and provide all the important moving pieces. Databases, ring handlers, and lifecycle of components are external to the problem domain. If I change the database or replace the HTTP interface with a Kafka client, the domain remains untouched. That’s what I want, because over the course of a typical application’s life there’ll be enough business reasons to change the domain logic.
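To make the split concrete, here is a small sketch of the idea. It is not Spacy’s actual code: the namespaces `example.domain` and `example.shell`, the event shape, and the atom standing in for a database are all assumptions made for illustration.

```clojure
(ns example.domain
  (:require [clojure.string :as str]))

;; Functional core: pure functions only, no SQL, HTTP, or JSON.
(defn new-event [title]
  {:title title :sessions []})

(defn schedule-session
  "Pure transition from one valid event state to the next."
  [event session-title]
  (when (str/blank? session-title)
    (throw (ex-info "Session title must not be blank" {:title session-title})))
  (update event :sessions conj {:title (str/trim session-title)}))

(ns example.shell
  (:require [example.domain :as domain]))

;; Imperative shell: state and I/O live here and call into the domain.
;; Assumption: an atom passed in by the caller plays the role of the database.
(defn create-event! [store event-id title]
  (swap! store assoc event-id (domain/new-event title)))

(defn schedule-session! [store event-id session-title]
  (swap! store update event-id domain/schedule-session session-title)
  {:status 200 :body (get @store event-id)})
```

Replacing the atom with a real database, or the response maps with Kafka messages, would only touch `example.shell`; `example.domain` has no idea either exists.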

Now we can ask what do we do if we need to add new functionality that has little to do with the existing logic. Say, a chat between participants. Where would we introduce it? Depending on the context, “elsewhere” might be a good answer.

Mind you, I don’t mean microservices. A valid elsewhere would be another tree of namespaces in the same project. I’d keep them isolated from the rest to minimise coupling. (Isolation checks could run in CI, with an ArchUnit-esque tool that Anthony brought up.) No matter what change you introduce in the chat subsystem, it should not break the event planning one.

Having those trees structured in a similar fashion would make the entire app easier to understand. But what’s even more important is exposing a consistent API across all the domains. Say, a Ring handler. That enables composability, another aspect stressed in the aforementioned book.

Composability is a quality I look for when assessing software design. Can I take the entire Spacy and expose it under a URL path prefix next to another Ring application? How much effort would it take, how much code would have to change? Better yet: can I start two instances of the same application in the same JVM behind a single Ring adapter? Implicit dependencies and global mutable state will quickly surface.
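One way to try that test on a small scale, sketched with plain functions so it runs without Ring on the classpath: a handler here is just a function from a request map to a response map, which is the shape Ring uses. `make-app` and `mount` are hypothetical names, and the prefix matching is deliberately naive.

```clojure
(ns example.compose
  (:require [clojure.string :as str]))

(defn make-app
  "Builds a Ring-style handler. All state lives in the atom created here,
   so every call yields an independent instance of the application."
  [app-name]
  (let [hits (atom 0)]
    (fn [request]
      {:status 200
       :headers {"Content-Type" "text/plain"}
       :body (str app-name " saw " (swap! hits inc)
                  " request(s), path " (:uri request))})))

(defn mount
  "Takes a vector of [prefix handler] pairs and returns a handler that
   dispatches to the first matching prefix, stripping it from :uri."
  [prefix-handler-pairs]
  (fn [{:keys [uri] :as request}]
    (if-let [[prefix handler] (first (filter (fn [[p _]] (str/starts-with? uri p))
                                             prefix-handler-pairs))]
      (handler (update request :uri subs (count prefix)))
      {:status 404 :headers {} :body "not found"})))

;; Two instances of the same application side by side in one process.
(def root
  (mount [["/team-a" (make-app "a")]
          ["/team-b" (make-app "b")]]))
```

If `make-app` kept its counter in a top-level `def` instead of a local atom, the two instances would share state, which is exactly the kind of implicit dependency this exercise surfaces.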

The problem with a back-end example like Spacy is its insignificant size. The application has but a handful of use cases, operates in a small domain, and doesn’t need to be fault tolerant or highly scalable. It falls into the O(armchair) complexity class. Things start to get hand-wavy when we start talking about database transactions or error handling.

That being said, I think that’s the rough structure I’d start a new project with. The main reasons are composability and malleability. We don’t have to get it spot on the first time as long as we can adapt and change it in the future.

Looking forward to your comments.

didibus · April 24, 2021, 7:22pm

That’s interesting. I think you show that there are different options for organizing code, and maybe most options work pretty well in the end.

I do think Clojure namespaces are purely an organization of the source code. Unlike a class, they don’t create any semantic structure. At the end of the day you’re just using functions, and where you put them is a matter of what makes the most sense for you when categorizing them and recalling where the function you’re looking for lives.

The only structural challenge with breaking things up into more namespaces is managing the dependencies between them and avoiding circular dependencies.

But I think this is a bit of a key point. The namespace is not something that holds state, or that encapsulates data, or that creates a hierarchy of types, etc. It’s just a folder. So, as with a music library of mp3 files, the folder structure you choose (everything in one music folder, or a folder per artist and another per album, etc.) is a very personal choice that just helps you navigate and find your way through your music library. I think namespaces are similar.

What matters more is the semantic structuring of your code, such as keeping your functions pure, managing your app state properly, keeping your stack calls shallow, keeping your side effects on the outskirts, writing well factored functions, etc.

Where you put those functions does have some pros/cons, but it matters much less than the above, and is more easily refactored.

I think the considerations of where to put the defs and defns would be:

Ramp up time of someone new to the code base to get a mental map of the various pieces of functionality
Ability to reuse functions and share vars where needed
Dependency relationships which can’t be circular
Being able to know the scope of use in case you need to refactor things in a breaking way, so you can easily find all references to what you have and know the impact your refactor might have.
When reusing, what granularity of code you can pull in

I know what you mean, but what I realized is that with declare you rely on good intentions to make sure the story is told in the right order and that summaries always appear first. That’s not always true, and good intentions are hard to keep up consistently. If instead you just get used to reading Clojure code bottom-up, you get the same benefit: the summary is always at the bottom and you read upward to see the logical chain of events. That order is guaranteed, so it’s much more reliable, unless someone used declare to mess up the ordering.