I have a question that you may be able to help me with. I will give a bit of background, ask the question, then give some of my thoughts on the answer. I hope it starts a discussion as I think it is an important topic.
I am often asked, “How do you organize Clojure code?” They say they’re writing some code and it’s starting to get hard to manage. I don’t think I have a good answer. I want to say:
“I start with one file. As it gets too big, I try to break it up into cohesive namespaces.”
Somehow that doesn’t satisfy the asker. But I don’t really have any better answer. It works for me.
But I get asked this question often enough that it seems to be a real problem for people. There might be a helpful course in it.
So, the question is: What is the answer to organizing code in Clojure?
I would love your answers, either arm-chair or from direct experience.
I’ve thought about this question a lot. I’d like to share some of the ideas I’ve explored. None of them, to me, seem to be the real solution. I am very open to new potential solutions and critiques of these ideas.
A lot of people learned to build applications within a large framework such as Rails. In a Rails app, you use Model-View-Controller (MVC) to build your application. What MVC means in practice is that you characterize parts of your code as either M, V, or C, and then you always know where to put it.
In contrast, when coming to Clojure, there is no such given structure. In fact, you just write a -main function and write whatever you want. Are they uncomfortable with the amount of freedom? Are we, unbeknownst to us, using some structure that has just not been made explicit? I get the impression that learning MVC was an epiphany and they want a similar epiphany in Clojure.
After being asked how to structure Clojure code, I have on occasion asked to look at their project. Invariably, when I open the file, I can immediately see 2x-5x code size improvements. They’re missing some very fundamental ideas, like using reduce chains. They’re not using the standard library.
Another common problem is “over modeling”. What could be a simple “move data from this table format to that table format” is over-engineered. Entities are defined, with corresponding operations, that are totally unnecessary to the task.
In short, if I had written it, it might be 100 lines, in one file. There wouldn’t be an organizational problem. I’m not trying to brag, but it explains why I can’t see the problem. They see it as “well, my code is a mess, I should structure it better” and I see it as “use the standard library, powerful idioms, and only model when it’s useful”.
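To make the contrast concrete, here is a minimal sketch of a "move data from this table format to that table format" task done with plain maps and standard functions, with no entities defined. The row shape, key names, and namespace are made up for illustration; they are not taken from any real project.

```clojure
(ns example.convert
  (:require [clojure.set :as set]))

;; Hypothetical input rows, as they might come out of an old table.
(def legacy-rows
  [{:first_name "Ada"   :last_name "Lovelace" :dept "eng" :salary 120}
   {:first_name "Grace" :last_name "Hopper"   :dept "eng" :salary 130}
   {:first_name "Alan"  :last_name "Turing"   :dept "ops" :salary 110}])

(defn legacy->new
  "Reshape one row: rename :dept and merge the two name columns into :name."
  [row]
  (-> row
      (set/rename-keys {:dept :department})
      (assoc :name (str (:first_name row) " " (:last_name row)))
      (dissoc :first_name :last_name)))

;; The whole conversion is one mapv over the rows.
(def new-rows (mapv legacy->new legacy-rows))

;; A single reduce builds a per-department salary total.
(def salary-by-department
  (reduce (fn [totals {:keys [department salary]}]
            (update totals department (fnil + 0) salary))
          {}
          new-rows))
```

Nothing here needs an Employee or Department type. The data stays as maps, and the "operations" are ordinary functions from clojure.core and clojure.set.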
How can we get them to see how to improve their code instead of trying to “structure” it? Does this also suggest a Clojure rule of thumb: If you are worried about structure, check that you’ve made it as concise as possible?
This one seems the most promising to me as a real solution, yet it basically means I would have to tell them “get better and it will not be a problem”. Not very satisfying. Perhaps better: “I often rewrite my code several times in the initial phases. Rewrite it again, but look for opportunities to use reduce and other standard functions.”
In my “natural” answer, I said, “break the code up into cohesive namespaces.” Cohesive is an advanced topic. Maybe finding where to break it up is where they’re having trouble. I have spent a lot of time breaking up code in different places. I’ve built an intuitive sense of places that seem to work for me. They haven’t gone through that, so the word cohesive means very little to them.
Also, if they’re coming from an OO background, they’re probably used to cohesion around a set of data fields, not around purpose, as we might do in Clojure. For instance, I tend to put all the code dealing with the WordPress API in one namespace. But they’re used to breaking up the Author and Article classes into different files, and spreading the WordPress stuff to each of those “entities”.
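As a rough sketch of cohesion around purpose, the namespace below holds everything that knows about the WordPress API, for both articles and authors. The base URL, paths, response shape, and function names are assumptions chosen for illustration, and the functions only build request maps and reshape data, so no HTTP client is involved.

```clojure
(ns example.wordpress
  "Everything that knows about the WordPress API lives here."
  (:require [clojure.string :as str]))

;; Assumption: an illustrative base URL, not a real site.
(def base-url "https://blog.example.com/wp-json/wp/v2")

(defn- endpoint
  "Join path parts onto the base URL."
  [& parts]
  (str/join "/" (cons base-url parts)))

(defn article-request
  "Describe the request for one article as plain data."
  [article-id]
  {:method :get :url (endpoint "posts" (str article-id))})

(defn author-request
  "Describe the request for one author as plain data."
  [author-id]
  {:method :get :url (endpoint "users" (str author-id))})

(defn ->article
  "Turn a decoded response body (assumed shape) into the plain map
  the rest of the program works with."
  [body]
  {:id        (:id body)
   :title     (get-in body [:title :rendered])
   :author-id (:author body)})
```

In an entity-oriented layout, article-request would sit in an Article file and author-request in an Author file. Here they sit together because they share a purpose: talking to one external service.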
Could this be the topic of a tutorial? How to break up a 500-line program into namespaces? What does cohesive mean to a Clojurist?
Java programmers are very constrained in where a method can live. Even if Java is super verbose, you know that the getAuthor() method must be in the Article class’s file. In Clojure, namespaces are just bags of functions that could operate on any kind of data. In fact, you could arbitrarily divide a namespace into two namespaces and things would still work. Are Java programmers so used to this constraint that they cling to it, even if it doesn’t really confer benefits?
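A small sketch of the "bags of functions" point: the two helpers below could just as well live in a single namespace, and splitting them only changes what the caller requires. All names are hypothetical. The three namespaces are shown in one block so it can be pasted into a REPL; in a project, each would normally be its own file.

```clojure
;; One half of what could have been a single utility namespace.
(ns example.slug
  (:require [clojure.string :as str]))

(defn slugify
  "Lower-case a title and replace runs of other characters with dashes."
  [title]
  (-> title
      str/lower-case
      (str/replace #"[^a-z0-9]+" "-")))

;; The other half, moved out on a whim.
(ns example.stats)

(defn word-count
  "Count runs of word characters in a string."
  [text]
  (count (re-seq #"\w+" text)))

;; The caller only cares which namespace each function is required from.
(ns example.report
  (:require [example.slug :as slug]
            [example.stats :as stats]))

(defn summarize [title body]
  {:slug  (slug/slugify title)
   :words (stats/word-count body)})
```

Calling (summarize "Hello World" "one two three") returns {:slug "hello-world", :words 3} whether the helpers live in one namespace or two. Neither function is tied to a particular type of data, so nothing forces one split over another.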
It could be that the problem is me: I’ve never worked on a large system where the structure was a dominant concern. I’ve worked mostly in startups, with small codebases, and on my own small projects. Could it be that Clojurists who do work on large systems do have this problem? Do they have answers?
When a codebase is small, it doesn’t need much structure. But as it grows, it does. As we start to add this structure, how do we know that the structure will scale? Is it the correct structure? Perhaps the askers of the question are anxious about whether the structure will last. Will they have to re-structure it soon? Will that be hard?
I, on the other hand, think of scaling as a kind of phase shift. Moving to a higher scale necessitates a reorganization. Clojure lends itself to easy reorganization. But maybe they don’t know that. Maybe they’re worried about structuring themselves into a corner when that’s not really a problem.
I hope this starts a big discussion. I’ve thought long enough about it. I realized that I probably should have started this discussion sooner, to get more varied input. What do you think? Have you been asked this question?