CLJ Commons: Building a formatter like gofmt for Clojure

@kkinnear To be clear, I find zprint a very valuable tool. Thank you so much for that.

About clojure.test/are formatting, it should ideally depend on the first argument. That’s a template for the rest of the form. A couple of examples:

(are [x y] (= x y)
  2 (+ 1 1)
  4 (* 2 2))

(are [x y z] (= x y z)
  2 (+ 1 1) (- 4 2)
  4 (* 2 2) (/ 8 2))

Notice how in the second example there should be 3 forms on each line if possible because the first argument of are is a vector with 3 elements.
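To make the idea concrete, here is a minimal sketch of grouping the rows by the size of the binding vector. `format-are` is a hypothetical helper written for this illustration, not part of zprint or any other formatter. It assumes the form has already been read as data, and it ignores line width, comments and metadata.

```clojure
(require '[clojure.string :as str])

(defn format-are
  "Hypothetical helper: renders a quoted `are` form as a string with one
  row of values per line. The row width is taken from the binding vector."
  [[_are argv expr & args]]
  (let [width    (max 1 (count argv))          ; guard against an empty vector
        rows     (partition-all width args)    ; a short last row is kept as is
        row->str (fn [row] (str "  " (str/join " " (map pr-str row))))]
    (str "(are " (pr-str argv) " " (pr-str expr) "\n"
         (str/join "\n" (map row->str rows))
         ")")))

;; Prints the three-column layout from the second example above.
(println
 (format-are '(are [x y z] (= x y z) 2 (+ 1 1) (- 4 2) 4 (* 2 2) (/ 8 2))))
```

A real formatter would also have to decide what to do when a row does not fit on one line, which this sketch does not attempt.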

Thanks. Isn’t that a pain. Not only does the format vary, but I have nothing that does triples or more of things. Pairs, sure. Other counts, no. Perhaps the best short-term approach would be to somehow leave the (are ...) the same – just skip formatting them altogether. Though even that is an enhancement. I’ll give this some thought. Thanks for the insight.

Some thoughts from an old-time Lisper who sometimes gets frustrated by what he sees as people ignoring established Lisp-y practice…

These are important for me:

I think the above things would be massively useful, and I believe, or at least hope, that most people would be happy with them.

Compared to other Lisps, there’s one thing that often causes me problems: the indentation of let forms, cond forms, etc, where the syntax involves pairs of things that are not bracketed. When the second item of a pair starts on a newline, I’d like a little extra indentation (probably two spaces). IMO that would be a significant improvement on what I’m used to, and I hope it would be something most people would be happy with.
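As an illustration of the layout described above, here is a small made-up function in which the second item of each pair starts on its own line and gets two extra spaces. The function name and the data are invented for this example; only the indentation of the let and cond pairs matters.

```clojure
(defn order-summary
  "Hypothetical example function; only the layout of the pairs matters."
  [orders]
  (let [total-price-of-all-orders
          (reduce + (map :price orders))
        most-expensive-single-order
          (first (sort-by :price > orders))]
    (cond
      (zero? total-price-of-all-orders)
        :nothing-to-pay
      (> total-price-of-all-orders 100)
        [:large-total (:id most-expensive-single-order)]
      :else
        [:small-total (:id most-expensive-single-order)])))
```

With the extra indentation it is easier to see which lines are names or tests and which are the values that belong to them, without needing blank lines between pairs.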

For any rules beyond that, there are probably always going to be times I would want to break them.

I do think it would be useful to have options for some things, especially when working on projects where there may be a lack of understanding of or consensus on good style.

I wonder if it would be possible to have options that could be chosen independently, but to also have a set of defined formatting “levels” that progressively define stricter and more contentious sets of options. Then people could choose different levels depending on the needs of different projects. I’d like to be able to say “we’ll go for level-N formatting” for this project rather than fighting over lots of options. There would probably be a big fight over how many levels to have and what goes into what level, but that would be a big fight done just once when designing this new formatter rather than a big fight for every new Clojure adopter or project.
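A rough sketch of how levels and independent options could fit together, assuming options are plain maps. The option names and the contents of each level are invented for illustration and are not taken from any existing formatter.

```clojure
(def levels
  "Hypothetical levels. Each one adds options on top of the previous ones,
  going from least to most contentious."
  [{:fix-indentation? true}            ; level 1
   {:trim-trailing-whitespace? true}   ; level 2
   {:collapse-inner-spaces? true}      ; level 3
   {:rewrite-line-breaks? true}])      ; level 4

(defn options-for
  "Returns the options for `level` (0 means nothing enabled), with any
  independently chosen `overrides` applied last."
  [level overrides]
  (merge (reduce merge {} (take level levels))
         overrides))

;; A project picks level 3 but opts out of one option.
(println (options-for 3 {:collapse-inner-spaces? false}))
```

A project would then only need to record a level number plus its few overrides, which keeps the per-project discussion small.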

EDIT:
Removed mention of line breaks, because it’s more general than that. I think by default whitespace shouldn’t be changed except to change indentation and, perhaps, to remove trailing whitespace and empty lines at the end of a file. That’s because I think it’s sometimes good to use whitespace in non-standard ways to improve readability. There could be options to do further things with whitespace.
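The conservative whitespace default described here (beyond indentation) could look roughly like the following sketch. `tidy-whitespace` is a hypothetical name, and the function works on raw text, so it assumes there are no multi-line string literals whose trailing spaces matter.

```clojure
(require '[clojure.string :as str])

(defn tidy-whitespace
  "Hypothetical helper: removes trailing spaces and tabs on every line and
  collapses empty lines at the end of the text into a single newline.
  All other whitespace, including blank lines inside the file, is kept."
  [text]
  (-> text
      (str/replace #"(?m)[ \t]+$" "")   ; trailing whitespace on each line
      (str/replace #"\n+\z" "\n")))     ; empty lines at the end of the file

(print (tidy-whitespace "(defn f [x]  \n\n  (inc x))\t\n\n\n"))
```

Note that the blank line in the middle of the input survives, since deliberate whitespace inside the file is left alone.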

Most of the forms that I skip fall into two categories. The first is triples, which in general are test helpers like juxt/iota or metosin/testit; the other is large forms that take a while to format. I think triples have enough usage to be considered in the formatting options, but they need to be combined with arg1 and arg2 at least.

Interesting that you mentioned that. I have exactly the same issue. I suggested the idea of some indentation for these cases a long time ago for the community standards — and the response was that people wouldn’t be happy with that. There are a variety of ways that people get around this confusion — blank lines being the most common, though some kind of indentation is also pretty common.

This issue was the original reason that I modified several code formatters and that led me to ultimately write zprint. zprint does exactly what you suggest by default. It can be turned off, of course. Given your statements about changing line-breaks, I suspect zprint isn’t going to be your tool of choice since it totally ignores existing line breaks, but it does indent the second of a pair of things if they would otherwise end up starting in the same column.

I suggested the idea of some indentation for these cases a long time ago for the community standards — and the response was that people wouldn’t be happy with that. There are a variety of ways that people get around this confusion — blank lines being the most common, though some kind of indentation is also pretty common.

Interesting thought that people wouldn’t be happy with it. I can’t really see any downsides. Oh well.

I tend to use blank lines or a “;;” on a line by itself to separate things, but I don’t like either of these.

I collected my thoughts here: http://tonsky.me/blog/clojurefmt/. I also propose simpler formatting rules that do not rely on any custom formatting for specific forms.

I like your suggestion! If your suggested style is adopted, I think I’ll start writing multiline and/or forms with every argument on its own line. It should look nice:

(or
  (dog? x)
  (cat? x))

May I also suggest removing extra spaces between forms on one line?

(+  1 2) 
;; becomes
(+ 1 2)

{:short              1
 :very-very-long-key 2}
;; becomes
{:short 1
 :very-very-long-key 2}
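A naive sketch of that rule, working line by line on text. `collapse-inner-spaces` and `collapse-all` are hypothetical names. The sketch keeps leading indentation and collapses every later run of spaces, which means it would also change spaces inside string literals and comments. A real formatter would need to work on parsed forms instead.

```clojure
(require '[clojure.string :as str])

(defn collapse-inner-spaces
  "Hypothetical helper: keeps the leading indentation of one line and
  collapses every later run of two or more spaces into a single space."
  [line]
  (let [[_ indent body] (re-matches #"( *)(.*)" line)]
    (str indent (str/replace body #" {2,}" " "))))

(defn collapse-all
  "Applies `collapse-inner-spaces` to every line of `text`."
  [text]
  (str/join "\n" (map collapse-inner-spaces (str/split-lines text))))

(println (collapse-all "{:short              1\n :very-very-long-key 2}"))
```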

Nah. I like to format let bindings and map keys this way. It’s easier to read.

It sounded totally wild and crazy to me at first, but having thought about it briefly, it starts to make sense. I think I’ll try to make an experimental formatter for VS Code that works like this and just see what it feels like to use. (My experience with the indent syntax of VS Code is that it is not very useful for this task, but I might be wrong.)

Agree! With the addition that I think that any rules for folding the paren trail and deleting newlines should be relaxed while you type and can be applied more strictly by an explicit format command and on save.

However, your article mentions neither the paren trail nor empty lines. Does that mean you think those should not be part of the job description for clojurefmt?

Have you used the Format and Align Current Form command in Calva? (It’s experimental, but often does the right thing, IMO, and when it doesn’t, it is just an undo away to restore things.)

Phew… Thank you!

Please let me know! I’ll switch to it immediately.

I have no preference about this. Relaxed on edit and corrected on commit sounds good to me.

It does indeed.

Does this mean one space for vectors and maps and two for sets, or is it something else?

yes

(this is just a padding because “post must be at least 20 characters”)

And this is why formatters eventually have configuration.

Another point to think about: I find it strange that no one has challenged the premise that a formatter like gofmt is really useful.

I’d argue it isn’t. I think what people want is unified formatting within a code base. And for the formatting rules to be taken care of for them.

At least that’s what I’d want. And that’s the appeal of gofmt for me.

If there were a formatter that auto-updated itself with ever more support for new macros and forms, that was easy to enforce at a repo-wide level, that had a .format file to override rules and add rules for new macros and functions, and that was fast and never failed, I think it would provide all the benefits.

The rules could still be simple, and I think they should be. I like tonsky’s rules as a starting point, in that we need rules that never fail, and it’s nice to stay close to how your editor normally indents new lines. But I think it could add more readable versions of certain macros, perform newline and spacing normalization, etc., and also provide configuration. I’m not sure what the downsides to that would be.

The easiest way to resolve that debate is to create a zero-config formatter and see if people use it. The argument in favor of zero-config formatters is that the specifics of formatting are less important than consistency. If that assessment is correct (and it seems to be in the case of go and javascript), then people will use it regardless of whatever specific decisions it makes.

As a Clojure developer, I’m deeply interested in having a standard formatting tool similar to gofmt. I think that it would make unfamiliar code more accessible, and provide a better on-ramp for new users.

Actually, we need a formatter for blogs.

Every single blog post in the world must look exactly the same, for the greater good.

The point of making every blog look precisely the same is not to produce a style that everyone would love. That’s impossible. The point is to enforce a style that will free people from arguing and making style decisions at all.

Want to use that sexy blue background colour you’ve chosen for your blog? Nonono. It’s driving me mad with its filthy inconsistency. Chaos. Mayhem. This cannot stand.

Want to have a line-height of 1.3 instead of 1.4? Heresy! You must abandon your personal aesthetic preferences and submit to The Collective, for The Greater Good.

You will feel uncomfortable in the beginning, but pretty soon you’ll notice you’re no longer burdened by having a will of your own, and the transition will be complete.

Should your sinful desires ever stir again, The Collective will place you in a rehabilitation center not too far from home, where you’ll learn to embrace The Greater Good once more.

Note: I updated the rules a bit to address Parinfer breakage

Looks much better than I expected from such simple rules. I would like to try them out against some of my own code.

The only obvious “problem” or missing feature in your proposal is that it only works when the newlines are already in the correct place for seqs and vectors. This may not be a problem when you’re dealing with human-written code, but if we want a formatter that takes arbitrary code as data and outputs readable text, it would need to put newlines in the correct spots, which AFAICS requires more complicated rules.