Sorry this post is so long … thanks to everyone who actually reads it!
In another thread, @seancorfield mentioned that there’s a sentiment around these days that one should prefer using maps over records. I expressed surprise, and @didibus subsequently wrote a series of posts (starting with this one: What is 2021 recommendation for Specs? - #19 by didibus) that explained why, when one is feeding data to and from various sources, it’s better to use maps so that it’s easy to adjust to changes in those data sources. You can use a :type key (or a key with some other name) in each map with particular keyword values to specify the expected fields, and you can then test that the data is in the right form at specified interface points using spec so that it will fail early rather than biting you later. These posts were great! (I’m leaving out a lot of valuable details.)
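To make that concrete for myself, here is a minimal sketch of the map + :type + spec idea as I understand it. It is my own illustration, not code from didibus's posts: the names entity-spec, ::entity, and ingest, and the :person fields, are all made up for the example, and I'm assuming clojure.spec.alpha with a multi-spec that dispatches on :type.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical field specs; the names are illustrative only.
(s/def ::type keyword?)
(s/def ::name string?)
(s/def ::age nat-int?)

;; The :type value selects which fields a map is expected to have.
(defmulti entity-spec :type)

(defmethod entity-spec :person [_]
  (s/keys :req-un [::type ::name ::age]))

(s/def ::entity (s/multi-spec entity-spec :type))

(defn ingest
  "Hypothetical interface point: returns m unchanged if it conforms,
  otherwise throws right away with spec's explanation attached."
  [m]
  (if (s/valid? ::entity m)
    m
    (throw (ex-info "Invalid entity" (or (s/explain-data ::entity m) {})))))

;; Conforming data passes through untouched.
(ingest {:type :person :name "Ada" :age 36})
```

A map whose :type has no registered method, or whose fields don't match, should be rejected at ingest rather than somewhere downstream.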
I said then, and still say, that this all makes sense to me for the kind of data processing that didibus was talking about. And I am still uncomfortable with the advice that one should default to maps over records. I think records are perfectly good, and that they should be part of the “there’s more than one way to do things” that any good language such as Clojure allows. I think that in fact, moving between maps and records is often easy, so I’m not that worried that people who would benefit from records will lose that benefit if they’re taught to use maps instead: they can easily switch when they see a benefit to it. I do have a small worry that I’ll express at the end of this post.
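Here is a small sketch of what I mean by moving between maps and records being easy. The Person record and its fields are made up for illustration; map->Person is the factory function that defrecord generates.

```clojure
;; Hypothetical record, for illustration only.
(defrecord Person [name age])

;; Map -> record: defrecord generates a map->Person factory.
(def as-record (map->Person {:name "Ada" :age 36}))

;; Record -> map: pour the record's entries into a plain map.
(def as-map (into {} as-record))

;; Records answer to the usual map functions.
(:name as-record)            ; keyword lookup
(assoc as-record :age 37)    ; still a Person
(assoc as-record :email "x") ; extra keys are allowed, still a Person
```

So code that only reads fields with keywords or uses assoc mostly doesn't need to care which of the two it was handed.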
Something was bugging me about the “default to maps” advice, and I think I figured out what it is. I offer this in the spirit of clarification, as much to let others help me understand better as for me to (maybe) help others understand how I’m seeing things. It may be that there are things that I am just misunderstanding.
First, I think that in the kind of scenarios that didibus described, it absolutely makes sense to me to use maps and spec as didibus indicated. Yes.
However, in my work, and in many contexts in which Clojure is, or could be, used, I believe worrying about changing data sources/sinks is not a big issue. I’ve used spec once, in order to learn about it, for data validation in a ClojureScript form. I could probably use it more than I have, but not using it hasn’t caused much trouble. Using spec in my code would actually be more trouble. I don’t write as much code as some folks here, but still, I don’t think that maps by default + spec is optimal as a general rule, and I don’t think that advice that makes sense for a particular kind of application ought to be considered general advice. Of course, you could still use maps all of the time, even if you didn’t think it was worth checking your data with spec.
In the other thread, I explained that I liked records because they partially documented the data structure that I expect. Here’s another part of what bothers me about the map-default strategy when it’s not called for by long-term data management needs. If I define a record and then mistype its name, the compiler will catch it. If I use maps with a :type key, and I mistype a keyword value, or even mistype :type, the compiler won’t care. Of course, because Clojure is dynamically typed, there are lots of things the compiler doesn’t catch, and you just have to know that and deal with it. But if I’m using maps with :type keys instead of records, spec becomes much more important. It’s essentially doing what the Clojure compiler does with records. (Spec can do a lot more; I’m just talking about validating type keyword values in maps.) If you’re already spec’ing your data at carefully chosen points, going from records to maps might not be a big deal, and then you get the flexibility that didibus described.
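A small sketch of the difference I have in mind. The Person record, the misspelled ->Persn, and the person? helper are all invented for the example. I wrap the misspelled constructor call in eval only so that the rest of the snippet still loads; typed directly into a file, the same form would stop compilation.

```clojure
;; Hypothetical record, for illustration only.
(defrecord Person [name age])

;; Misspelled constructor name: the compiler cannot resolve ->Persn,
;; so this form is rejected before it can ever run.
(def record-typo-result
  (try
    (eval '(->Persn "Ada" 36))
    :compiled
    (catch Throwable _ :rejected-by-compiler)))

;; Misspelled :type value, and a misspelled :type key: both are
;; perfectly legal maps, so nothing complains.
(def value-typo {:type :persn :name "Ada" :age 36})
(def key-typo   {:tpye :person :name "Ada" :age 36})

(defn person?
  "Hypothetical hand-rolled type check on the :type key."
  [m]
  (= :person (:type m)))

(person? value-typo) ; quietly false
(person? key-typo)   ; quietly false
```

To be fair, the compiler only helps with the record's name here. A mistyped field keyword such as (:nmae some-person) just returns nil for records and maps alike.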
But if I don’t have to deal with spec, and I don’t have to worry about changing data sources/sinks, then using maps with a :type key feels very low-level. It means I’m constructing types with no help from the compiler, and then I have to do my own type checking using spec. Clojure is then functioning as a lower-level language than it could be. I’m not letting the language do the work, and I’m making my work harder, rather than easier.
I’m in favor of Clojure appealing to a broad audience. If new users are taught that “this is how you do it in Clojure” until you have advanced knowledge, and that way of doing things is more difficult, less elegant, more involved, and more bug-prone for their applications than the alternative strategies being discouraged, then those users might be less likely to continue with Clojure. If someone doesn’t have the kind of data management and validation needs that benefit from the default map + spec strategy, they might feel that Clojure is a little less appealing, and go to another language (e.g. Python). So I’m in favor of new users learning about the map + spec strategy (which I didn’t know about until the past week), but I’m not in favor of them being told that that’s the way everything should be handled. I don’t think it’s a big deal either way, but that’s what I’m thinking.
Appendix:
Maybe the reason that it’s good to use what I described as “lower-level” strategies with the kind of data management context that didibus clarified is that it’s a context where a lower level matters. If the structure of your data can change, then you in effect have to deal with a lower level; you can’t just specify the data structures once and for all, and then forget about the details. You have to build in flexibility so that you can respond to internal changes in structure. Not sure if this is the right way to put things.