Clojure's keyword namespacing convention Considered Harmful

https://vvvvalvalval.github.io/posts/clojure-key-namespacing-convention-considered-harmful.html

Don’t do that. Don’t treat Clojure keywords as composite data structures. This is accidental complexity waiting to happen. Programmatic names are meant for humans to read, not for programs to interpret. Changing an attribute name should not be able to change the behaviour of your program. In Hickeyian terms: you’d be complecting naming with structure.

I include this in my personal list of Clojure don’ts. I’ve committed this mistake once or twice for the sake of conciseness (e.g., :a/b instead of [:a :b] or {:something :a :other-thing :b}), only to regret it later for one reason or another.

Speaking of portability, do you have particular opinions about keywords vs strings as constants? I’ve recently converted all constants from keywords to strings (e.g., converted the values of :user/language from :en-US to "en-US"). With this change, I could eliminate all the code that keywordizes strings when I get them from browsers or stringifies keywords when I write them to the database. Furthermore, it gave me a system-wide invariant: keywords never show up in entity values. The only downside is that strings, unlike keywords, don’t implement IFn, but that seems to me a small price for removing so much accidental complexity.
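
To make the IFn trade-off concrete, here is a minimal sketch. The entity and the greeting tables are hypothetical and exist only for illustration; the point is just what changes at lookup sites when constants become strings.

```clojure
;; Hypothetical entity: attribute names are keywords, values are plain strings.
(def user {:user/id 42 :user/language "en-US"})

;; Keyword constants implement IFn, so they can look themselves up in a map.
(def greetings-by-kw {:en-US "Hello" :fr-FR "Bonjour"})
(def kw-lookup (:en-US greetings-by-kw))

;; String constants do not implement IFn, so the lookup goes through get
;; (or through the map itself, since maps are functions of their keys).
(def greetings {"en-US" "Hello" "fr-FR" "Bonjour"})
(def get-lookup (get greetings (:user/language user)))
(def map-lookup (greetings (:user/language user)))
```

All three lookups return the same greeting; only the call shape differs, and the string version needs no conversion when the value arrives from a browser or goes to the database.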

It’s interesting, as this is often proposed as a means to avoid deeply-nested structures…

I’m afraid you misunderstood me. I wasn’t talking about using namespaced keywords as map keys. I’m simply opposed to expressing arbitrary two-component data as a single namespaced keyword.

When we use :user/id to express the user’s id, it’s a pointer to one definite thing. We could decompose it to the namespace and name parts, but we can consider it as one concept.

What I’d discourage is using :en/US to mean “English as used in the United States”. This is bad because you compress two separate ideas (language and region) into one thing. In such a data form, one cannot express “English (standard)” or “English spoken by Hispanics in the United States.” If we instead treat each idea separately, expressing these is as ordinary as writing the maps {:language "en"} and {:language "en" :speakers "Hispanics" :region "US"}.
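
A rough sketch of the difference, using only the values from the paragraph above (the helper same-language? is a hypothetical name added for illustration):

```clojure
;; Composite keyword: the two ideas can only be recovered by taking the keyword apart.
(def locale-kw :en/US)
(def kw-language (namespace locale-kw))
(def kw-region (name locale-kw))

;; Separate attributes: each idea is optional and more can be added later.
(def standard-english {:language "en"})
(def us-hispanic-english {:language "en" :speakers "Hispanics" :region "US"})

;; Hypothetical helper: comparing one idea does not require parsing a name.
(defn same-language? [a b]
  (= (:language a) (:language b)))
```

With the map form, "English (standard)" is just the map without a :region key, which the single keyword has no way to say.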

Ok, I got you now…

“But I can just write a key-translation layer at the edge of my Clojure program…”
… and then you’d lose the main benefit of namespacing, which is the ability to track a data attribute across your entire system rather than just one component of it.

  1. Sometimes there are other reasons for translating key names, like storage size. For example, longer key names lead to bigger storage size for JSON (string) columns in SQL Server and even JSONB columns in Postgres.
  2. Translating keys does not always mean you lose the ability to track it across your system. Sometimes it just means you search via a regex (for example, accepting either _ or /.)
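
As a sketch of point 2, one way to build such a search pattern is shown below. The function name attr-pattern is hypothetical, and the example assumes Clojure on the JVM because it uses java.util.regex.Pattern/quote to escape the two parts.

```clojure
;; Hypothetical helper: builds a regex that matches an attribute whether the
;; namespace and name are joined with "/" (Clojure side) or "_" (translated side).
(defn attr-pattern [ns-part name-part]
  (re-pattern (str (java.util.regex.Pattern/quote ns-part)
                   "[/_]"
                   (java.util.regex.Pattern/quote name-part))))

(def user-id-pattern (attr-pattern "user" "id"))

;; Finds the attribute in Clojure source text and in translated SQL/JSON text.
(def in-clojure (re-find user-id-pattern "(:user/id entity)"))
(def in-sql (re-find user-id-pattern "SELECT user_id FROM users"))
```

The same pattern could be handed to a text search tool, though a regex this loose will also match unrelated names that happen to contain the same characters.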

Don’t do that. Don’t treat Clojure keywords as composite data structures. …

As a basic example of how this may break, consider that it’s normal and expected to find in the same entity keys with different namespaces, e.g., :person/first-name and :myapp.user/signup-date.

Why would checking the name/namespace of keys in the above map necessarily cause problems?

I commented on Val’s blog post, but I’ll repeat it here:

If you have a name like :customer_invoice_id you cannot tell whether it “belongs” to a customer as :customer/invoice_id or to an invoice as :customer_invoice/id and that is a useful distinction to have in your code. It’s why next.jdbc defaults to :table_name/column_name for keywords in result set data.
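
For illustration, here are the three spellings from the paragraph above side by side. This only uses core functions on keywords and makes no claim about how next.jdbc builds them internally.

```clojure
;; Flat name: no qualifier, so the owner of the column is ambiguous.
(def flat :customer_invoice_id)
(def flat-ns (namespace flat))

;; Qualified names: the qualifier says which table the column belongs to.
(def customer-col :customer/invoice_id)
(def invoice-col :customer_invoice/id)

(def customer-ns (namespace customer-col))
(def invoice-ns (namespace invoice-col))
```

The flat keyword has a nil namespace, while the two qualified keywords report "customer" and "customer_invoice" respectively, which is exactly the distinction the flat name loses.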

But I do agree that slavishly translating snake_case to kebab-case and back is not worthwhile if it adds no semantic value, i.e., don’t change :table_name/column_name to :table-name/column-name for no reason other than aesthetics.

It’s interesting to compare the default Clojure to JSON behavior in Cheshire and clojure.data.json:

;; Aliases assumed: (require '[cheshire.core :as json] '[clojure.data.json :as j])
;; Cheshire default
user=> (json/generate-string {:a/b 42 :c/d "Sean"})
"{\"a/b\":42,\"c/d\":\"Sean\"}"
;; clojure.data.json default
user=> (j/write-str {:a/b 42 :c/d "Sean"})
"{\"b\":42,\"d\":\"Sean\"}"
;; Cheshire with just the name portion
user=> (json/generate-string {:a/b 42 :c/d "Sean"} {:key-fn name})
"{\"b\":42,\"d\":\"Sean\"}"
;; clojure.data.json with the qualified name
user=> (j/write-str {:a/b 42 :c/d "Sean"} :key-fn #(subs (str %) 1))
"{\"a\\/b\":42,\"c\\/d\":\"Sean\"}"

I prefer clojure.data.json's default behavior (and notice that it also explicitly escapes / for JSON whereas Cheshire does not).

This problem can be avoided by adding one more underscore between the namespace and the name. E.g., :customer_invoice__id.

As I said in a comment on Val’s blog, that is a horrible suggestion, in my opinion. Scanning code and expecting people to visually distinguish _ from __ in the middle of identifiers is a very error-prone way to work around a core language feature that Clojure has had for years and that has a very intentional, well-designed usage.

I can easily conceive of situations where it would be useful to do something like…

(filter #(= (namespace %) "foo") some-sequence-of-namespaced-keywords)

Those situations would be made much more painful and error-prone if my predicate function had to wrestle with parsing between the underscores in strings.
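
To compare the two, here is a small sketch. The key lists and the flat-ns helper are made up for illustration, and the flat keys follow the double-underscore convention suggested earlier in the thread.

```clojure
(require '[clojure.string :as str])

;; Namespaced keywords: the language already knows where the qualifier ends.
(def ks [:foo/id :foo/name :bar/id])
(def foo-keys (filter #(= (namespace %) "foo") ks))

;; Flat keys with a "__" separator: the qualifier has to be parsed out by hand.
(def flat-ks [:foo__id :foo__name :bar__id :foo_bar__id])

;; Hypothetical parser. Note that a key with no "__" at all is returned whole,
;; so a malformed name is silently treated as if it were a qualifier.
(defn flat-ns [k]
  (first (str/split (name k) #"__" 2)))

(def flat-foo-keys (filter #(= (flat-ns %) "foo") flat-ks))
```

Both filters pick out the "foo" keys, but the second one depends on a hand-written parser whose edge cases every caller has to remember.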

It’s also not clear to me how using namespaced keywords degrades the readability of my programmatic names.

Thought I’d link to the last time this came up on here, as no one else had:

https://clojureverse.org/t/should-we-really-use-clojures-syntax-for-namespaced-keys

For me I think there is a lot of power in namespacing keys, and in keeping the names the same everywhere (e.g. as defined in a data dictionary or schema elsewhere, and allowing quick tracing of an attribute through code and logs.) But there is also power in having syntactic support in the language to shorten those names where (in context) there’s no ambiguity. So no need to be too strong on this. “sales_customerInvoice_id” in your DB and JSON APIs can be unambiguously translated to :sales.customer-invoice/id in Clojure I think, with little loss of meaning? In the end it’s what it’s called in the APIs and DB that counts.
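
A minimal sketch of that translation, under stated assumptions: wire names are underscore-separated segments, each segment is camelCase, the last segment is the name, and the earlier ones form the dotted namespace. The function names are hypothetical and a real system would need to decide how to treat names that break this shape.

```clojure
(require '[clojure.string :as str])

(defn camel->kebab [s]
  (str/lower-case (str/replace s #"([a-z0-9])([A-Z])" "$1-$2")))

(defn kebab->camel [s]
  (str/replace s #"-([a-z])" (fn [[_ c]] (str/upper-case c))))

;; "sales_customerInvoice_id" becomes :sales.customer-invoice/id
(defn wire->kw [s]
  (let [parts    (map camel->kebab (str/split s #"_"))
        ns-parts (butlast parts)]
    (if ns-parts
      (keyword (str/join "." ns-parts) (last parts))
      (keyword (last parts)))))

;; :sales.customer-invoice/id becomes "sales_customerInvoice_id"
(defn kw->wire [k]
  (let [ns-parts (some-> (namespace k) (str/split #"\."))]
    (str/join "_" (map kebab->camel (concat ns-parts [(name k)])))))
```

Under those assumptions the mapping round-trips, so the attribute keeps one identity even though its spelling differs between the wire format and the Clojure code.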

Tend to agree that the name is just a longer structured name and the “first part” should not map to entities. But I am a (long-toothed) student of such matters.

Then again if your DB and API wire formats don’t support this, that could be seen as a failing of your DBs and API wire formats :wink: I wonder whether the term “lisp-case” is right? Perhaps “edn-case” or just “edn standard for namespaced keys” is better?

I guess accepting a lowest-common-denominator convention that buys universality at the cost of well-understood structural semantics and the related syntactic power is not an advance in general? The syntactic power to shorten where there’s no ambiguity is important, I think. After all, using fully qualified names all the time in real life would get pretty vexing! Only my Mother addresses me that way (occasionally). Similarly, try getting business users (and developers) to stop using the quick off-hand term they’ve used for years in their (bounded) context about a thing and start fully qualifying everywhere, including stories and documentation. They’re really not going to do it, except in those specific conversations where they start crossing boundaries and there is an ambiguity.

Sorry for the ramble. I think it is an interesting topic. “Naming things …” It comes down to language itself, and how to express ourselves idiomatically in context while being globally precise in the words we use and their semantics, if we can.

I think the usual name for this casing is kebab-case.