Should we really use Clojure's syntax for namespaced keys?


#21

My knowledge of Spec is still a bit superficial, but I think this is what :req-un and :opt-un are for?

You can definitely check unqualified keywords using spec. But namespace-qualified keys get special treatment when used in maps: once registered, their spec is enforced globally. Consider the following example:

(require '[clojure.spec.alpha :as s])
(require '[clojure.spec.test.alpha :as stest])

(s/def :mc.cust/first-name string?)
(s/def :mc.contact/email string?)

(defn greet [customer]
  (str "Hello, " (:mc.cust/first-name customer)))

(s/fdef greet
        :args (s/cat :customer (s/keys :req [:mc.cust/first-name])))

(stest/instrument `greet)


(greet {:mc.cust/first-name "Bob" :mc.contact/email {}})
;; Throws the following:
ExceptionInfo Call to #'user/greet did not conform to spec:
In: [0 :mc.contact/email] val: {} fails spec: :mc.contact/email at: [:args :customer :mc.contact/email] predicate: string?
clojure.core/ex-info (core.clj:4739)

Our spec’ed function only declared :mc.cust/first-name, and really that is the only thing it cares about. But because :mc.contact/email is a fully qualified keyword with a registered spec, spec enforces that if it is present in the map argument, it has to be a string.

I think it’s a really smart way of keeping dynamicity in the system. You do not have to provide for each function spec all the :opt-un [...] for every possible piece of information that might flow through your system. But you still maintain a strong integrity of the value attached to your global names, and hopefully you get an error message closer to where you corrupted your data.

As I said before, maybe it’s just a matter of convincing spec to treat “all-terrain” keywords as fully-qualified to regain that property.


#22

@chpill In your opinion, what do I lose if I transform your example to the following?

(require '[clojure.spec.alpha :as s])
(require '[clojure.spec.test.alpha :as stest])
(require '[my-company.specs :as msp])

(s/def ::msp/mc_cust_first_name string?)
(s/def ::msp/mc_cust_email string?)

(defn greet [customer]
  (str "Hello, " (:mc_cust_first_name customer)))

(s/fdef greet
        :args (s/cat :customer (s/keys :req-un [::msp/mc_cust_first_name])))

#23

@vvvvalvalval calling (greet {:mc_cust_first_name "Bob" :mc_cust_email 42}) won’t raise any error here. We would need to provide an :opt-un [::msp/mc_cust_email] to detect an issue early.
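
To make the difference concrete, here is a minimal sketch using s/valid? instead of instrumentation. The spec names are written out in full (:my-company.specs/...) so the snippet does not depend on the my-company.specs namespace existing; req-only and req-and-opt are names made up for this illustration.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def :my-company.specs/mc_cust_first_name string?)
(s/def :my-company.specs/mc_cust_email string?)

;; Only the required unqualified key is declared.
(def req-only
  (s/keys :req-un [:my-company.specs/mc_cust_first_name]))

;; The email key is additionally declared as optional.
(def req-and-opt
  (s/keys :req-un [:my-company.specs/mc_cust_first_name]
          :opt-un [:my-company.specs/mc_cust_email]))

(def customer {:mc_cust_first_name "Bob" :mc_cust_email 42})

(s/valid? req-only customer)    ;; => true, the bad email goes unnoticed
(s/valid? req-and-opt customer) ;; => false, the email is now checked
```

With unqualified keys, spec has no way to connect :mc_cust_email to a registered spec unless the s/keys form names it.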

The example here isn’t very interesting because the function does not return a map, but we write and compose functions that assoc, dissoc, and update on maps all the time. They only deal with a subset of the possible keys, and do not assume much about the rest. That brings us great composability, and also great clarity thanks to the -> thread-first macro.

Being able to continue using that style while enforcing globally that pieces of information are valid wherever they may be is what spec is about, I think. You lose that by not using classical namespace-qualified keywords in the maps you pass around your program.
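
A rough sketch of what that looks like in a small pipeline. add-email is a hypothetical step invented for this example; the point is only that (s/keys) with no arguments still validates every registered qualified key found in the map.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def :mc.cust/first-name string?)
(s/def :mc.contact/email string?)

;; Hypothetical pipeline step: it only touches the one key it cares about.
(defn add-email [customer email]
  (assoc customer :mc.contact/email email))

(def good (-> {:mc.cust/first-name "Bob"}
              (add-email "bob@example.com")))

(def bad (-> {:mc.cust/first-name "Bob"}
             (add-email 42)))

;; No keys are listed, yet registered specs for qualified keys are enforced.
(s/valid? (s/keys) good) ;; => true
(s/valid? (s/keys) bad)  ;; => false
```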


#24

Oh I see. Well, as I said above, I believe that this is a limitation of Spec, not of the program. That’s spec saying “I will encourage you to use Clojure’s convention for namespacing keys and I refuse to cooperate with a system that doesn’t use this convention.”

This could be solved by allowing spec to accept non-Clojure-namespaced keys. Maybe with an s/def-un macro:

(s/def-un :mc_cust_first_name string?)

This way the fact that the key needs to be unique would still be very ‘in the face’ of the user.


#25

I believe that this is a limitation of Spec

There’s no such restriction. The name of a spec is not the same as the key in an associative data structure. A spec must be registered under a namespaced keyword, but the data being specced need not use one. You can use s/keys with :req-un and :opt-un for that.

So you’d have:

(s/def :customer/customer_name string?)
;; Spec names must be namespace-qualified, so the map spec is :customer/customer.
(s/def :customer/customer (s/keys :req-un [:customer/customer_name]))

Which would spec the following associative:

(s/valid? :customer/customer {:customer_name "John Doe"})
;; => true

So as long as your other systems can support a colon as the first character, you can keep the names the same across Clojure. You’ll still need to coerce the keyword into a string and back at the boundary though.

There’s actually also a way to do this with string keys or any other using s/keys* I was told, but I haven’t tried it. That way you could even keep the type a string in Clojure, meaning you wouldn’t even need to coerce the types back and forth.

My recommendation, if you are really interested, would be to rally around JSON and model your data inside Clojure so that it is always valid JSON; that seems like it’ll give you the biggest reach. If you need more powerful modeling than JSON affords, I would look into Transit or Ion as the next level up; both have good reach and compatibility across languages.


#26

you can use s/keys with req-un and opt-un for that.

It seems to me though that @chpill just demonstrated that req-un and opt-un have limitations that req and opt don’t have?

This seems to be confirmed by Spec’s rationale: “Note that this cannot convey the same power to unqualified keywords as have namespaced keywords - the resulting maps are not self-describing.”

So as long as your other systems can support a colon as their first character, you can keep their name the same across Clojure.

I really don’t think the colon matters :slight_smile: it’s fine in practice if the key is :customer_first_name in Clojure and customer_first_name in JS / GraphQL / Postgres / ElasticSearch / whatever, they’re equivalent for most practical purposes (including text-based search, which is really what I want to emphasize here).


#27

I’d follow the conventions of the language or the format I’m serializing to.

For JS/JSON I’d convert namespaced keywords to camelCase with no hyphens or dots, strip out com.my-comp if it’s there. Convert to the Clojure convention if I ever got it back.
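
Roughly, a conversion along those lines could look like the sketch below. kw->camel and the "com.my-comp" prefix argument are illustrative assumptions, not an established convention, and the mapping is lossy: the namespace structure cannot be recovered from the camelCase key alone.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: drop a company prefix, then camelCase what is left.
(defn kw->camel [prefix kw]
  (let [full    (str (namespace kw) "." (name kw))
        dotted  (str prefix ".")
        trimmed (if (str/starts-with? full dotted)
                  (subs full (count dotted))
                  full)
        [head & tail] (str/split trimmed #"[.-]")]
    (apply str head (map str/capitalize tail))))

(kw->camel "com.my-comp" :com.my-comp.customer/first-name)
;; => "customerFirstName"
```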


#28

I’ve been there (I just went through a massive refactoring of our JS code to use namespaced keys, because keeping track of the data was harder and harder). That’s essentially Approach 1. I strongly recommend against it. Here’s what I took away from this experience:

  1. The benefits of having namespaced / globally-unique keys outweigh the convenience of following JS conventions for keys (or of having short keys, for that matter).
  2. What matters is the ability to identify a key at first sight without any more context, and to perform whole-system searches of the uses of a key.

#29

Just an off the wall thought here that might be obvious;
If your primary concern is searchability… consider thinking of the separator as a dot instead of an underscore or hyphen. :slight_smile:

$ ag foo.bar
foo.txt
1:foo_bar
2:foo-bar

baz.cljs
64: (let [m {:foo.bar/booz {"baz%?" 2}}]

Instead of searching for my.company/foo-bar or my_company_foo_bar, if you just search for
"my.company.foo.bar" you will find all references (assuming your search supports regex, which most do). A regex “.” matches any single character, so it doesn’t care whether you use underscores, hyphens, or slashes.

So maybe it’s worth thinking of the separator as . even if it’s not /shrug

For camel case it’s a bit harder to just think of it as . because you need .? to match:
$ ag foo.?bar
foo.txt
1:fooBar
2:foo_bar
3:foo-bar

Possible, but not fun.
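
The same idea can be checked from a REPL. This is just a sketch with made-up candidate strings; note that the camelCase pattern needs the (?i) flag here, because Java regexes are case-sensitive by default while ag uses smart-case.

```clojure
;; Made-up spellings of the same logical key.
(def candidates
  ["my.company/foo-bar"
   "my_company_foo_bar"
   "my-company-foo-bar"
   "myCompanyFooBar"])

;; "." matches any one separator character.
(filterv #(re-find #"my.company.foo.bar" %) candidates)
;; => ["my.company/foo-bar" "my_company_foo_bar" "my-company-foo-bar"]

;; ".?" makes the separator optional; (?i) ignores case for camelCase.
(filterv #(re-find #"(?i)my.?company.?foo.?bar" %) candidates)
;; => all four candidates
```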

I realize it is somewhat tangential to the discussion… I’m just offering this as a practical approach to surviving where various styles exist. :slight_smile:


#30

Thanks for the discussion. I am having exactly the same thoughts. For my purposes the benefits of namespaced keys are:

  1. Reducing complexity from ill-designed (“subjective”, “parochial”) nested document data and instead sharing flattened maps with longer keys. This avoids the problems of nested data, including duplication of the same data item at different levels, ambiguous “deep” merging policies, loss of transparency about which functions use which data items at the call site, and terrible Java support for nested data structures, full stop.

  2. Defining a catalogue of data items (perhaps even collecting their sources and usage) principally to give BAs back a data “schema” of the kind they used to understand and enjoy with older RDBMS data, so that they can define constraints and invariants about their data items formally and specifically.

In our company we have an ADR that all externalised (i.e. wire or stored) data use JSON convention camel case keys with translation to language conventions as required (Clojure, Java, JS), so I guess we chose 2 and lost the ability to grep the codebase in one for each item. Still for me the nesting thing is more of a killer as data items are constantly being rehashed into different names and structures for convenience in every given context anyway.


#31

I like idea 2, especially because in Clojure we already use the - pattern everywhere. But I really like the idea of defining a standard for how to externalize it. It doesn’t have to be that pretty, just expressive enough to preserve name uniqueness. I think it’s safe to say we can rely on what JSON supports (which in fact is full strings as keys, so we can put the entire keyword there as-is if we want; then it’s just a matter of how each client wants to deserialize it).

An extra advantage of keeping the full names in JSON (instead of throwing away the namespace) is that you could have a similar “spec db” on the client side, which could be used to transform that data into whatever format the reader wants. I think it’s bad that we lose information when we drop the namespace.


#32

Hey @vvvvalvalval,

That’s an interesting problem. I’ve encountered it in specific circumstances and found workarounds, but never had to address it generally. These are just some random thoughts.

One solution I’ve used is to just use fully qualified keywords as strings (pr-str them and drop the “:”) for JSON keys and Postgres column names. In Postgres, you have to use the quoted identifier, which allows anything except the null character. Then you can output them and read them back in with a simple call to keyword. This should also work generally with languages that use hashmaps.
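
As a small sketch of that round trip (kw->str and str->kw are names made up for this example):

```clojure
;; Print the keyword and drop the leading ":".
(defn kw->str [kw]
  (subs (pr-str kw) 1))

;; keyword splits the string at the "/" into namespace and name.
(defn str->kw [s]
  (keyword s))

(kw->str :org.clojureverse.user/first-name)
;; => "org.clojureverse.user/first-name"

(str->kw "org.clojureverse.user/first-name")
;; => :org.clojureverse.user/first-name
```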

I’m not a fan of #1. There are too many collisions.

For approach #2, this is how Clojure manages to use hyphens, question marks, and bangs in symbol names, yet still use those to generate valid Java classnames. There’s a function called clojure.core/munge that does this systematically. It’s a one-way function, though, which suits the Clojure compiler’s purpose.
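
For reference, a quick REPL sketch of what munge does to a few names, and why it can’t be reversed:

```clojure
(munge "first-name?") ;; => "first_name_QMARK_"
(munge "reset!")      ;; => "reset_BANG_"

;; One-way: two different inputs end up with the same output.
(= (munge "first-name") (munge "first_name")) ;; => true
```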

There could be something like munge, with similar conventions. One you might try is to convert . to _ in the namespace, and - to _ in the name. The / could become __ (double underscore).

It’s not perfect, but it corresponds to other conventions I’ve seen. The real issues are what to do with hyphens in the namespace and dots in the name. They’re less common, but that would only make the times they do occur harder to debug.
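
A minimal sketch of that convention, assuming every keyword passed in is namespace-qualified. kw->external is a made-up name, not an existing function. The last expression shows the kind of collision mentioned above.

```clojure
(require '[clojure.string :as str])

;; Hypothetical munge-like conversion:
;;   "." -> "_" in the namespace, "-" -> "_" in the name, "/" -> "__"
(defn kw->external [kw]
  (str (str/replace (namespace kw) "." "_")
       "__"
       (str/replace (name kw) "-" "_")))

(kw->external :org.clojureverse.user/first-name)
;; => "org_clojureverse_user__first_name"

;; Underscores already present in the keyword make two keys collide.
(= (kw->external :org.clojureverse.user/first-name)
   (kw->external :org_clojureverse.user/first_name))
;; => true
```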

A hybrid with #3 is probably the sanest. There’s no reason the namespaces of Clojure keywords need to correspond to Clojure’s lib namespaces (typical code files with an ns declaration at the top). I think with the profusion of Spec examples that use the :: syntax, we often get confused about that. :: is just a lazy way of adding a namespace. We should be thinking more about what the public, global, permanent name of a thing should be than whatever file it happens to be in when we write the code.

So, there’s no reason you can’t do this as a namespaced key:

org_clojureverse_user__first_name, which converts to :org_clojureverse_user/first_name.
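
Reading such a name back is then just a split on the double underscore. A sketch, with external->kw as a hypothetical helper, assuming the name contains exactly one "__":

```clojure
(require '[clojure.string :as str])

;; Split at the first "__" into namespace and name.
(defn external->kw [s]
  (let [[ns-part name-part] (str/split s #"__" 2)]
    (keyword ns-part name-part)))

(external->kw "org_clojureverse_user__first_name")
;; => :org_clojureverse_user/first_name
```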

That said, I’d carefully evaluate the systems that will need to use these names. I’d rather use JavaScript’s object["org.clojureverse.user/first-name"] syntax than have to come up with a translation system that can go both ways. It’s not like object.org_clojureverse_user__first_name is any less ugly :slight_smile:

Eric