Don't quite understand rules for namespacing keywords

I’m a Clojure beginner (coming from Python and Rust) trying to figure out the mental model for when and how keywords should be namespaced.

tl;dr

It seems like there are a lot of different concerns to take into account, and since I have no experience in the Clojure ecosystem I’m not sure about the idioms or best practices. So, my questions:

  • When would I use a namespaced keyword instead of a bare keyword?
  • When would I use a namespaced keyword vs. a fully qualified keyword?
  • What forms of keywords should my specs accept?

I recognize that there probably aren’t strict rules or a clear cut answer, but I’d like to hear general thoughts from experienced Clojurists. Which concerns are valid, but don’t come up in practice? What works most of the time, but will cause trouble down the line?

Examples to think about

When I refer to a keyword, I’ve seen essentially three different options: (1) a bare keyword, (2) a keyword namespaced by the entity being modeled, and (3) a keyword with a fully qualified namespace.

:foo
:entity/foo
:com.zmitchell.lib/foo

I can see why you would want to namespace a keyword, the canonical example being two keywords that have the same name, but different semantics (:id in the example below):

(def user {:id 0 :username "zmitchell"})
(def employer {:id 14 :address "The Moon"})

You don’t want to accidentally accept a user id when you’re asking for an employer id. That said, I see bare keywords all over the place in posts, articles, etc, so I’m not sure if they’re actually avoided in practice.

I haven’t done much actual programming with Clojure yet, but I assume that the fully qualified keywords serve a similar purpose e.g. preventing two different users from different namespaces from conflicting with one another.

In one of the examples of the spec guide there is a ::keys, which doesn’t make sense to me.

(defn run-query [service query]
  (let [{::keys [result error]} (invoke-service service {::query query})]
    (or result error)))

Why do/would you need ::keys here instead of :keys?

To make matters even more complicated, it looks like you can have namespaced maps?

user=> (def mymap {::foo 0 ::bar 1})
#'user/mymap
user=> mymap
#:user{:foo 0, :bar 1}

I’ve never seen a map with what looks like a keyword in front.

I’ve also read that namespaced keywords can be an issue when serializing to JSON, writing to a database, etc.

I don’t really have “answers” per se in terms of proscription/prescription, but I can supply some context and my internal practices.

Qualified keywords have been a feature in Clojure since pretty much the beginning (or at least very close to the 1.0 days).

In my observation over about a decade of use and following the community (and doing training with Cognitect on both spec and Datomic), I think they really emerged when Datomic development increased and specs were concurrently brought online. The goal for qualified keywords has always been to provide an extra dimension of information for keys, to reduce the chances of collision in the “information” realm. Instead of the nominal association of :id to a value, they let you be far more specific and define functions that operate exactly on “your” information. In the early days, these came up during library and API design, where they served as a means of packing library conventions into a map with some guarantee that user-facing code wouldn’t accidentally stomp on the library’s conventions for information. So a key like :my.lib/id could co-exist with the potential (perhaps likely) existence of a general :id key, and life could go on. This provides an accretive information model.

Rich Hickey’s talk on Effective Programs discusses these design goals circa slides 28/29, and 35.

We have this namespace qualification. If you follow the conventions, which unfortunately a lot of Clojure libraries are not yet doing, of this reversed domain name system, which is the same as Java’s, all Clojure names are conflict-free not only with other Clojure names, but with Java names. That’s a fantastically good idea, and it’s similar to the idea in RDF of using URIs for names.
And the aliases help to (?) make this less burdensome

When Datomic came on the scene, I recall the namespace-qualified stuff really becoming front-and-center. The database leverages qualified keywords all over the place, including schema definition and the like. It’s yet another way to organize your fact partitions for tuples (datoms) that are stored and queried. This is a nice accretive information model since you can add “similar but different” facts later without really colliding. You can handle a :semester/year alongside a :star-trek/year without any problem and be fine querying across the spectrum. The data for your “situated program” (as Hickey describes) can then go on to flexibly accommodate new information models without disrupting legacy stuff.

Then spec introduced them as an implicit requirement (with ways to spec unqualified things though). This made sense to me since you will likely have a great many specs with similar nominal inputs, but different semantics. Say we are spec’ing a generic map that defines elements of our information model for a person. We could have :name, :age, and :trek-fan as unqualified keys in a simple information model. Defining a spec that works across all maps with these keys (a fairly broad structural type) is then feasible. At the very least, we’d need to name the spec and be able to compose it with others. Using qualified keywords for the spec registry helps ensure a lack of collision when defining multiple specs and recalling them later. The same granular information model can then be applied to the person data we’d like to define in our own little universe, without stomping on more general concepts of maps with keys.
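As a rough sketch of that difference, assuming clojure.spec.alpha is on the classpath (it ships with Clojure 1.9 and later): the same registered key specs can back a map spec that expects qualified keys (:req) or one that expects unqualified keys (:req-un). The spec names :my.ns/person-qualified and :my.ns/person-unqualified are made up for illustration.

```clojure
(require '[clojure.spec.alpha :as s])

;; Specs for individual attributes are registered under qualified keywords.
(s/def :my.ns.person/name string?)
(s/def :my.ns.person/age nat-int?)

;; :req expects the map to contain the qualified keys themselves.
(s/def :my.ns/person-qualified
  (s/keys :req [:my.ns.person/name :my.ns.person/age]))

;; :req-un expects the unqualified versions (:name, :age) in the map,
;; but still validates the values against the qualified specs above.
(s/def :my.ns/person-unqualified
  (s/keys :req-un [:my.ns.person/name :my.ns.person/age]))

(s/valid? :my.ns/person-qualified
          {:my.ns.person/name "Spock" :my.ns.person/age 147}) ;; => true
(s/valid? :my.ns/person-unqualified
          {:name "Spock" :age 147})                           ;; => true
```

So the spec names always need a namespace, while the maps being checked can go either way.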

So we could define qualified keys instead (this is actually the preference in spec) and use :my.ns.person/name, :my.ns.person/age, and :my.ns.person/trek-fan. For convenience there is reader support (and destructuring support) via the namespaced map syntax, which keeps the familiar unqualified keys inside a “local” qualified context:

#:my.ns.person{:name "Spock" :age 147 :trek-fan true}

Notably, we can still easily project this onto an unqualified map if we are willing to accept some information loss, and we can still add similar keys from other information models, like :other.ns/name, to the same map. There’s quite a bit of flexibility while retaining specificity.
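A small sketch of the reader and destructuring support, plus the projection onto an unqualified map. The spock var and the describe and unqualify helpers are invented for illustration. The ::keys form from the spec guide is the same destructuring, with :: standing in for the current namespace.

```clojure
;; Namespaced map literal: every bare key inside gets the prefix.
(def spock #:my.ns.person{:name "Spock" :age 147 :trek-fan true})

;; It is only shorthand for spelling each key out in full.
(= spock {:my.ns.person/name "Spock"
          :my.ns.person/age 147
          :my.ns.person/trek-fan true}) ;; => true

;; Destructuring: :my.ns.person/keys binds the locals `name` and `age`
;; from the keys :my.ns.person/name and :my.ns.person/age.
(defn describe [{:my.ns.person/keys [name age]}]
  (str name " is " age))

;; Projection onto unqualified keys. This loses the namespace, so keys
;; from different namespaces with the same name would collide.
(defn unqualify [m]
  (into {} (map (fn [[k v]] [(keyword (name k)) v])) m))

(describe spock)  ;; => "Spock is 147"
(unqualify spock) ;; => {:name "Spock", :age 147, :trek-fan true}
```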

These days I’m still not using qualified keywords outside of specs and minor stuff. I simply haven’t needed to go beyond unqualified keywords for my information models. I can see the use though, and I do partake when necessary or in the narrow places where they improve my convenience, but I otherwise tend to approach them like I would defn vs. defn- public/private functions: I tend to just expose everything and partition stuff into namespaces a bit more. I’m not saying this is the best way for you, only that it has remained effective and fairly elegant for me for about a decade.

Given the prevalence of things like spec-tools and their data specs, which exist to help unify the spec’ing of qualified and unqualified maps (which spec v1 makes kind of a chore to do), “qualify everything by instinct” is perhaps not a pragmatic or needed default.

When I think about it, if there’s a push to make the primary mode of operation “everything is ‘just data’” or “just maps” paired with functions that transform said data, then the richness and flexibility of what you can express in this mode is proportional to how flexibly you can organize information. Naming (of keys into maps) is then the principal means of organization here, so it needs to be sufficiently robust to allow complex information models that don’t collide.

tl;dr

  • When would I use a namespaced keyword instead of a bare keyword?
    • When you need to avoid collisions and define grouping within a flat layout.
  • When would I use a namespaced keyword vs. a fully qualified keyword?
    • When the risk of collision even for normal namespaced keywords is too high so you need to include the full package name.
    • When you need to encode lineage.
  • What forms of keywords should my specs accept?
    • Unqualified for the majority of cases, unless you have use cases that fall under the two “when would I use a namespaced or fully qualified keyword” answers above.

Details

Most of the time, normal unqualified keywords are good enough.

This is because you’d generally have them in a map, which already separates them by logical grouping.

(def user {:name "John", :email "jj@gmail.com"})

Since you’ve put the keywords inside a map bound to the user var, you know this map is for user data, and you can use it like:

(:name user)
;;=> "John"

So in common usage, you rarely need to namespace the keyword itself, because it will already exist “within” something else that names the group it belongs to.

Another reason is that within your own application, you don’t tend to have clashes in keyword names within the context they are used. For example, if a function names its arguments with keywords, it’s very rare you’d need two arguments to have the same name. Here’s an example:

(defn checkout
  [user cart & {:keys [coupon sale gift?]}]
  ...)

(checkout user cart :coupon "1345" :gift? true)

This adds up to the fact that unqualified keywords are most common.

Now, where namespaced keywords start to make sense is when names can clash or lineage matters.

One example of that is Clojure Spec.

Specs are registered in a global map, and all specs are stored in that map in a flat layout.

;; Simplified view for example purpose
(def specs
  {:my.lib/user ...
   :other.lib/user ...
   :my.lib.user/name ...
   :user/name ...})

Because of that, you can see that the layout is quite prone to name clashes, and it doesn’t have an inherent grouping. That’s why the use of qualified keywords is needed, to avoid collisions and to group the keys together (which the flat map layout doesn’t do).

In this scenario, the chance of collision is so high that even something like :user/name is still at risk, as you could imagine it being quite common. That’s why for specs you might want to actually use keywords qualified with package + namespace name, like :com.foo.lib.user/name and such.

But because typing all that is annoying, syntax sugar was created in the form of ::name, which expands to the namespace of the current file, and ::user/name, which expands the alias user to the full name of the namespace it refers to.

That said, this sugar turned out to be kind of confusing and accidentally coupled the keyword to its actual location in the source code, so if you were to ever move the keyword elsewhere (in a refactoring), your keyword namespaces would break.
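To make the expansion concrete, here is a small sketch to evaluate at a REPL or as top-level forms in a file. clojure.string is used only because it is always available, and the alias name string is an arbitrary choice.

```clojure
(require '[clojure.string :as string])

;; ::name takes its namespace from wherever this code is read,
;; e.g. :user/name at a default REPL. Move the code to another
;; namespace and it becomes a different keyword.
(namespace ::name)

;; ::string/blank resolves the alias `string` at read time.
(= ::string/blank :clojure.string/blank) ;; => true
```

Because the alias has to exist when the form is read, the require must be evaluated before any form that uses ::string/... is read.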

What people really want is to create namespaces for their keywords to prevent collisions, provide lineage, and group related keys in flat layouts. That’s why Clojure 1.11 is most likely going to get a lightweight aliasing system that is decoupled from files.
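For reference, Clojure 1.11 did ship such a mechanism as the :as-alias option of require, which creates an alias without loading (or needing) a matching file. A minimal sketch, assuming Clojure 1.11 or later; the namespace com.foo.app.user and the alias app-user are hypothetical.

```clojure
;; No file for com.foo.app.user has to exist: :as-alias only sets up
;; the alias so that ::app-user/... keywords can be resolved.
(require '[com.foo.app.user :as-alias app-user])

(= ::app-user/name :com.foo.app.user/name) ;; => true

;; Aliases also work with the namespaced map syntax.
(= #::app-user{:name "bar"} {:com.foo.app.user/name "bar"}) ;; => true
```

Note that this is still one alias per group of keys.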

On that topic, let’s discuss lineage. In some applications, keywords go beyond the scope of the app: they are sent to a database, or to some other service through an API, a file, etc. That can also create collisions if, say, you were to store keywords globally between applications somewhere in a flat layout (like Datomic does). But even if you didn’t, you might want to know where the data came from. Sure, this is a “user”, but which kind? Produced by what system? The namespace can be used for that as well: :com.foo.service-a.user/name tells you it’s the user/name from service-a.

Finally, there is grouping in flat or non-collated layouts, for when you have keywords in different places and want to join or group them:

{:user/name "John"
 :user/address "12 Main St"
 :shipping/address "34 Dock Rd"
 :shipping/name "Jane"}

Here the namespace allows you to avoid collision, but also group the name and address related to the user versus the ones related to the shipping.

If you had them separate:

{:user/name "John"
 :user/address "12 Main St"}

{:shipping/address "34 Dock Rd"
 :shipping/name "Jane"}

And wanted to join them, the namespace allows you to do so without collision and with preserving their grouping information even though they will be joined in a flat layout.
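A quick sketch of that join. The sample values and the select-ns helper are made up for illustration.

```clojure
(def user-info     #:user{:name "John" :address "12 Main St"})
(def shipping-info #:shipping{:name "Jane" :address "34 Dock Rd"})

;; With bare keys, the second map silently overwrites the first.
(merge {:name "John"} {:name "Jane"}) ;; => {:name "Jane"}

;; With namespaced keys, all four entries survive in one flat map.
(def order (merge user-info shipping-info))

;; The namespace still records which group each key came from,
;; so a group can be pulled back out later.
(defn select-ns
  "Returns the entries of m whose keys have the namespace ns-str."
  [m ns-str]
  (into {} (filter (fn [[k _]] (= ns-str (namespace k)))) m))

(select-ns order "shipping")
;; => {:shipping/name "Jane", :shipping/address "34 Dock Rd"}
```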

Ok, those are all the use cases I can think of. Now one last thing: because aliasing is cumbersome right now, and because programmers are lazy, typing long namespaces everywhere didn’t really become popular. So even when things leave the application boundary, people don’t always care about collisions if the system is still mostly under their control.

{:service-a.user/name "John"}
{:service-b.user/name "John"}

So a lot of people don’t include the full package name, like the whole :com.foo... part, and only add a small additional prefix.

Another issue with aliases is that they don’t support adding a group on top of the alias.

(require '[com.foo.app :as app])

;; This doesn't work: it doesn't become :com.foo.app.user/name.
;; The reader looks for an alias literally named app.user and fails.
{::app.user/name "bar"}

So you’re having to create an alias for every group, and that again is tedious, so again people tend to minimize the use of long package qualifiers.

Finally, a keyword might be a bad choice of representation for things that leave your app, because keywords might not be friendly to databases, JavaScript front-ends, Java APIs, etc. Outside Clojure it becomes tricky: what do you do with a namespaced keyword? Most of the time keywords are mapped to strings in other languages, and strings don’t have the concept of a namespace, so where do you put the namespace?

Because keywords don’t map well to types in other non-Clojure systems, when a keyword is to leave your app boundary, it might actually be easier to use unqualified keywords.
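If you do need qualified keys to cross the boundary, one possible convention (an assumption here, not something a particular JSON library is claimed to do by default) is to keep the namespace in the string, e.g. "user/name". A sketch with made-up helper names:

```clojure
;; `name` drops the namespace, so it is lossy for qualified keywords.
(name :user/name) ;; => "name"

(defn kw->str
  "Full string form of a keyword without the leading colon."
  [k]
  (subs (str k) 1))

(defn stringify-keys [m]
  (into {} (map (fn [[k v]] [(kw->str k) v])) m))

;; (keyword "user/name") splits on the slash, giving :user/name back.
(defn keywordize-keys [m]
  (into {} (map (fn [[k v]] [(keyword k) v])) m))

(stringify-keys {:user/name "John" :id 1})
;; => {"user/name" "John", "id" 1}
(keywordize-keys {"user/name" "John" "id" 1})
;; => {:user/name "John", :id 1}
```

Whether a slash in a field name is acceptable depends on the receiving system, which is part of why unqualified keys are often easier at the boundary.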

For all these reasons, in the end I think that beyond specs, which have a global collision problem and need a grouping mechanism because of their flat layout, there aren’t many pros to namespaced keywords and there are too many cons, which is why most people still just use unqualified keywords for the most part.
