I wrote an example of replacing multimethods by protocols

Adama · April 13, 2021, 10:36am

Hello Clojurists,

I’m learning Clojure, and the best way to learn is to try it out. I’m slightly confused about Protocols, so I decided to write the following code.

Could you please take a look and tell me whether this is the primary use case for Protocols (apart from performance reasons)? Are there any other use cases?

To explain what I’m doing:

I have a protocol Uri and a record Path (I could have another record Url, since in terms of terminology both Url and Path are subsets of Uri).
A Path can be either plain text (a string) or a data structure (a record) that holds additional details.
Btw, I’m not trying to write a URI library - this is really just for testing and I don’t like examples with cars ;-).

Thank you.

Adam

(defn cut-string [s max-length]
  (if (> (count s) max-length)
    (subs s 0 max-length)
    s))

(defprotocol Uri
  (convert-to-string [this])
  (convert-to-string-with-max-length [this max-length])
  (get-length [this]))

(defrecord Path [value separator relative]
  Uri
  (convert-to-string [this]
    (:value this))
  (convert-to-string-with-max-length [this max-length]
    (cut-string (:value this) max-length))
  (get-length [this]
    (count (:value this))))

(extend-protocol Uri
  String
  (convert-to-string [this]
    this)
  (convert-to-string-with-max-length [this max-length]
    (cut-string this max-length))
  (get-length [this]
    (count this)))

(def some-path (Path. "/some/path", "/", false))

; ####################### using protocols #######################

(convert-to-string some-path)
;; => "/some/path"

(convert-to-string-with-max-length some-path 8)
;; => "/some/pa"

(get-length some-path)
;; => 10

(convert-to-string "different/path")
;; => "different/path"

(convert-to-string-with-max-length "different/path" 5)
;; => "diffe"

(get-length "different/path")
;; => 14

; ####################### using multimethod #######################

(defn get-uri-type [uri & _]
  (cond
    (string? uri) :plain
    (map? uri) :with-details))

(defmulti multimethod-convert-to-string get-uri-type)

(defmethod multimethod-convert-to-string :plain [uri]
  uri)

(defmethod multimethod-convert-to-string :with-details [uri]
  (:value uri))

(defmulti multimethod-convert-to-string-with-max-length get-uri-type)

(defmethod multimethod-convert-to-string-with-max-length :plain [uri max-length]
  (cut-string uri max-length))

(defmethod multimethod-convert-to-string-with-max-length :with-details [uri max-length]
  (cut-string (:value uri) max-length))

(defmulti multimethod-get-length get-uri-type)

(defmethod multimethod-get-length :plain [uri]
  (count uri))

(defmethod multimethod-get-length :with-details [uri]
  (count (:value uri)))

(multimethod-convert-to-string some-path)
;; => "/some/path"

(multimethod-convert-to-string-with-max-length some-path 3)
;; => "/so"

(multimethod-get-length some-path)
;; => 10

(multimethod-convert-to-string "different/path")
;; => "different/path"

(multimethod-convert-to-string-with-max-length "different/path" 11)
;; => "different/p"

(multimethod-get-length "different/path")
;; => 14

Adama · April 14, 2021, 1:40pm

I’m assuming there are no other use cases. Which is cool. I asked the question because it didn’t feel right yesterday, but that’s probably because of all the previous experience I have from OO languages.

And looking at my code today, Protocols look simpler / easier to write than multimethods, so this is another advantage (not just performance) over multimethods for polymorphic functions.

Thanks!

magnus0re · April 14, 2021, 3:31pm

Hi @Adama, welcome to Clojureverse, I hope you like it here! Trying a language out is definitely the way to go if you want to learn.
Some learning sources, if you need them:
For typical comp-sci type exercises I would recommend the hackerrank and 4clojure sites; they have a lot of content. My current favorite book about Clojure is ‘The Joy of Clojure’ (preview on Manning). It provides useful insight into which mechanisms are good to use in specific situations (e.g. records and protocols are briefly covered in chapter 9).

I was just a few weeks ago reading about defrecord myself. I was writing syntactic sugar/wrappers for the java util concurrent queues.

When you refer to the fields of the record, you can use the names directly; it’s not necessary to write (:value this), you can just write value. Quoting the documentation:

the local environment includes only the named fields,
and those fields can be accessed directly.

So my opinion, (which you can of course disagree with), is that it is clearer to write the Path record as below.

(defrecord Path [^String value ^String separator ^Boolean relative]
   Uri
   (convert-to-string [this]
     value)
   (convert-to-string-with-max-length [this max-length]
     (cut-string value max-length))
   (get-length [this]
     (.length value))) ;; or (count value)

Adama · April 15, 2021, 11:41am

Thank you, nice to be here ;-).

I didn’t want to buy this one since I considered it too old. 5 years is like a century in IT. But you changed my mind, thank you.

Disagree? This is sooo much better. Thank you VERY much.

The only thing I still don’t understand is why people don’t use Protocols more often. When browsing GitHub I found several examples where people used multimethods with type-based dispatch. In other words, there was nothing in the code that would prevent them from using Protocols (like, for example, dispatching on a number larger than 1000). Maybe the thinking is that Protocols require Records, which of course they don’t.

Thank you @magnus0re

xfthhxk · April 16, 2021, 1:54am

Hi Adam,

Welcome

Yours is a good question about why in those instances of type based dispatch protocols weren’t used. I can’t comment on the authors’ choices but I can share with you my experience.

I came to Clojure after more than 13 years of Java, and I started using defrecord because it seemed like the closest thing to classes in Java. Soon, though, thanks to blog posts and books, I started using maps, and as long as I’m lucky enough to be using Clojure and data, multimethods seem more natural because I’m most likely dispatching on a key’s value, for example. I think it is because of that tendency, along with less ceremony, that multimethods could be the go-to. Protocols are something I’ve reached for only a handful of times, mostly when forced to interact with Java classes.
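
A minimal sketch of what dispatching on a key’s value can look like. The `:shape` key and the `area` function are made up for illustration and are not taken from anyone’s code in this thread:

```clojure
;; Hypothetical example: the dispatch function is the keyword :shape,
;; so the method is chosen by the value stored under that key.
(defmulti area :shape)

(defmethod area :circle [{:keys [radius]}]
  (* Math/PI radius radius))

(defmethod area :rectangle [{:keys [width height]}]
  (* width height))

;; Fallback for maps whose :shape has no registered method.
(defmethod area :default [m]
  (throw (ex-info "Unknown shape" {:input m})))

(area {:shape :rectangle :width 3 :height 4})
;; => 12
```

No record or type is involved here; plain maps are enough, which is probably part of the “less ceremony” appeal.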

One downside to using records at the REPL is that if you have an instance of the class somewhere and you eval the record definition again, a new class is generated and you’ll get a perplexing ClassCastException with a message like foo.Bar is not an instance of foo.Bar (or something similar). Protocols might not suffer from this as long as you refer to the protocol instead of the generated interface (been a while, so don’t quote me on it).
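
The effect can be reproduced with a small sketch, assuming the forms are evaluated one after another at the top level (as at a REPL). The record name `Bar` is only an illustration:

```clojure
;; Hypothetical record, evaluated twice as happens when reloading at the REPL.
(defrecord Bar [x])
(def old-bar (->Bar 1))   ; instance of the first generated class

(defrecord Bar [x])       ; re-evaluation generates a new class with the same name
(def new-bar (->Bar 1))   ; instance of the second generated class

(instance? Bar new-bar)
;; => true
(instance? Bar old-bar)
;; => false

;; Both classes report the same name, which is why the error reads so strangely.
(= (.getName (class old-bar)) (.getName (class new-bar)))
;; => true
```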

Amar

mjmeintjes · April 16, 2021, 3:35am

I avoid protocols as much as possible, for a few reasons:

The main reason: They interfere with easy code reloading and interactive/REPL-driven development - it is not easy to redefine protocols, which leads to lots of irritating problems.
They don’t work with hash-maps, and they lead to overuse of types/defrecord (defrecord is another thing that I would recommend avoiding as much as possible, mainly because it doesn’t work with namespaced keywords).
They lead to a “closed” Object Orientated code style that I think misses a lot of the benefits of Clojure’s “open” Data Orientated code style.

I agree with @xfthhxk that protocols should be used mostly when working with Java classes, when building specific types of libraries, or when the performance impact of multimethods is important.

But, this is just my opinion, and I’m not sure whether it is the general consensus.

joinr · April 16, 2021, 3:47am

(defprotocol IBlah
  (blah [this]))

(extend-protocol IBlah
  clojure.lang.PersistentArrayMap
  (blah [this] (:blah this))
  clojure.lang.PersistentHashMap
  (blah [this] (:blah this)))

(blah {:blah "blee"})

user> (blah {:blah "blee"})
"blee"

namespaced keywords are overrated; just my opinion though.

joinr · April 16, 2021, 4:01am

Yeah, this is interesting. Especially since that’s exactly the use-case for protocols, and one which they are optimized for. Multimethods are very granular as well…if you want to bundle common functions together, instead of implementing a protocol with its associated functions, you now have to implement independent little grains. More flexibility though.

I’m not a big fan of the multimethod tax either…

(require '[criterium.core :as c])

(defprotocol IBlah
  (blah [this]))

(extend-protocol IBlah
  clojure.lang.PersistentArrayMap
  (blah [this] (this :blah ))
  clojure.lang.PersistentHashMap
  (blah [this] (this :blah)))

(defmulti  multi-blah (fn [coll] (type coll)))

(defmethod multi-blah
  clojure.lang.PersistentArrayMap [coll] (coll :blah))

(defrecord blahrec [x] IBlah (blah [this] x)) 

user> (let [b {:blah "blee"}] (c/quick-bench (multi-blah b)))
Evaluation count : 11926704 in 6 samples of 1987784 calls.
Execution time mean : 51.145613 ns
Execution time std-deviation : 1.438590 ns
Execution time lower quantile : 49.184024 ns ( 2.5%)
Execution time upper quantile : 52.368468 ns (97.5%)
Overhead used : 1.797220 ns
nil
user> (let [b {:blah "blee"}] (c/quick-bench (blah b)))
Evaluation count : 60342360 in 6 samples of 10057060 calls.
Execution time mean : 8.516917 ns
Execution time std-deviation : 0.203639 ns
Execution time lower quantile : 8.292597 ns ( 2.5%)
Execution time upper quantile : 8.702191 ns (97.5%)
Overhead used : 1.797220 ns
nil

user> (let [b (->blahrec "blee")] (c/quick-bench (blah b)))
Evaluation count : 106328184 in 6 samples of 17721364 calls.
Execution time mean : 3.824386 ns
Execution time std-deviation : 0.066555 ns
Execution time lower quantile : 3.732416 ns ( 2.5%)
Execution time upper quantile : 3.894830 ns (97.5%)
Overhead used : 1.797220 ns

There are some libraries like faster-multimethods that aim to reduce the dispatch cost though.

mjmeintjes · April 16, 2021, 6:05am

That’s interesting. To me namespaced keywords are one of the most important parts of how I structure our application. I very rarely use un-namespaced keywords.

joinr · April 16, 2021, 7:12am

I am pretty much on the opposite end over about a decade. Absent specific/mandated use cases like spec, datomic, or other libraries, they’ve never been appealing or even necessary for my work.

bsless · April 16, 2021, 10:17am

Another library worth mentioning with regards to multimethods is methodical which also claims to offer better performance.
In terms of use cases, I found I often reach for multimethods when writing parsers. Any other use case you found where they shine?

joinr · April 16, 2021, 11:27am

[methodical]…which also claims to offer better performance.

I noticed that the library I linked actually gets down to within protocol performance, substantially faster than legacy multimethods. The goal is different though, not to implement CLOS-like method combinations (I’m not a huge fan of them either), but to get multis fast so they are more widely used. It’s more toward the approach Julia took where everything is a generic function with possible hardcore type specialization and slick performance.

Any other use case you found where they shine?

Parsers are a familiar one. A lot of times projecting a sort of pseudo type system on top of semi-structured data, with a lot of variation in possible dispatch “types” indicates multimethods are the path of least resistance (and simplest extension). Also compiler/emitters (I think these would be…ugh…visitor patterns) are similarly flexible as multimethods (core.tools.analyzer goes this route). When you’re generally munging a lot of maps and dispatching based on contents (types do you no good), multis make sense - or even just simple functions with closed dispatch. I think some of the more sophisticated coercions are interesting, where you naturally have compound dispatch values going from type to type (like the stuff in clojure.java.io with its reader conversions).
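
A minimal sketch of a compound dispatch value for coercions, going from one type to another. The `convert` function and its methods are hypothetical and only illustrate the idea; this is not how any particular library implements its conversions:

```clojure
;; Hypothetical coercion function dispatching on a [from-class to-class] pair.
(defmulti convert (fn [x target] [(class x) target]))

(defmethod convert [String Long] [s _]
  (Long/parseLong s))

(defmethod convert [Long String] [n _]
  (str n))

(defmethod convert [String clojure.lang.Keyword] [s _]
  (keyword s))

(convert "42" Long)
;; => 42
(convert 42 String)
;; => "42"
(convert "blah" clojure.lang.Keyword)
;; => :blah
```

Adding another conversion is one more `defmethod` with a new pair; nothing existing has to change.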

I just did a scan through my repos, including some big 3rd party libs, and can pretty much count the number of defmultis on two (maybe six lol) hands. Maybe it’s more popular on the webdev side of the clojureverse, where I’m more of a tourist.

Adama · April 16, 2021, 12:38pm

In other words, you’re most likely forced to use multimethods. I also realized that if I use type-based dispatch that can be replaced by a Protocol, there is a very good chance that a day later I’ll need to add something like “number greater than X” or “a key is present”, and I’ll need to completely rewrite the Protocol as a Multimethod. So the time saved by writing a Protocol (less ceremony) will be wasted on a large code change.

Yeah, I need to stop thinking like an OO programmer. It’s difficult to make things easy ;-).
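
A minimal sketch of the kind of dispatch a Protocol cannot express, because it depends on the contents of the value rather than its type. The `classify-order` function, the `:amount` and `:discount` keys, and the 1000 threshold are all made up for illustration:

```clojure
;; Hypothetical dispatch function that inspects the data, not only its type.
(defn classify-order [order]
  (cond
    (not (map? order))          :invalid
    (contains? order :discount) :discounted   ; "a key is present"
    (> (:amount order 0) 1000)  :large        ; "number greater than"
    :else                       :regular))

(defmulti order-label classify-order)

(defmethod order-label :invalid    [_] "not an order")
(defmethod order-label :discounted [o] (str "discounted by " (:discount o)))
(defmethod order-label :large      [_] "large order")
(defmethod order-label :regular    [_] "regular order")

(order-label {:amount 1500})
;; => "large order"
(order-label {:amount 20 :discount 5})
;; => "discounted by 5"
(order-label {:amount 20})
;; => "regular order"
```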

This is actually a point for Multimethod since with multimethod you don’t have to independently deal with ArrayMap and HashMap. Based on my reading Clojure is about data “shape” instead of data “type” and maps surely have the same shape.

Thank you @xfthhxk , @joinr and @mjmeintjes

Adama · April 16, 2021, 12:55pm

And could you please scan for Protocols?
Are they used more than multis, or are both multis and Protocols simply not in use, with people using separate functions and arities instead?

joinr · April 16, 2021, 3:05pm

This is actually a point for Multimethod since with multimethod you don’t have to independently deal with ArrayMap and HashMap.

Meh. It’s trivial enough to cover down on those cases. In library code, it’s really not a big deal IME. If it were, I would macro-it and call it a day, but even with my extreme laziness I haven’t gone down that route. You still have to call your shots - in a sense - with multimethod dispatches and ensure you’re mapping on the type. If you do - as many examples I’ve seen do - the naive type or class dispatch, you’ll still have 2 dispatch values to define (or some hierarchy definitions). Shifting towards a custom function that detects a map implementation is useful though; I would not consider that a particular killer feature though.
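
One possible way to cover both map classes in a single definition is to target the `clojure.lang.IPersistentMap` interface instead of the concrete classes. The sketch below is only an illustration; the names `IBlah2`, `blah2` and `multi-blah2` are hypothetical and chosen to avoid clashing with the earlier examples:

```clojure
;; Hypothetical protocol extended to the map interface rather than to each
;; concrete map class.
(defprotocol IBlah2
  (blah2 [this]))

(extend-protocol IBlah2
  clojure.lang.IPersistentMap
  (blah2 [this] (:blah this)))

;; Multimethod dispatching on class. Dispatch values are matched with isa?,
;; which follows Java inheritance, so one method registered on the interface
;; matches both concrete classes.
(defmulti multi-blah2 class)
(defmethod multi-blah2 clojure.lang.IPersistentMap [m] (:blah m))

(def array-map-instance {:blah "blee"})            ; a PersistentArrayMap
(def hash-map-instance  (hash-map :blah "blee"))   ; a PersistentHashMap

(blah2 array-map-instance)       ;; => "blee"
(blah2 hash-map-instance)        ;; => "blee"
(multi-blah2 array-map-instance) ;; => "blee"
(multi-blah2 hash-map-instance)  ;; => "blee"
```

Dispatch through an interface may not be as fast as the direct class extensions benchmarked earlier in the thread, so this trades some speed for less repetition.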

If it was enough of a problem I’d probably generate the code, but it’s really not, especially at the library level. Protocols get points back for usage with reify, proxy, deftype, defrecord, and friends, as well as being inlined method definitions (vs. the stock extensions), so additional performance and ability to include them in type-like stuff (e.g. extends?). As a minor feature, they provide Java/CLR interfaces, so for potential interop purposes, maybe that’s useful (e.g. leveraging clojure from outside), although I haven’t leveraged it ever.
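
A short sketch of the reify and `extends?` / `satisfies?` points above. The `Greeter` protocol is hypothetical:

```clojure
(defprotocol Greeter
  (greet [this]))

;; reify creates an anonymous object that implements the protocol inline.
(def anonymous-greeter
  (reify Greeter
    (greet [_] "hello from reify")))

;; extend-protocol adds an implementation to an existing type after the fact.
(extend-protocol Greeter
  String
  (greet [s] (str "hello, " s)))

(greet anonymous-greeter)               ;; => "hello from reify"
(greet "world")                         ;; => "hello, world"

;; Protocols can be queried like types.
(satisfies? Greeter anonymous-greeter)  ;; => true
(extends? Greeter String)               ;; => true
(satisfies? Greeter 42)                 ;; => false
```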

They interfere with easy code reloading and interactive/REPL-driven development - it is not easy to redefine the protocols, and leads to lots of irritating problems.

I forgot to respond to this one; I’ve had plenty of times when multimethods crapped out at the repl and refused to update with new definitions. Used to irritate me quite a bit in the early days, where libraries popped up before protocols were a thing (like early incanter designs) and basically acted as pseudo protocols (type-based dispatch). I learned that you can nuke the multimethod (assuming you control the ns) by unmapping its def and then reloading things as a last resort. I last ran into this problem sometime in early 2020, although it’s been sporadically “there” since I started messing with Clojure circa 2011.
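
The “unmap and reload” approach can be sketched as follows, assuming the forms are evaluated one by one at the top level of the namespace that owns the multimethod. The `describe` multimethod is hypothetical:

```clojure
;; Hypothetical multimethod whose dispatch function we want to change.
(defmulti describe :kind)
(defmethod describe :cat [_] "a cat")

;; Re-evaluating defmulti with a different dispatch function does nothing
;; while the var already holds a multimethod, so :kind is still in effect.
(defmulti describe :species)
(describe {:kind :cat})
;; => "a cat"

;; Last resort: unmap the var, then define everything again.
(ns-unmap *ns* 'describe)
(defmulti describe :species)
(defmethod describe :cat [_] "a cat, by species")

(describe {:species :cat})
;; => "a cat, by species"
```

If only stale methods are the problem and the dispatch function is fine, `remove-method` or `remove-all-methods` is a less drastic option.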

The mentioned protocol messiness can also be worked around by keeping them in a separate ns. Runtime definitions of things that implement said protocols will be subject to any changes, so if you pack them in a separate ns that’s not being reloaded (e.g. it’s not colocated with business logic and implementations or even records), getting into weird repl states where runtime data doesn’t support the “current” protocol is less likely. I view it mostly as a usage/design issue; it’s minor to me.

And could you please scan for Protocols?

I revise my earlier estimate due to ocular examination instead of leveraging the computer. Here’s some usage data.

I globbed it into a lame bar chart

This is obviously what I had on-hand (including clojure, clojurescript, and some other 3rd party libs as well as my own stuff). It only looks at invocations, so no idea what the usage (defmethod, or protocol inlining, reify, or extension) looks like. I also don’t have visibility on simple functions that perform similar dispatching (there are plenty of closed functions that do similar stuff), but perhaps we don’t care about those too much anyway.

In the aggregate, I end up with

trend	count
even	14
multi-dominant	44
proto-dominant	100

Where “dominant” is the simple majority. There are likely less coarse classifiers or distance metrics, but this is at least simple.
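
The actual scan is not shown in the thread, so the following is only a guess at its general shape: a sketch that counts `defmulti` and `defprotocol` forms in a piece of source text and labels the result by simple majority. The function names, the regular expressions, and the sample text are all assumptions:

```clojure
;; Hypothetical counter: how many defmulti / defprotocol forms appear in a
;; string of source code. A real scan would slurp files instead.
(defn count-definitions [source-text]
  {:multis (count (re-seq #"\(defmulti\s" source-text))
   :protos (count (re-seq #"\(defprotocol\s" source-text))})

;; Simple-majority classifier using the same three labels as the table.
(defn trend [{:keys [multis protos]}]
  (cond
    (> multis protos) :multi-dominant
    (> protos multis) :proto-dominant
    :else             :even))

(def sample-source
  "(defprotocol Uri (get-length [this]))
   (defprotocol IBlah (blah [this]))
   (defmulti multi-blah type)")

(count-definitions sample-source)
;; => {:multis 1, :protos 2}
(trend (count-definitions sample-source))
;; => :proto-dominant

;; Aggregating over several (made-up) per-project counts:
(frequencies (map trend [{:multis 3 :protos 3}
                         {:multis 0 :protos 5}
                         {:multis 2 :protos 1}]))
;; => {:even 1, :proto-dominant 1, :multi-dominant 1}
```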

Adama · April 16, 2021, 5:49pm

And it’s well visible so in case you change it you won’t forget to change the other line.

Super cool ;-). I thought you’d just run grep -rI ... | wc -l and give us a result. This is much cooler and it’s really telling. Thank you.

joinr · April 16, 2021, 6:09pm

I hadn’t really thought about it, but I wonder what a meta study of github repos would show w.r.t. protocol/multifn usage. Curious what it would say about problem domains, or even just preference. I am alas too occupied to perform such a study.

madbonkey · April 16, 2021, 9:41pm

I love multis for web dev. As you said, a proliferation of “type”-y dispatch, but so many concerns that types would be way more cumbersome than useful. Routing, access control, formats, representations, tons of tiny-but-named things. It’s not so much that you need a lot of them (for me, two hands would be plenty), or even at all probably, but they can often give a lot of mileage. You can easily “grow” any function into a multi if needed, and go from there like you could’ve with any function in case even more sophistication is needed. I’m just a big fan all around.

PEZ · April 17, 2021, 6:48am

I think @sogaiu might have a corpus of Clojure projects that might be mined. No idea how much work it would be though.

PEZ · April 17, 2021, 6:58am

In the Clojure Users group at LinkedIn I just found this paper by Stuart Sierra about the problem that protocols solve, the Expression Problem.

Not sure how relevant for the particular OP question, but at least a bit, I think.
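
For readers who have not met the term: the Expression Problem is, roughly, about adding new operations to existing types and new types to existing operations without modifying code you already have. A minimal sketch of both directions with a protocol follows; the `Sized` protocol and the `Playlist` record are hypothetical:

```clojure
;; A new operation (hypothetical protocol) added to existing types that we
;; do not own and cannot edit.
(defprotocol Sized
  (size [this]))

(extend-protocol Sized
  String
  (size [s] (count s))
  clojure.lang.IPersistentVector
  (size [v] (count v)))

;; A new type (hypothetical record) added later that joins the existing
;; operation without touching the code above.
(defrecord Playlist [tracks]
  Sized
  (size [_] (count tracks)))

(size "abc")                              ;; => 3
(size [1 2])                              ;; => 2
(size (->Playlist ["a" "b" "c" "d"]))     ;; => 4
```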