Multimethods on records and integers

Jim_Newton · November 4, 2019, 2:39pm

I have an application which I originally implemented in Common Lisp, and recently re-implemented in Scala. I’d like to try to re-re-implement this in Clojure. This will be my first Clojure program.

Most of the parts of this program will be straightforward to implement. However, there are a few things which worry me as I don’t yet know enough about Clojure to envision an implementation. Before I start, I have some questions to see whether my strategy will work.

I plan to define a record call Bdd which has three fields: an Integer, and two Bdd objects. My application will allocate many 100s of thousands of such objects and manipulate them using an algebra which I will define.

Can I define a print-object or toString method to control how the system prints a Bdd instance? (of course I don’t care that it be a method, just some way to control how my object is serialised for printing in the REPL etc)
Can I define a multi-method which accepts two arguments? then define 4 methods for each of (integer,integer), (integer,Bdd), (Bdd,integer), (Bdd,Bdd) ? I read about multi-methods a bit, but it is not clear how to make them work for Integers and records.
There are exactly two distinguished Bdd objects, BddTrue and BddFalse, which do not have any fields, Normally I’d define these as an algebraic data type (which I did in Scala and in Common Lisp). What’s the approach in Clojure? In Scala and Common Lisp these distinguished BddTrue and BddFalse objects satisfy an is-a relationship. BddFalse is-a Bdd and BddTrue is-a Bdd.
I never want to compare two Bdd objects by their content, only by their identity. They should only be considered equal if they are the exact same object? And I will enforce in the factory function that never are to Bdd instances ever created which have the same content. Do I need to always remember to use identical? or can I somehow tell the system what equal means on instances of this record?
I need a weak-value hash table to store the objects, to assure that never do two get allocated with the same content. In Scala I’m using the java class org.jboss.util.collection.WeakValueHashMap. In The CL implementation I used the SBCL weak-hash-table which just did what I wanted automatically. What’s the correct way to do this in Clojure?

Thanks to anyone who can offer some advise on some/any of these points before I begin.

didibus · November 4, 2019, 4:13pm

I’d say your overall description doesn’t sound very idiomatic for Clojure. Where we tend to prefer using data directly, instead of wrapping things in custom data types.

But, since I don’t know the context of this Bdd, I can’t really speak too if in your case it really is the right approach or not. I would recommend reading this: https://clojure.org/about/state to get a feel for it.

You can extend print-method
You would just make use of the type fn inside your dispatch fn for each arguments.
You can have hierarchies on multi-methods, see here: https://clojure.org/reference/multimethods .
For that, you will want to use deftype. Read up on deftype and defprotocols here: https://clojure.org/reference/datatypes
You can just use that same jboss map.

Jim_Newton · November 4, 2019, 4:31pm

Interesting, so what is the idiomatic way to define objects such as graphs or recursive data structures? For an easy example, a binary tree is made of of nodes which have some data and either exactly two children or no children. Internal nodes have two children which themselves are nodes having the same structure, and at the bottom of the tree there are leaf-nodes which have no children. When I read about structs and records in the closure specification it sounds like pretty close to the right tool.

Webdev_Tory · November 4, 2019, 10:13pm

In the case of a binary search tree, I thought this a decent implementation using records, as you pointed out: https://eddmann.com/posts/binary-search-trees-in-clojure/ .

plexus · November 5, 2019, 8:24am

The idiomatic thing would be to start with just vectors and maps.

{:data "..."
 :children [{:data "...", :children [,,,]}]}

You can still have polymorphism over these with multimethods, e.g. based on a :type key, or on the fact that a node has children or not, or on some metadata you add. You can still control the printing by adding :type metadata and extending print-method.

You can move on to defrecord and defprotocol from there. The polymorphism is more limited (you can only dispatch on the type), but it’s faster (because it uses JVM class based dispatching), and you can extend your protocols to other types, including built-in and third party types.

If you want full control you can drop down to deftype. This basically creates a new class from scratch. In this case you’re moving out of the realm of Clojure persistent data structures, so your custom type will not work with the functions in clojure.core, unless you take care to implement all the necessary interfaces yourself. This would allow you to define your own equality semantics (through Object#equals), although if you intentionally are comparing by identity then I would use identical? in your code, to make that clear.

Update: loom is a generic graph library, you might want to look there for inspiration. They use records and protocols.

Jim_Newton · November 5, 2019, 10:30am

Thanks everyone for the kind responses. When I start looking at this in anger, I’m sure I’ll have more questions, and I’m also sure that many of the questions will just answer themselves.

didibus · November 5, 2019, 10:30pm

Have you read over: Clojure - Values and Change: Clojure’s approach to Identity and State ? It covers that a little bit.

Basically, Clojure distinguishes between language constructs, and domain constructs. If you just need to model domain data in a particular way, you should try and use existing language constructs, meaning, the existing data-structures, the ones in core mostly: PersistentList, PersistentMap, PersistentVector, PersistentSet, PersistentQueue, etc. In the case they don’t offer the operational properties you require (performance, memory, locking, etc.) you can use those from Java, and if that’s not good enough, and there’s no existing library for it, then you can build your own using deftype and protocols, but like others said, that’s a lot of work, and won’t naturally work with existing functions, unless you implement the protocols for them as well.

I know loom and ubergraph are both pretty good and complete graph data structures if you needed that.

But, like I said, maybe you do have a situation where you need a custom structure to get the operational properties you’re looking for. So for that, you might need to get your hands dirty with deftype.

But say you just needed hierarchical data representation, you’d just nest maps and vectors:

{:dogs [{:breed :labrador :name "Shadow"}
        {:breed :corgi :name "Eliot"}]
 :people [{:name "Jonny"} {:name "Bobby"}]}

And if you needed tree like ownership, you can use hierarchical representation:

{:people [{:name "Jonny"
           :owns [{:breed :labrador :name "Shadow"}]}
          {:name "Bobby"
           :owns [{:breed :corgi :name "Eliot"}]}]}

Or a more graph/relational based one:

{:dogs [{:breed :labrador :name "Shadow"}
        {:breed :corgi :name "Eliot"}]
 :people [{:name "Jonny"} {:name "Bobby"}
 :ownership {"Jonny" ["Shadow"]
             "Bobby" ["Eliot"]}

And now you would model changes using Atoms, Refs or Agents. For example, with Atoms, say ownership can change over time:

{:dogs [{:breed :labrador :name "Shadow"}
        {:breed :corgi :name "Eliot"}]
 :people [{:name "Jonny"} {:name "Bobby"}
 :ownership (atom {"Jonny" ["Shadow"]
                   "Bobby" ["Eliot"])}

Now you can modify ownership relations:

(update data :ownership dissoc "Bobby")

And build functions over that such as: add-owner, remove-owner, etc.

Often it ends up simpler to just wrap the world over time, aka, the whole data, so you could just have:

(def data
  (atom
    {:dogs [{:breed :labrador :name "Shadow"}
            {:breed :corgi :name "Eliot"}]
     :people [{:name "Jonny"} {:name "Bobby"}
     :ownership {"Jonny" ["Shadow"]
                 "Bobby" ["Eliot"])}))

Now you get lock free concurrency on the whole data, and you can perform atomic changes, so if say you want a kill-dog fn, you can have it remove the dog from :dogs and remove all associated :ownership relations.

And you can start to generalize this a bit.

(def data
  (atom
    {:nodes {1 {:type :dog :name "Eliot" :breed :corgi}
             2 {:type :person :name "Bobby"}
             3 {:type :person :name "John"}
             4 {:type :dog :name "Shadow" :breed :labrador}
     :edges {2 [[:owns 1]
             3 [[:owns 4]]}}

So now you have something more akin to a directed graph with named edges which you can atomically update. Obviously, the more general you get here, the more you might just want to lean for an in-memory DB at some point, like DataScript, or some full fledged graph data-structure like ubergraph. It’ll all depend on the types of access and modification patterns that you require.

But, if this doesn’t have the operational characteristics you need, like not performant enough, too memory hungry, etc. Well, you are falling off Clojure’s ideal path, and you’ll need to resort to lower level stuff like I said before which is either deftype and protocols, or when you really need the utmost performance, java.

system · May 6, 2020, 10:30am

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.