Declarative data manipulation (or DSLs) in Clojure

#1

Hey, everyone! Hopefully I’m posting this question in the right place.

I have a project that allows users to declaratively create “rules” on data (a hashmap): basically a set of predicates and actions.
E.g.: when the value of field :summ-1 (of the hashmap) is less than the value of :summ-2, assoc a new field :result with the value (+ :summ-1 :summ-2). Like formulas, with predicates, calculations and references to fields, à la Excel spreadsheets.
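To make the example concrete, here is a minimal sketch of how such a rule could be written as plain data and interpreted. The rule shape (`:when`/`:then`), the operator table `ops`, and the functions `evaluate` and `apply-rule` are all hypothetical names invented for illustration, not part of any library.

```clojure
;; Hypothetical rule format: predicate and action are both plain data.
(def rule
  {:when [:< :summ-1 :summ-2]
   :then [:assoc :result [:+ :summ-1 :summ-2]]})

;; Whitelist of allowed operators; anything else is rejected.
(def ops {:< < :> > := = :+ + :- - :* *})

(defn evaluate
  "Evaluates expr against map m. Keywords are field references,
  vectors are operator calls, anything else is a literal."
  [m expr]
  (cond
    (keyword? expr) (get m expr)
    (vector? expr)  (let [[op & args] expr
                          f (or (ops op)
                                (throw (ex-info "Unknown operator" {:op op})))]
                      (apply f (map #(evaluate m %) args)))
    :else expr))

(defn apply-rule
  "Assocs the computed field when the predicate holds, else returns m unchanged."
  [m {[_ field expr] :then, pred :when}]
  (if (evaluate m pred)
    (assoc m field (evaluate m expr))
    m))

(apply-rule {:summ-1 1 :summ-2 2} rule)
;; => {:summ-1 1, :summ-2 2, :result 3}
```

Because the rule is just data, it can be stored in a file or database and changed without recompiling. The cost is that every new operator still has to be added to `ops` by hand, and a missing field would make `<` throw on `nil`.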

I’ve started implementing these “rules” as multimethods, writing individual code for each possible predicate and action that a user might need. But this approach has a big downside: future extensions require modifying the source code, recompiling, etc.

I wonder if I can extend the functionality and add my hashmap predicates and hashmap transformers at runtime, preferably in a declarative manner (as data)? Perhaps I can use some suitable library/DSL? Query languages, DMLs? Logic programming?
I want to leverage something (powerful) instead of writing my own library of functions.

#2

That’s an interesting one. I’m afraid that a purely declarative solution will be tricky. It could be possible to encode your predicates and actions as data, but you might end up employing or implementing an entire DSL. Be it SQL, Datalog, Excel macros, or XSLT. Or even snippets of JavaScript running in the same GraalVM.

If the data-driven approach is negotiable, there are some projects which might serve as inspiration: Clara Rules and Specter.

I’m curious what you’ll end up building!

#3

Thanks, will definitely check out both Clara Rules and Specter!
If not purely declarative, what other approach is viable, in your opinion? I’ve thought of saving Clojure code in separate files, then requiring and eval-ing them at runtime, but that seems clunky and not so elegant.

#4

Have you considered putting .edn files on your classpath, and reading them with clojure.java.io/resource? That way they work both when you develop your application, and if you package a jar, without having to handle separate files.
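A minimal sketch of that idea, assuming a hypothetical file `rules.edn` under your resources directory. `load-rules` is a made-up helper name; `clojure.edn/read-string` is used instead of `clojure.core/read-string` so that reading the file never evaluates code.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.java.io :as io])

(defn load-rules
  "Reads EDN from a classpath resource. Returns nil if the resource is missing."
  [resource-name]
  (some-> (io/resource resource-name)
          slurp
          edn/read-string))

;; "rules.edn" is a hypothetical file name, e.g. resources/rules.edn
;; (load-rules "rules.edn")
```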

Approach stolen from Integrant, which is well worth checking out if you’re curious about data-driven design.

#5

This shouldn’t be the case. You should be able to dynamically add defmethods to an existing defmulti, at least I think so.
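A small sketch of what that looks like. The multimethod `apply-action` and the action shapes are hypothetical; the point is only that a `defmethod` can be evaluated later (from a REPL or a namespace loaded at runtime) without touching the original `defmulti`.

```clojure
(defmulti apply-action
  "Dispatches on the :type key of an action map."
  (fn [action _m] (:type action)))

(defmethod apply-action :sum [{:keys [target fields]} m]
  ;; Store the sum of the given fields under target.
  (assoc m target (apply + (map m fields))))

;; Later, e.g. from a REPL or a namespace loaded at runtime:
(defmethod apply-action :remove [{:keys [target]} m]
  (dissoc m target))

(apply-action {:type :sum :target :c :fields [:a :b]} {:a 1 :b 2})
;; => {:a 1, :b 2, :c 3}
```

This still means writing Clojure code for each new action; it only removes the need to edit and recompile the original namespace.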

#6

The question is: what do I put in my edn files? If I put my own DSL there, I need to implement it somehow, but if I can put in something universal and declarative like Datalog queries, then I get to leverage an existing query engine and language.

#7

I think that’s two questions:

  1. Conceptually, what makes a good API design in this case?
  2. Technically, what are good solutions to hook a data-declared system into an implementation?

I’m not attempting to answer 1. DSL design is hard, for the reasons you mentioned. Perhaps don’t choose at all, but support multiple interpreters. Dunno.

But as for 2, Integrant and Duct (Integrant specialized for web applications) give a nice answer: map namespaced keys in your data to implementation in code. If I have the following Duct system declaration,

{:duct.profile/base
 {:duct.core/project-ns th.example2

  :duct.router/cascading
  [#ig/ref :th.example2.handler/example]

  :th.example2.handler/example
  {}}

 :duct.profile/dev   #duct/include "dev"
 :duct.profile/local #duct/include "local"
 :duct.profile/prod  {}

 :duct.module/logging {}
 :duct.module.web/api {}}

the #ig/ref :th.example2.handler/example maps into the source code at th.example2.handler.example:

(ns th.example2.handler.example
  (:require [compojure.core :refer :all]
            [integrant.core :as ig]))

(defmethod ig/init-key :th.example2.handler/example [_ options]
  (context "/example" []
    (GET "/" []
      {:body {:example "data"}})))

I felt that you were asking about this in your original post. Did I misunderstand?

Teodor

#8

Thanks for the elaborate answer, I’m actually familiar with Integrant. The thing is, I still need to implement those defmethod functions, which is what I’m trying to avoid.

So, I was asking about question #1 :smiley:

#9

As has been said, generic advice for “good design” is very hard to give. Remember that you will have to put code that actually does what you want somewhere, and deliberating endlessly upfront about how different approaches might evolve won’t get you very far. That’s not to say logic should be sprayed everywhere carelessly, of course, but the problem of premature optimization is not limited to performance.

Clojure’s true power doesn’t really come from having very little actual code; it derives from its future-oriented design: the language lets you express your logic in ways that allow you to continuously evolve your codebase. Clojure codebases are very amenable to refactoring/restructuring/rewriting/iterating (whatever you want to call it), and I’ve found even mid-sized applications are easily supportable by me alone over years (something that would cost me so much time and effort in other tech stacks that it might easily become infeasible for a freelancer).

Long story short, decide on a way, see how it develops (heh) and adjust along the way - it’s the perfect language to play around with little worry! :v:

Edit: Regarding DSLs, to me personally that concept has always been a totally arbitrary line in the sand. Depending on how people understand/define DSLs, anything developers write might or might not be a DSL. I think everybody can agree that there will be domain-specific code in any application that deals with a problem domain at all, and whether you call that part of the codebase a DSL matters little, I think. In Clojure there are multiple ways to express domain-specific logic in this “future-proof” mindset, and how much you make this a self-contained “language” is up to your use case. I’d only recommend not straying too far into custom-solution land and away from leveraging language features.

#10

This is one of those things where the easy stuff is easy but the difficulty ramps up fairly quickly. One of the problems is the order of operations (what if multiple rules match? In what order do they fire? What if the fn attached to one changes a value in a way that causes/prevents another rule from firing?). Logic programming systems (e.g. Prolog, core.logic, etc.) and rules engines (Clara) have well-researched algorithms for specifying these kinds of things.
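To illustrate the ordering problem, here is a deliberately naive sketch (not how Clara or core.logic work): rules are plain functions from map to map, fired in vector order and repeated until the map stops changing. `run-rules`, `example-rules`, and the `max-passes` cap are hypothetical names chosen for this example.

```clojure
(defn run-rules
  "Applies rules in vector order, repeating until the map stops changing.
  Throws if it has not settled after max-passes passes."
  [rules m max-passes]
  (loop [m m, pass 0]
    (let [m' (reduce (fn [acc rule] (rule acc)) m rules)]
      (cond
        (= m' m) m
        (>= (inc pass) max-passes)
        (throw (ex-info "Rules did not settle" {:state m'}))
        :else (recur m' (inc pass))))))

(def example-rules
  [;; Fires when :a < :b and adds :result.
   (fn [m] (if (< (:a m) (:b m)) (assoc m :result (+ (:a m) (:b m))) m))
   ;; Can only fire after the first rule has produced :result.
   (fn [m] (if (and (:result m) (not (:flag m))) (assoc m :flag true) m))])

(run-rules example-rules {:a 1 :b 2} 10)
;; => {:a 1, :b 2, :result 3, :flag true}
```

Even in this toy version you have to decide on firing order and guard against rules that keep triggering each other forever, which is the kind of problem a real rules engine handles for you.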

Another way of approaching the problem that might be amenable to your use case is to do it like spreadsheets do. They typically implement a dataflow architecture that makes it a bit easier for end users to work out how the dependency graph is structured and to guess what’s going to happen at runtime. You can look at this paper for an example of an implementation of this sort of system in CLJS.
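A rough sketch of the spreadsheet idea, under my own assumptions: each cell is either a constant `:value` or a `:formula` over named `:deps`, and a cell is computed only after its dependencies. The names `cells`, `compute`, and `evaluate-sheet` are invented for illustration and are not taken from the paper mentioned above.

```clojure
(def cells
  {:summ-1 {:value 10}
   :summ-2 {:value 32}
   :result {:deps [:summ-1 :summ-2] :formula +}
   :double {:deps [:result] :formula #(* 2 %)}})

(defn compute
  "Returns [cache value] for cell k, computing its dependencies first.
  visiting is the set of cells currently being computed; throws on a cycle."
  [cells cache visiting k]
  (cond
    (contains? cache k) [cache (cache k)]
    (visiting k) (throw (ex-info "Cycle detected" {:cell k}))
    :else
    (let [{:keys [value deps formula]} (cells k)]
      (if formula
        (let [[cache args] (reduce (fn [[c args] d]
                                     (let [[c v] (compute cells c (conj visiting k) d)]
                                       [c (conj args v)]))
                                   [cache []]
                                   deps)
              v (apply formula args)]
          [(assoc cache k v) v])
        [(assoc cache k value) value]))))

(defn evaluate-sheet
  "Computes every cell and returns a map of cell name to value."
  [cells]
  (first (reduce (fn [[cache _] k] (compute cells cache #{} k))
                 [{} nil]
                 (keys cells))))

(evaluate-sheet cells)
;; => {:summ-1 10, :summ-2 32, :result 42, :double 84} (key order may vary)
```

Here the evaluation order falls out of the dependency graph rather than from rule priority, which is what makes the behaviour easier for end users to predict. The formulas are Clojure functions in this sketch; a data-only version would need operator keywords looked up in a table.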

#11

Ironhide wraps Specter with a data API: https://github.com/HealthSamurai/ironhide
