How to build a DSL in Clojure

Hi there,

I am looking into building a simple query DSL in Clojure. The main idea is to let my users filter which results they want to see. For instance, (and (username "boris") (country :uk)) would only display results about people named “boris” in the “uk”.

As the query is produced by the user, I would like to sanitize it. I would not want someone to query the system with (username (slurp "http://s3/large-file")). Consequently, I suppose I will need to whitelist the allowed functions.

For now, the above would be enough. But I feel there is something wrong with a system that does not allow you to name things. It would be marvelous if the DSL allowed a scoped def. Something a little bit like…

(def boris-uk
   (and (username "boris") (country :uk)))

(def donald-us 
   (and (username "donald") (country :us)))

(or boris-uk donald-us)

Currently, the only solution I have in mind is to load the DSL as a data structure and then inspect it. To my taste, this is very ugly. On the internet, Lisp is often mentioned alongside DSLs, but there are surprisingly few documents about DSLs and Clojure out there. Please help me…

Cheers,

Didier

I’d suggest using data for the DSL: parse it with clojure.edn/read-string (not clojure.core/read-string!) and stick to plain data structures and simple symbols:

(or
 {:username "boris"
  :country :uk}
 {:username "donald"
  :country :us})

Reading and evaluating forms is a security nightmare. Did you know that clojure.core/read-string can evaluate code at read time through the undocumented #= (read-eval) reader macro, which is enabled by default via *read-eval*?

(read-string "(+ 1 2 3)")             ;; => (+ 1 2 3)
(read-string "(+ 1 2 #=(+ 100 100))") ;; => (+ 1 2 200), even before you validate and eval!
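For comparison, here is a minimal sketch of the same input going through clojure.edn/read-string. The helper name safe-read and the :rejected marker are made up for this illustration. EDN has no #= dispatch, so the reader throws instead of evaluating anything:

```clojure
(require '[clojure.edn :as edn])

;; A query is read as plain data; nothing is evaluated.
(edn/read-string "(and (username \"boris\") (country :uk))")
;; => (and (username "boris") (country :uk))

;; Hypothetical helper: returns the parsed data, or :rejected when the
;; EDN reader refuses the input (as it does for #=).
(defn safe-read [s]
  (try
    (edn/read-string s)
    (catch RuntimeException _
      :rejected)))

(safe-read "(+ 1 2 3)")             ;; => (+ 1 2 3)
(safe-read "(+ 1 2 #=(+ 100 100))") ;; => :rejected
```

Note that this only makes reading safe. The resulting list is still just data and must be validated before it is turned into anything executable.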

clojure.edn/read-string solves the reading problem. My problem lies past that: once the DSL has been read, I need to evaluate it in a certain context. If we continue with the original example, the query (and (username "boris") (country :uk)) could be used as a filter as below.

(def query '(and (username "boris") (country :uk)))
(def safe-predicate (parse-dsl query))
(filter safe-predicate data)

This parse-dsl should return a trusted Clojure function.

I personally wouldn’t start by evaluating the parsed form. Instead, I would treat it as data that is manually transformed into a predicate, like this:

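(defn parse-dsl [q]
  ;; `args` rather than `rest`, to avoid shadowing clojure.core/rest
  (let [[dispatch & args] q]
    (cond
      ;; (and q1 q2 ...): every sub-query must match
      (= 'and dispatch) (apply every-pred (map parse-dsl args))
      ;; (field value): compare the record's :field with value
      (symbol? dispatch) #(= (first args) ((keyword dispatch) %))
      :else (throw (ex-info "Can't parse query" {:dispatch dispatch :rest args})))))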
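The same idea can be stretched to cover the whitelist and the scoped def from the original question. What follows is only a sketch under assumptions: the names allowed-fields, compile-query and compile-script are invented here, field values are restricted to strings, keywords and numbers, and a (def name query) form simply adds an entry to a local map rather than creating a Clojure var.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical whitelist: the only fields a query is allowed to test.
(def allowed-fields '#{username country})

(defn- literal? [v]
  (or (string? v) (keyword? v) (number? v)))

(defn compile-query
  "Turns a query form into a predicate. `env` maps user-defined names
  (symbols) to predicates that were compiled earlier."
  [env q]
  (cond
    ;; a bare symbol refers to a previously defined query
    (symbol? q)
    (or (get env q)
        (throw (ex-info "Unknown name" {:name q})))

    (seq? q)
    (let [[op & args] q]
      (cond
        (= 'and op)
        (let [preds (mapv #(compile-query env %) args)]
          (fn [row] (every? #(% row) preds)))

        (= 'or op)
        (let [preds (mapv #(compile-query env %) args)]
          (fn [row] (boolean (some #(% row) preds))))

        ;; (field literal), where field is whitelisted
        (and (contains? allowed-fields op)
             (= 1 (count args))
             (literal? (first args)))
        (let [k (keyword op)
              expected (first args)]
          (fn [row] (= expected (get row k))))

        :else
        (throw (ex-info "Form not allowed" {:form q}))))

    :else
    (throw (ex-info "Can't parse query" {:query q}))))

(defn compile-script
  "Reads all forms in `s` as EDN. (def name query) forms extend a local
  environment; the last non-def form becomes the returned predicate."
  [s]
  (let [forms (edn/read-string (str "[" s "\n]"))
        step  (fn [[env _] form]
                (if (and (seq? form) (= 'def (first form)))
                  (let [[_ nm body] form]
                    [(assoc env nm (compile-query env body)) nil])
                  [env (compile-query env form)]))
        [_ pred] (reduce step [{} nil] forms)]
    (or pred (throw (ex-info "Script has no query to run" {})))))
```

With this, a script made of the two def forms from the question followed by (or boris-uk donald-us) compiles to a single predicate usable with filter, while (username (slurp "http://s3/large-file")) is rejected because the argument is not a literal. Nothing the user writes is ever passed to eval.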

If you want to allow users to evaluate some Clojure, I’d recommend having a look at sci, which might be safe.

Thanks, I will look deeply at sci. It sure solves my problem.

You could make a macro that took the quoted parsed string from edn/read-string and returned the Clojure code you want executed.

In a namespace, you could then implement the def, or, and, etc. of your choice.

The macro could return calls to (your-namespace/def ...), (your-namespace/and ...) etc.
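A rough sketch of that approach, with assumptions spelled out: the names whitelist, rewrite, query and parse-dsl are invented for the example, every-pred and some-fn stand in for "your own" and/or, and the whole thing lives in one namespace for brevity. Every symbol in the user's form is swapped for a fully qualified symbol from the whitelist, and anything else is refused before the code is evaluated.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.walk :as walk])

;; The functions the DSL is allowed to call.
(defn username [expected] (fn [row] (= expected (:username row))))
(defn country  [expected] (fn [row] (= expected (:country row))))

;; DSL symbol -> fully qualified symbol of the implementation.
(def whitelist
  {'and      `every-pred
   'or       `some-fn
   'username `username
   'country  `country})

(defn rewrite
  "Replaces every symbol in `form` with its whitelisted implementation,
  throwing on any symbol that is not whitelisted."
  [form]
  (walk/postwalk
    (fn [x]
      (if (symbol? x)
        (or (whitelist x)
            (throw (ex-info "Symbol not allowed" {:symbol x})))
        x))
    form))

;; For queries written directly in source code.
(defmacro query [form]
  (rewrite form))

;; For queries arriving as strings at run time: read as EDN, rewrite, eval.
(defn parse-dsl [s]
  (eval (rewrite (edn/read-string s))))
```

(parse-dsl "(and (username \"boris\") (country :uk))") then yields a predicate for filter, and a query mentioning slurp fails in rewrite because slurp is not in the whitelist.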

Now this isn’t the safest thing. Sanitizing this won’t be that easy either. But depending on how much you trust the user, it could be good enough.

You could look at Clojail (https://github.com/Raynes/clojail/blob/master/README.markdown) to try and make it safer. It works by blacklisting, though; I’m not sure whether there is an equivalent that does whitelisting.

Or just use functions instead of data. This has many advantages:

  • no need to write and maintain a parser
  • ad-hoc default values.
  • ad-hoc compositionality.
  • ad-hoc control-flow and conditions.
  • ad-hoc aspect-orientation (injecting additional behaviors before/after/around core behaviors).
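As a small illustration of the function-based style (the function names below are only examples, mirroring the queries from the question), each DSL word becomes an ordinary function returning a predicate, and naming a query is just a regular def:

```clojure
;; Each "word" of the DSL is a plain function that returns a predicate.
(defn username [expected] (fn [row] (= expected (:username row))))
(defn country  [expected] (fn [row] (= expected (:country row))))

;; Composition and naming come for free from the language itself.
(def boris-uk  (every-pred (username "boris") (country :uk)))
(def donald-us (every-pred (username "donald") (country :us)))

(def boris-uk-or-donald-us (some-fn boris-uk donald-us))

(def people [{:username "boris"  :country :uk}
             {:username "boris"  :country :us}
             {:username "donald" :country :us}])

(filter boris-uk-or-donald-us people)
;; => ({:username "boris", :country :uk} {:username "donald", :country :us})
```

This fits when the people writing queries are developers working in the codebase. It does not by itself address untrusted input, which still has to be read and checked as data first.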

If you want to explore these ideas further, you can read this gist I wrote a while back.