How are clojurians handling control flow on their projects?

seancorfield · July 22, 2021, 4:38am

At least that confirms it is nothing directly like Engine or what we’re currently doing – so I gleaned that amount from the docs, correctly! – although @mjmeintjes goes on to show how to achieve a similar thing (to what we’re currently doing) using Missionary which I certainly would not have gotten from docs/repo.

Missionary sounds fascinating, now that you’ve elaborated on what it is intended to do, and I’m certainly interested in alternatives to core.async. Photon also sounds fascinating, so I’ll have to put that on my “reading list” when you release it.

Based on this, I’ll have a play with Missionary. Thank you!

stuartstein777 · July 22, 2021, 7:48pm

Are exceptions not slow in Java?

I’m coming from a .NET background, where using exceptions for flow control would be a big NO! Exceptions are incredibly slow.

For example, on my machine, compare checking whether a file exists 1 million times with trying to read the file and seeing whether it throws an exception:

Checking if it exists: 14 seconds
FileNotFoundException: 32 seconds

Parsing 1 million integers 00:00:00.0026347
Parsing 1 million integers where it might throw an exception: 00:00:05.1844392

Exceptions make a stack trace, unwind the stack etc. This is a slow operation.

I’m uncomfortable using exceptions / try-catch in Clojure because of my .NET experience of how slow exceptions are, but maybe I shouldn’t be?

seancorfield · July 22, 2021, 8:55pm

Exceptions are designed for “exceptional” situations and shouldn’t be used for regular “flow of control” (as I noted above).

If you “expect” a file to be missing, use .exists() on the File object. If the file being missing means that you can’t continue, throw an exception.

If you “expect” incoming data to be parsable as integers, just use Long/parseLong and let it throw an exception if you get bad data. If you “expect” to get some bad data and you can do something about it (such as leaving it as a string value or converting it to zero), then maybe it’s worth doing some check on the input to avoid having to try/catch around Long/parseLong.
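A minimal sketch of such an input check, assuming a hypothetical helper named parse-long-or and a regex that only admits values short enough to fit in a long:

```clojure
(defn parse-long-or
  "Returns s parsed as a long, or default when s does not look like an integer.
  parse-long-or is an illustrative name, not part of clojure.core."
  [s default]
  ;; At most 18 digits, so Long/parseLong cannot overflow and never throws here.
  (if (and (string? s) (re-matches #"[+-]?\d{1,18}" s))
    (Long/parseLong s)
    default))

(parse-long-or "42" 0)   ; parsed normally
(parse-long-or "4x2" 0)  ; bad data falls back to the default, no exception involved
```

The try/catch disappears from the hot path, at the cost of rejecting 19-digit values that would technically fit in a long.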

There are ways to construct Exception objects without the overhead of the stack trace etc but if you’re not (ab)using exceptions for “flow of control”, that shouldn’t be necessary.
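For completeness, one way this can be sketched on the JVM is to go through the protected four-argument RuntimeException constructor, whose last flag disables the writable stack trace. The helper name stackless-ex is an assumption made for illustration:

```clojure
(defn stackless-ex
  "Hypothetical helper: returns a RuntimeException that never captures a stack trace."
  [msg]
  ;; The args map to the protected JDK constructor
  ;; RuntimeException(String message, Throwable cause,
  ;;                  boolean enableSuppression, boolean writableStackTrace).
  ;; proxy generates a subclass, which is what makes that constructor reachable.
  (proxy [RuntimeException] [msg nil false false]))

(def ex (stackless-ex "bad input"))

(.getMessage ex)              ; the message is kept
(alength (.getStackTrace ex)) ; the trace is empty because it was never filled in
```

The trade-off is that such an exception is much harder to debug, which is another reason to reserve it for cases where measurement shows the stack trace cost matters.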

With the Clojure CLI and -X option, for example, the way to have a function cause clojure to exit with a non-zero status is specifically to throw an exception: that says “I failed! I can tell you why but I can’t do anything about it!” so they definitely have their place.

mauricioszabo · July 23, 2021, 7:35pm

Well, I just saw this interesting topic - lots of great ideas here!

So, most of my personal projects are ClojureScript, and I tend to use promesa. If you’re using ClojureScript, most (if not all) of your side-effecting functions will probably be async in some way, and promesa integrates really well with JavaScript’s promises, so that’s my “go-to” library. It also handles errors beautifully, so that’s another plus.

On the other hand, if I have lots of inputs in a function that come from side effects (like reading data from a database, then another piece from HTTP, then something else), I would use pathom. It also handles errors in an interesting way, and even better, in pathom3 you can define multiple “paths” to your data (so if something fails, it’ll try another path). But that’s just for “resolving data”, not for “saving your data in multiple places” or “provoking multiple mutations”.

didibus · July 25, 2021, 5:05am

@wcalderipe Okay I tried to come up with a more complex example, here’s the gist for it: Example of a complex business process to implement in Clojure. Please link to your solutions for alternative way to implement the same in Clojure (or other languages). · GitHub

I think it would be interesting to see the different ways to implement that same example in Clojure (or even in other languages).

@dustingetz For ClojureScript that looks really interesting, I’ll keep an eye out for it.

@mjmeintjes Using Missionary for control flow is an interesting angle, but if you don’t have async requirements, would it still be a good way to do it? Do you feel up to giving it a try with my example, and re-write it using Missionary instead?

bsless · July 25, 2021, 11:50am

Two approaches I’m interested in which have similar characteristics involve reifying the data flow in the program in some way.
One approach which also handles concurrency is using core.async pipelines. It’s also pretty simple to build a DAG representation which can be compiled to a running system.
I don’t think I’ve seen solutions tackling this approach yet, but when I ran it by colleagues they said it feels hard to conceptualize. Could be because it connects what (the function to execute) with how (which pipeline, etc.). It’s also pretty noisy to have to consider backpressure, and splits in the data flow make it hard to track.
Its counterpart is sort of inverted, which is to use state machines.
If the transition between states is defined by a pair of functions, one to get the next state and the other emits effects, we can build a pure, reactive system. It feels like it has a lot in common with the ideas Dustin mentioned.
A state machine can accurately represent the flow of data in the system and completely separates concerns of how/when from what. We can build elaborate and efficient execution models on top of it.
This idea is still rather unformed but I wonder how far it can be taken. Can an entire application be built on top of it?
edit: This definitely ties to @kumarshantanu’s call to action to build better machines. We still haven’t found the right level of abstraction and language to describe them.
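A minimal sketch of the “pair of functions” idea, assuming a hypothetical order workflow. The state names, event types, and effect vectors are invented for illustration; nothing here executes an effect, it only describes them as data:

```clojure
;; Transition table: [current-state event-type] -> next state.
(def transitions
  {[:idle :submit]        :validating
   [:validating :valid]   :saving
   [:validating :invalid] :failed
   [:saving :saved]       :done})

(defn next-state
  "Pure: returns the next state, or the current one if the event does not apply."
  [state event]
  (get transitions [state (:type event)] state))

(defn effects
  "Pure: describes what should happen on entering new-state; runs nothing."
  [new-state event]
  (case new-state
    :validating [[:validate (:order event)]]
    :saving     [[:save (:order event)]]
    :failed     [[:notify (:reason event)]]
    []))

(defn step
  "Combines both functions: the new state plus the effects to hand to a runner."
  [state event]
  (let [new-state (next-state state event)]
    {:state   new-state
     :effects (if (= state new-state) [] (effects new-state event))}))

(step :idle {:type :submit :order {:id 1}})
```

Because step is pure, the how/when (threads, retries, queues) can live entirely in whatever interprets the returned :effects.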

wcalderipe · July 26, 2021, 7:13am

@mauricioszabo, we tried to handle promises with vanilla cljs, but that didn’t work well. So, a while ago, we moved to funcool/promesa, which has been helping a lot.

;; Even if validate-input is a pure function, you have to wrap its return value in a
;; promise to kick off the pipeline and benefit from its then/catch/finally handlers.
;; That's a bummer!
(-> (p/create (fn [resolve] (resolve (validate-input input))))
    (p/then perform-data-read-somewhere)
    (p/then protect-business-rules)
    (p/then transform-data)
    (p/then save-transformed-data)
    (p/catch ...))

As raised in this thread, my first example is quite limiting. promesa works for it, but we still lose individual error handling because p/then doesn’t implement JavaScript’s .then(_, onRejected) form.

I’m interested to read your thoughts on this and how you folks are using promesa for control flow over a pipeline.

@didibus great… thanks for taking the time to write it down and sharing it with us.

I’ll try to post an example using interceptors with metosin/sieppari later this week.

I’m curious to read how folks would approach that scenario with missionary and Sean’s queue of thunks.

danbunea · July 26, 2021, 11:59am

This is a pretty long topic, one that I happen to be particularly interested in, too. In fact, I did an entire talk about this at ClojureD last year (maybe not the best title):

There I tried to demo a few patterns I had seen, with and without exceptions. A pipeline that handles errors using exceptions would look like:

(defn offer-by-id [request-id]
  (safe #(-> {:id request-id}
             validate-id
             find-offer-by-id
             json-response)))

or:

[image: pipe overflow]

Which is a similar mechanism to what you describe above.
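A rough sketch of what a wrapper like safe could look like, under the assumption that each step signals failure with ex-info carrying a :status in its data. This is not the implementation from the talk, and validate-id here is a hypothetical stand-in:

```clojure
(defn safe
  "Runs thunk f; turns an ex-info thrown by any step into an error response map."
  [f]
  (try
    (f)
    (catch clojure.lang.ExceptionInfo e
      {:status (:status (ex-data e) 500)   ; default to 500 when no :status is given
       :body   {:error (ex-message e)}})))

(defn validate-id
  "Hypothetical step: passes the context through or aborts the whole pipeline."
  [{:keys [id] :as ctx}]
  (if (pos-int? id)
    ctx
    (throw (ex-info "Invalid id" {:status 400}))))

(safe #(validate-id {:id 7}))   ; the context flows on unchanged
(safe #(validate-id {:id -1}))  ; short-circuits into an error response
```

Only ExceptionInfo is caught, so unexpected failures (NullPointerException and the like) still propagate instead of being disguised as business errors.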

bhurlow · July 29, 2021, 3:26pm

Wow, great thread. There’s also a post on ClojureVerse introducing Missionary with some additional examples, which I found helpful here

leonoel · July 30, 2021, 6:12pm

Interesting problem, here is my solution.

tdrencak · August 1, 2021, 7:12pm

In the game AI industry, behavior trees (BT) are considered to be the de facto standard for control flow. Basically, a BT is a tree of operations. There are a few built-in operations such as conditional branching, loop, retry, sequence, parallel processing, etc., and the programmer then adds action blocks.

A nice feature of BTs is that they compose very well, and subtrees do not need to know about their surroundings. A node only performs its action and then passes control to its parent (with either success or fail). A node can also suspend the flow for asynchronous processing and resume later on another signal (timer, event, etc.).

All of the above-mentioned features give you the full power of reusable components and error processing. Debugging can be done by inspecting the log: you can see all the steps taken and investigate the problem.

There are already a couple of libraries in Clojure, but it’s not very hard to come up with something decent if they don’t suit you. I have personally used this one for inspiration: GitHub - cark/cark.behavior-tree: A functional behavior tree implementation.

One problem I had with BTs was when there was a lot of backtracking (undo, or compensation in a saga). That required a lot of branching in the BT. In my case, a very promising approach was to combine hierarchical state machines (statecharts) with BT nodes, which provided the easy flow composition of BTs together with the backtracking of state hierarchies.

Example of BT:

[:if #(condition....)
 [:do-action-1]
 [:do-action-2]]

;; repeat the sequence until success is achieved, can be used for infinite retries
[:until-success
  [:sequence
   [:action-1]
   [:action-2]]]

;; short circuit for first success operation
[:select
  [:action-1]
  [:action-2]
  ...
  [:action-n]]

;; async workflow
[:sequence
  [:action-1]
  [:action-2]
  ;; park here and wait for :my-event
  [:on-event {:event :my-event}
   [:sequence
    [:action-3]
    [:action-4]]]]

didibus · August 1, 2021, 7:29pm

Interesting model, how is dataflow handled though? It seems this relies on a global state which each action would mutate?

tdrencak · August 1, 2021, 7:52pm

BT interpreter takes both BT state and BT description, so it’s not global.

e.g. simplified version (execute definition state command) => new-state

The BT state is the state of each node, which can be :success, :fail, :running (for async) or an implicit :waiting. You can store the state in a DB and correlate it by some ID to get a full-fledged workflow execution engine.

An action can be either stateful or stateless (provided as an effect description). In the case of stateless effects, the BT is first run to get a collection of effects, and then the effects are executed. State can be persisted before or after, depending on transactional guarantees.
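To make the “run first, execute effects later” idea concrete, here is a deliberately tiny interpreter sketch. It only knows :sequence and :select, has no :running state or parking, and the action names are hypothetical; it is not how any particular BT library works:

```clojure
(declare run)

(defn run-children
  "Runs children left to right, stopping at the first whose status equals stop-on.
  Returns [ctx status]; `otherwise` is the status when no child stops the walk."
  [actions ctx children stop-on otherwise]
  (reduce (fn [[ctx _] child]
            (let [[ctx status] (run actions ctx child)]
              (if (= status stop-on)
                (reduced [ctx status])
                [ctx status])))
          [ctx otherwise]
          children))

(defn run
  "Interprets one node. Leaf nodes look up a fn of ctx -> [ctx status] in actions."
  [actions ctx [op & children]]
  (case op
    :sequence (run-children actions ctx children :fail :success)    ; fail fast
    :select   (run-children actions ctx children :success :fail)    ; first success wins
    ((get actions op) ctx)))

;; Stateless actions: they only append effect descriptions to the context.
(def demo-actions
  {:reserve (fn [ctx] [(update ctx :effects conj [:reserve-stock]) :success])
   :charge  (fn [ctx] [ctx :fail])
   :notify  (fn [ctx] [(update ctx :effects conj [:send-email]) :success])})

(run demo-actions
     {:effects []}
     [:select
      [:sequence [:reserve] [:charge]]
      [:notify]])
```

Note one simplification: effects gathered by a branch that later fails are kept in the context. A real engine would need to decide whether to discard or compensate them.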

didibus · August 1, 2021, 8:06pm

I should have said “shared state” to be more precise.

So the actions will manipulate a shared state object shared with the whole BT and all other actions? So they won’t receive input from arguments? And return results from output?

That means each action needs to know where and how to find the inputs they need from the BT state right? And make sure to put their output in the right places on it as well? Or am I missing something?

tdrencak · August 1, 2021, 9:29pm

Yes, you use shared state if you need to communicate between actions. It’s called a blackboard in AI terminology, and it should be well defined.

Both inputs and outputs should be specified in terms of this state. I usually put inputs as commands into a specified queue and gather effects (output) into a collection:

(let [ctx (-> {}
              (update :input (fnil conj []) {:type :do-something, :param-1 1, :param-2 2})
              (execute-bt bt-definition))] ;; generates :effects
  (doseq [fx (:effects ctx)]
    (execute-fx fx)))

If the BT is long running and can be parked, then I store state in the DB and retrieve it as a first step with some correlation/request id.

To avoid spaghetti dependencies, nodes are usually parameterized and the exact shared paths are passed in from above, so dependencies are easy to spot and actions are easy to reuse, e.g.:

[:sequence
  [:action-1 {:output [:a :b]}]
  [:action-2 {:input [:a :b]}]]
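One way such parameterization could be sketched: the action itself stays a plain function of its input, and a small wrapper does the blackboard reads and writes. run-action and total-price are hypothetical names for illustration:

```clojure
(defn run-action
  "Applies pure function f to the value at the :input path and stores the
  result at the :output path of the blackboard."
  [blackboard f {:keys [input output]}]
  (assoc-in blackboard output (f (get-in blackboard input))))

(defn total-price
  "A reusable action body: knows nothing about where its data lives."
  [items]
  (reduce + (map :price items)))

(-> {:order {:items [{:price 5} {:price 7}]}}
    (run-action total-price {:input [:order :items] :output [:order :total]}))
```

The paths appear only in the tree definition, so the same function can be wired to different parts of the blackboard.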

didibus · August 1, 2021, 10:15pm

Ah great, that’s what I was hoping for. I like to decouple the query from my actions so they’re easier to reuse, and also it’s more understandable I feel when the dataflow is explicit in the flow definition, instead of hidden away in the actions.

raspasov · August 2, 2021, 6:16am

A lot of interesting ideas and approaches here!

Here’s a very short macro that I wrote (10 lines). It’s called some-as->. It basically combines the approaches of the clojure.core/some-> and clojure.core/as->. For example:

(some-as-> {:a 42} x ;x is {:a 42}
 (:a x)     ;42, x is 42                           
 (+ 1 2 x)) ;allows us to use x in any position!
;=> 45

It will short-circuit execution if any expression returns nil:

  (some-as-> {:a 42} x
   (:b x) ;this is nil
   (inc x)) ;this does not run
  ;=> nil

The macroexpansion shows exactly what’s happening:

(macroexpand-1
 '(some-as-> {:a 42} x
  (:b x)                               
  (inc x)))
;=>
(let
 [x {:a 42} 
  x (if (nil? x) nil (:b x))]
 (if (nil? x) nil (inc x)))

Compare that to using regular clojure.core/as-> which throws an exception if any of the following functions/expressions are not happy with nil:

(as-> {:a 42} x
 (:b x) ;this is nil                                
 (+ 1 2 x)) ;boom!                       
;=> ...Execution error (NullPointerException)

If anybody is interested, the source code for the macro is in this file: alexandria-clj/core.cljc at main · raspasov/alexandria-clj · GitHub
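For readers who do not want to follow the link, a macro with this behavior could be sketched as below. This is a reconstruction from the macroexpansion shown above, not the source from the linked repository:

```clojure
(defmacro some-as->
  "Like as->, but stops and returns nil as soon as the bound value is nil."
  [expr name & forms]
  (let [;; Wrap every form in a nil guard on the current binding.
        steps (map (fn [form] `(if (nil? ~name) nil ~form)) forms)]
    `(let [~name ~expr
           ;; Rebind name to each guarded step except the last one...
           ~@(interleave (repeat name) (butlast steps))]
       ;; ...and make the last guarded step the body of the let.
       ~(if (empty? steps) name (last steps)))))

(some-as-> {:a 42} x
  (:a x)
  (+ 1 2 x))

(some-as-> {:a 42} x
  (:b x)
  (inc x))
```

The shape follows clojure.core’s own as-> and some->: a single let that keeps rebinding the same symbol.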

respatialized · August 3, 2021, 5:41pm

I recently experimented with using malli schemas to organize the control flow in fabricate, the static website generator I’ve been developing. I adapted the idea of a state-action behavior (as described in a paper by Leslie Lamport) by directly mapping from schemas describing the state to functions.

The primary goal was to add the ability to generate markdown files as output without needing to change the file reader or add a markdown parser, by enumerating “markdown output” as a special case. It was a sufficiently flexible method of organizing the main loop to succeed in that goal. I haven’t added exception handling yet, but exceptions could easily just be designated as states in this model, which would extend the same model of control flow to errors.

You can read about it on the fabricate github page: Organizing Computation With Finite Schema Machines. Definitely interested in feedback on this idea.
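The general shape of that idea can be sketched without malli by substituting plain predicates for schemas. Everything below (the predicates, the page map keys, the function names) is an illustrative assumption and not fabricate’s actual code:

```clojure
;; Each entry pairs a description of a state with the action to take in it.
;; Plain predicates stand in for malli schemas here.
(def operations
  [[string?
    (fn [path] {:path path})]
   [#(and (map? %) (not (:text %)))
    (fn [page] (assoc page :text (str "contents of " (:path page))))]
   [#(and (map? %) (:text %) (not (:rendered? %)))
    (fn [page] (assoc page :rendered? true))]])

(defn advance
  "Applies the action of the first matching state, or returns value unchanged."
  [ops value]
  (if-let [[_ action] (first (filter (fn [[matches? _]] (matches? value)) ops))]
    (action value)
    value))

(defn run-fsm
  "Keeps advancing until no state description matches any more."
  [ops value]
  (let [nxt (advance ops value)]
    (if (= nxt value)
      value
      (recur ops nxt))))

(run-fsm operations "page.md")
```

Adding a new output format then means adding one more entry to the table instead of touching the loop.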

Johan · August 6, 2021, 1:27am

@wcalderipe I’m using “interceptors” to control the flow of long-running operations. When I searched GitHub, I found a few projects using a similar pattern. Implementing the concept was quite simple, and often each project implements it with its own set of special features and async support.
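As an illustration of how small the core of the pattern can be, here is a synchronous sketch with :enter, :leave and :error stages. It is a simplified model written for this explanation, not the code of any of those projects or of a specific library:

```clojure
(defn try-stage
  "Runs one stage fn; a thrown exception is stored under :error in the context."
  [f ctx]
  (try
    (f ctx)
    (catch Exception e
      (assoc ctx :error e))))

(defn execute
  "Runs :enter fns in order, then unwinds the entered interceptors in reverse.
  While ctx holds an :error, unwinding calls :error handlers instead of :leave."
  [interceptors ctx]
  (loop [ctx ctx, queue (seq interceptors), stack ()]
    (cond
      ;; Still entering, and nothing has gone wrong so far.
      (and queue (not (:error ctx)))
      (let [{:keys [enter] :as interceptor} (first queue)]
        (recur (try-stage (or enter identity) ctx)
               (next queue)
               (conj stack interceptor)))

      ;; Unwinding: most recently entered interceptor first.
      (seq stack)
      (let [{:keys [leave error]} (first stack)
            stage (if (:error ctx) (or error identity) (or leave identity))]
        (recur (try-stage stage ctx) nil (rest stack)))

      :else ctx)))

(def logging
  {:enter #(update % :trace conj :log-in)
   :leave #(update % :trace conj :log-out)})

(def recover
  {:error (fn [ctx] (-> ctx (dissoc :error) (assoc :recovered? true)))})

(def boom
  {:enter (fn [_] (throw (ex-info "boom" {})))})

(execute [logging] {:trace []})
(execute [recover logging boom] {:trace []})
```

An :error handler that removes :error from the context switches the unwinding back to the normal :leave path for the interceptors that remain.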

wcalderipe · August 7, 2021, 5:58am

@tdrencak interesting approach, thanks for sharing. I’ve used a state machine to build an onboarding application a while ago but I never thought about combining it with BT.

@Johan we’ve adopted interceptors as well. In our experience, the composability of interceptors has paid off the cost of introducing the pattern into the codebase. We don’t use them everywhere, though, only in business units. However, we’re working on a young project, and most of our flows are still pretty simple, with only short-circuiting on errors.

Have you tried to implement loops (e.g., retry) with interceptors? If yes, would you mind sharing an example?