Separating effects from business logic

Over the weekend I started experimenting with structuring my ring handlers as pure functions by separating the co-/side-effects out in a style similar to what re-frame does. I.e. instead of passing in mocks of some sort during testing, you put the effectful code in some other function call and have some sort of parent orchestrator that runs the effects and passes the results into your (now pure) business logic code.

I describe my specific implementation of the approach in more detail here: Removing effects from business logic
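
For anyone who hasn't read the post, here is a rough sketch of the general shape. It is not the code from the post: `handler`, `orchestrate`, the `:effect`/`:input`/`:next-state`/`:effect-output` keys and the `effects` map are all names made up for illustration.

```clojure
;; Pure business logic: given the current state (plus the output of the
;; previous effect, if any), return either a description of the next effect
;; or a final result. Nothing here performs I/O.
(defn handler
  [{:keys [state url effect-output]}]
  (case state
    :start    {:effect :http/get, :input url, :next-state :got-feed}
    :got-feed {:result (if (= 200 (:status effect-output))
                         :subscribed
                         :invalid-feed)}))

;; Impure shell: looks up the requested effect, runs it, and feeds the
;; output back into the pure handler until the handler returns a result.
(defn orchestrate
  [handler effects ctx]
  (loop [ctx (assoc ctx :state :start)]
    (let [{:keys [effect input next-state result]} (handler ctx)]
      (if effect
        (recur (assoc ctx
                      :state next-state
                      :effect-output ((get effects effect) input)))
        result))))

;; The handler can be tested without any mocks:
(handler {:state :start :url "https://example.com/feed"})
;; => {:effect :http/get, :input "https://example.com/feed", :next-state :got-feed}

;; Only the orchestrator needs real (or stubbed) effects:
(orchestrate handler
             {:http/get (fn [_url] {:status 200})}
             {:url "https://example.com/feed"})
;; => :subscribed
```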

Since I’m just now starting to try out this approach, I’m curious if anyone here has done this kind of thing on the backend extensively and what your thoughts/insights about it are.

For example, did it ultimately make testing significantly easier? Or does it feel more like “meh, passing mocks into imperative functions works well enough.” Did it make the business logic itself easier or harder to read/write? Did teammates have any struggles adopting the approach? Any downsides in general? Am I the last Clojurist on earth to discover this approach and in fact everyone else has been doing this all along?

So far I like the approach, somewhat subjectively, and will continue experimenting with it at least on the side and probably at work a bit. I’m just curious to peek ahead at what I might learn from my experimenting over the next year.

(also… what should I call this? I started out thinking “state machine” but am feeling a little fuzzy about if that’s the best/an accurate term)


I tried something that was somewhat similar: seancorfield/engine: A Clojure library to implement a query → logic → updates workflow, to separate persistence updates from business logic, to improve testing etc. (github.com) but my mistake was trying to a) make it a single pass and b) encode too much semantic information into the updates.

I don’t know how it would scale with increased complexity. The more paths through your business logic, the more passes through the same handler code and the more branches you would have in case. Add in side-effects of logging and handling potential failures of any prior side-effect. Multiply by a hundred handlers. What about exception handling?

I gave it up because it didn’t scale, for us at work. I do want to find a solution, tho’.


Thanks a lot for sharing and taking the time to write your blog post.

I was also on a journey to find an approach that yields more pure functions. Last year I watched David Nolen’s talk and for the first time I saw something that really felt like software lego blocks. I slightly modified David’s approach by only allowing functions that receive a map, add something to it and return it. I wrote about it in the readme of the world lib. The world lib is already “deprecated” in our codebase, since we discovered that we need even fewer tools for the described approach.

I took the time to convert your code example into the style we use to structure our code. We started using this style last year and are happy with it so far.

(defn add-fixed-url
  [w]
  (assoc w
         :fixed-url
         (identity ;; lib.rss/fix-url
           (:url w))))

(defn add-request
  [w]
  (assoc w
         :request
         {:request-method :get
          :url (:fixed-url w)
          :headers {"User-Agent" "https://yakread.com"}}))

(defn add-response
  [{:keys [http/client] :as w}]
  (assoc w
         :response
         (client (:request w))))

(defn parse-urls
  [params]
  ;; lib.rss/parse-urls
  (:items (:body (:response params))))

(defn add-feed-urls
  [w]
  (assoc w
         :feed-urls
         (->> w
              (parse-urls)
              (map :url) ;; mapv
              (take 20)
              vec)))

(defn add-biff-tx
  [{:keys [session] :as w}]
  (assoc w
         :biff/tx
         (for [url (:feed-urls w)]
           {:db/doc-type :conn/rss
            :db.op/upsert {:conn/user (:uid session)
                           :conn.rss/url url}
            :conn.rss/subscribed-at :db/now})))

(defn add-biff-response
  [{:keys [feed-urls] :as w}]
  (assoc w
         :biff/response
         (if (empty? feed-urls)
           {:status                     303
            :biff.response/route-name   :app.subscriptions.add/page
            :biff.response/route-params {:error "invalid-rss-feed"}}

           {:status                     303
            :biff.response/route-name   :app.subscriptions/page
            :biff.response/route-params {:added-feeds (count feed-urls)}})))

(defn transact!
  [{:keys [biff/submit-tx] :as w}]
  (assoc w
         :biff/tx-report
         (submit-tx (:biff/tx w))))

(defn prepare
  [w]
  ;; Do all side-effect free steps in the prepare phase. For the sake of this
  ;; example I consider the http request as "side-effect free" since it only
  ;; reads. However, it might be desirable to separate it (still IO that
  ;; requires retry logic).
  (-> w
      (add-fixed-url)
      (add-request)
      (add-response)
      (add-feed-urls)
      (add-biff-tx)
      (add-biff-response)))

(defn handle!
  [w]
  (-> w
      (prepare)
      (transact!)))

(comment

  ;; Rich comments test:

  (def w
    {:url "https://news.ycombinator.com/rss"
     :http/client (fn [_request]
                    ;; fake HTTP response:
                    {:status 200
                     :body
                     {:items
                      [{:url "http://www.example.com/blog/post/1"}
                       {:url "http://www.example.com/blog/post/2"}]}})
     :biff/submit-tx (fn [_tx]
                       ;; fake transaction report:
                       {:tx-date (java.util.Date.)})})

  (-> w
      (prepare))

  (handle! w)


  ;; Alternative test approach:

  (defn add-fake-response
    [w]
    (assoc w
           :response
           {:status 200
            :body
            {:items
             [{:url "http://www.example.com/blog/post/1"}
              {:url "http://www.example.com/blog/post/2"}]}}))

  (defn add-fake-tx-report
    [w]
    (assoc w
           :biff/tx-report
           {:tx-date (java.util.Date.)}))

  (def test-w
    (-> w
        ;; Re-arrange the lego blocks for the test:
        (add-fixed-url)
        (add-request)
        (add-fake-response)
        (add-feed-urls)
        (add-biff-tx)
        (add-biff-response)
        (add-fake-tx-report)))

  ;; do the assertions on `test-w` ...
  )


How do you handle (unexpected) exceptions and expected failure conditions?

The initial idea with the world lib was to wrap each and every step function in a try-catch. But while using the approach we noticed that it is sufficient to wrap critical sections in a try-catch manually and provide a more helpful error message. In the example, this step might be a candidate for an unexpected exception:

(defn add-response
  [{:keys [http/client] :as w}]
  (assoc w
         :response
         (try
           (retry-logic
             (client (:request w)))
           (catch Exception e
             (throw
               (ex-info "Fetching the RSS feed failed"
                        w
                        e))))))

The complete world value w is attached as ex-data, so you have all the intermediate results that led to the error.
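
As a small illustration of what that buys you, the sketch below triggers the failure with a stub client and then reads the intermediate results back out of the exception. The `retry-logic` wrapper is left out to keep the snippet self-contained, and the stub client and URL are made up for the example.

```clojure
(defn add-response
  [{:keys [http/client] :as w}]
  (assoc w
         :response
         (try
           (client (:request w))
           (catch Exception e
             ;; Attach the whole world map as ex-data.
             (throw (ex-info "Fetching the RSS feed failed" w e))))))

(def failure
  (try
    (add-response
      {:url         "https://example.com/rss"
       :request     {:request-method :get
                     :url            "https://example.com/rss"}
       ;; Stub client that always fails (illustration only).
       :http/client (fn [_request]
                      (throw (java.io.IOException. "timeout")))})
    (catch clojure.lang.ExceptionInfo e
      e)))

;; Everything that was in w when the step failed is available:
(:request (ex-data failure))
;; => {:request-method :get, :url "https://example.com/rss"}

;; The original exception is kept as the cause:
(ex-message (ex-cause failure))
;; => "timeout"
```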

An expected failure could be handled with a normal function like retry-logic above. Another example of an expected failure could be an invalid email address on a sign-up form:

(defn prepare-validation
  [w]
  (-> w
      (add-email-form-param)
      (add-email-valid)))

(defn prepare
  [w]
  (let [w* (prepare-validation w)]
    (if (:email-valid w*)
      (-> w*
          (add-success-ring-response))
      (-> w*
          (add-error-ring-response)))))

The prepare is split into multiple phases to accommodate the error handling. Normal language constructs like if are used to drive the control flow.

Thanks for that follow-up / clarification.


@seancorfield Thanks for the link, pretty interesting (even though I don’t know that I fully understand it!).

For logging, I’d be inclined to leave it in the business logic code even if it does technically make those functions impure, since logging doesn’t affect the application behavior from the user’s standpoint. Though it also might be convenient enough to have business logic functions just include a :biff.chain/queue [:my-logger ...], :my-logger/stuff-to-log [:info "abc123"] in their results…

For unexpected exceptions I’d let them bubble up into the ring middleware as usual; for expected exceptions I’d lean towards having the effectful code return data instead and having the subsequent business logic check for failure and return early if needed (by returning an empty :biff.chain/queue). e.g. in my example code, to begin with I probably should have set :biff.chain.http/input {:throw-exceptions false, ...} and then have the ::add-urls clause check the response status:

::add-urls
(let [{:keys [status] :as output} (:biff.chain.http/output ctx)
      feed-urls (when (<= 200 status 299)
                  (->> (lib.rss/parse-urls output)
                       (mapv :url)
                       (take 20)
                       vec))]
  (if (empty? feed-urls)
    {:status                     303
     :biff.response/route-name   :app.subscriptions.add/page
     :biff.response/route-params {:error "invalid-rss-feed"}}
    ...))

The more paths through your business logic, the more passes through the same handler code and the more branches you would have in case.

I am wary of that… though it also might end up being ok. Most of our handlers at work don’t have that many effects. The most complex one that comes to mind has maybe 6 (a couple redis calls, a couple s3 calls, a couple calls to an internal service…) and most of the logic is in a single path, with the other paths being early returns. So for that handler at least I think the case form would end up pretty readable; each clause would just be “if this is true, return now, else go to the next clause”, similar to the ::add-urls clause. There may be some handlers that end up being pretty hairy, but if most of them are ok, then maybe it’s ok.

(This is a ~4-year-old, 80k-line, mostly-monolithic codebase–I’m sure there are other codebases with more effects)

However! Last night I also started musing about if some macro magic could take some imperative code and turn it into the pure case style. I’m imagining something that looks kind of like core.async–here’s the original imperative function with a hypothetical purify macro, and with the effectful bits wrapped in (effect ...):

(fn [{:keys [session
             params]
      :as ctx}]
  (purify
   (let [url (lib.rss/fix-url (:url params))
         http-response (effect (http/get url {"User-Agent" "https://yakread.com"}))
         feed-urls (->> (lib.rss/parse-urls (assoc http-response :url url))
                        (mapv :url)
                        (take 20)
                        vec)
         tx (for [url feed-urls]
              {:db/doc-type :conn/rss
               :db.op/upsert {:conn/user (:uid session)
                              :conn.rss/url url}
               :conn.rss/subscribed-at :db/now})]
     (if (empty? feed-urls)
       {:status                     303
        :biff.response/route-name   :app.subscriptions.add/page
        :biff.response/route-params {:error "invalid-rss-feed"}}
       (do
         (effect (biff/submit-tx ctx tx))
         {:status                     303
          :biff.response/route-name   :app.subscriptions/page
          :biff.response/route-params {:added-feeds (count feed-urls)}})))))

I haven’t thought about that deeply at all; there might be some obvious reason why that wouldn’t work at all. But perhaps another thing to experiment with.


@maxweber that approach looks conceptually pretty similar to what I’m proposing–basically instead of using a single function with different “states” / case clauses, you’re breaking it out into separate functions. The other big difference though is where the conditional logic goes. One thing I like about my approach is that the conditional logic gets pushed down into the pure business logic functions, so I don’t need to write a custom parent function at all (which would have effects, and which would need to be tested, which is the situation I’m trying to avoid), I just use orchestrate.

e.g. say you had a handler that made a request to service A, then based on the result made another request to either service B or service C (and then did more computation on the responses)? That conditional logic would need to live in handle!, wouldn’t it?

On the other hand: even if you don’t break everything down into world functions, it does sound like a potentially happy medium to not worry about separating co-effecting code (using re-frame’s terminology) from the business logic and just try to push the side effects to the end + return them as data–e.g. something like this:

(fn [{:keys [session
             params
             http-client]
      :as ctx}]
  (let [url (lib.rss/fix-url (:url params))
        http-response (http-client url {"User-Agent" "https://yakread.com"})
        feed-urls (->> (lib.rss/parse-urls (assoc http-response :url url))
                       (mapv :url)
                       (take 20)
                       vec)]
    (if (empty? feed-urls)
      {:status                     303
       :biff.response/route-name   :app.subscriptions.add/page
       :biff.response/route-params {:error "invalid-rss-feed"}}
      {:status                     303
       :biff.response/route-name   :app.subscriptions/page
       :biff.response/route-params {:added-feeds (count feed-urls)}
       :biff.side-effect/tx (for [url feed-urls]
                              {:db/doc-type :conn/rss
                               :db.op/upsert {:conn/user (:uid session)
                                              :conn.rss/url url}
                               :conn.rss/subscribed-at :db/now})})))
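
Something then has to actually run the returned transaction. A minimal sketch of what that could look like is below; `wrap-side-effects` and the `submit-tx!` argument are hypothetical names for this example, not part of Biff.

```clojure
(defn wrap-side-effects
  "Hypothetical Ring-style middleware: runs the transaction the handler
   returned under :biff.side-effect/tx (if any), then removes that key from
   the response."
  [handler submit-tx!]
  (fn [ctx]
    (let [{tx :biff.side-effect/tx :as response} (handler ctx)]
      (when (seq tx)
        (submit-tx! ctx (vec tx)))
      (dissoc response :biff.side-effect/tx))))

;; Usage with a stub handler and a stub submit function that just records
;; what it was asked to write:
(def submitted (atom []))

(def app
  (wrap-side-effects
    (fn [_ctx]
      {:status 303
       :biff.side-effect/tx [{:db/doc-type :conn/rss}]})
    (fn [_ctx tx] (swap! submitted conj tx))))

(app {})
;; => {:status 303}

@submitted
;; => [[{:db/doc-type :conn/rss}]]
```

The handler itself stays testable by inspecting its return value; only this one wrapper needs a test that involves effects.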

(off topic: just noticed I evidently have a jacob and a jacobobryant account on this site, and managed to write my two posts here from different accounts somehow…)


@jacobobryant the conditional logic is better placed in a step function as well (similar to the prepare function in my validation example above).

In your example scenario the calls to service B and service C (else branch) could both be wrapped in a try-catch that captures w as ex-data. That way you have not only the response of service A, but also every other intermediate result that may have led to an invalid request for service B or C. This was also my main motivation for the approach, since fixing a production bug without the intermediate results is often very difficult, unless you were lucky enough to have logged the data required to understand the root cause.

The prepare phase can plan all side-effects as data, so that they can be pushed to the end. Today I had a larger sequence of side-effects that I stored in a :program entry:

;; w:
{:program
  [
   [#'clj-http.client/request {:request-method :post ...}]
   ;; ...
   [#'datomic/transact! [{...}]]
  ]
 ;; ... rest of w map entries
}

In my code’s handle! equivalent I used the following function to execute the program:

(defn execute-program!
  "Executes `program`, a sequence of steps. Each step is a vector whose first
   element is the side-effecting function and whose remaining elements are its
   arguments.

   Returns a sequence of maps with the `:step` and the `:result` of each
   invocation."
  [program]
  (doall
    (map
      (fn [[f & args :as step]]
        (try
          {:step step
           :result (apply f args)}
          (catch Exception e
            ;; Include the failing step and the whole program for debugging.
            (throw (ex-info "program step failed"
                            {:f f
                             :args args
                             :program program}
                            e)))))
      program)))

I don’t know yet if execute-program! is helpful in other cases, but it was only one of many ways to precalculate as much as possible in the prepare phase, and to push the side-effects to the edges of the system.
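
To show the idea of planning effects as data in isolation, here is a small self-contained sketch. `save-user!`, `send-email!` and `add-program` are hypothetical names invented for the example, and atoms stand in for a real database and mail service.

```clojure
;; Stand-ins for real side-effecting functions.
(defn save-user! [db user] (swap! db conj user) :saved)
(defn send-email! [outbox to] (swap! outbox conj to) :sent)

(defn add-program
  "Pure step: plans the side effects as [fn & args] vectors without
   running any of them."
  [{:keys [db outbox user] :as w}]
  (assoc w :program [[#'save-user! db user]
                     [#'send-email! outbox (:email user)]]))

(def db (atom []))
(def outbox (atom []))

(def w (add-program {:db db
                     :outbox outbox
                     :user {:email "a@example.com"}}))

;; The plan can be inspected (or asserted on in a test) before anything runs:
(count (:program w)) ;; => 2
@db                  ;; => []

;; Executing the plan is a separate, explicit step:
(mapv (fn [[f & args]] (apply f args)) (:program w))
;; => [:saved :sent]

@outbox
;; => ["a@example.com"]
```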

I’m still trying to understand what your approach looks like in cases where you can’t push side effects to the end because you have business logic that depends on the results. Let me try to come up with an (extremely contrived) example:

(defn do-stuff [{:keys [foo bar baz] :as w}]
  (let [foo (str/lower-case foo)
        foo-success (update-foo! foo)]
    (if foo-success
      (let [bar (str/lower-case bar)
            bar-success (update-bar! bar)]
        (if bar-success
          "foo succeeded and bar succeeded"
          "foo succeeded but bar failed"))
      (let [baz (str/lower-case baz)
            baz-success (update-baz! baz)]
        (if baz-success
          "foo failed but baz succeeded"
          "foo failed and baz failed")))))

Here the user submits a request with foo, bar and baz params. We do some computation on foo (just lower casing it) and then call update-foo! – if that succeeds then we do the same thing for bar; if it fails, we do the same thing for baz. At the end we return a message saying which operations we attempted and whether they succeeded or failed.

So you can’t do all the pure steps first and then the side-effecting steps last because you won’t even know whether to take the bar branch or the baz branch until after the first update-foo! side effect has taken place. handle! could be broken up into multiple prepare and transaction stages, like:

(defn handle! [w]
  (-> w
      prepare-1
      transact-1!
      prepare-2
      transact-2!))

But in this particular case, if I want to maintain handle!'s structure as a fixed series of transformations (no ifs, no loop/recur, just (-> w f1 f2 f3 ...)) the only way I can think to do it is to have each subsequent prepare function check which branch it’s on:

(defn prepare-foo [w]
  (assoc w :lower-case-foo (str/lower-case (:foo w))))

(defn transact-foo! [w]
  (assoc w :foo-success (update-foo! (:lower-case-foo w))))

(defn prepare-bar-or-baz [w]
  (if (:foo-success w)
    (assoc w :lower-case-bar (str/lower-case (:bar w)))
    (assoc w :lower-case-baz (str/lower-case (:baz w)))))

(defn maybe-transact-bar! [w]
  (cond-> w
    (:lower-case-bar w) (assoc :bar-success (update-bar! (:lower-case-bar w)))))

(defn maybe-transact-baz! [w]
  (cond-> w
    (:lower-case-baz w) (assoc :baz-success (update-baz! (:lower-case-baz w)))))

(defn prepare-message [w]
  (assoc w
         :message
         (if (:foo-success w)
           (if (:bar-success w)
             "foo succeeded and bar succeeded"
             "foo succeeded but bar failed")
           (if (:baz-success w)
             "foo failed but baz succeeded"
             "foo failed and baz failed"))))

(defn handle! [w]
  (-> w
      prepare-foo
      transact-foo!
      prepare-bar-or-baz
      maybe-transact-bar!
      maybe-transact-baz!
      prepare-message))

Which seems unwieldy. e.g. in this simple example I had to duplicate the (if (:foo-success w) ...) check in both prepare-bar-or-baz and prepare-message.

If we adapted this to be more like my approach with the (case state ...) thing, the prepare-bar-or-baz function would add some data to w signaling which of the other functions should be run next, like:

(defn prepare-bar-or-baz [w]
  (if (:foo-success w)
    (assoc w
           :lower-case-bar (str/lower-case (:bar w))
           :next [transact-bar! prepare-bar-message])
    (assoc w
           :lower-case-baz (str/lower-case (:baz w))
           :next [transact-baz! prepare-baz-message])))

And then handle! would look at the :next key and run those functions next, similar to what orchestrate does.
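
A rough sketch of what such a driver could look like, under the assumption that `:next` always holds a vector of one-argument functions. `run-next` is a made-up name, and the two step functions are stubs rather than the real effectful versions.

```clojure
(defn run-next
  "Hypothetical driver: repeatedly takes the first function off :next and
   applies it to w until :next is empty. A step may assoc a new :next to
   replace whatever was still queued."
  [w]
  (loop [w w]
    (if-let [[f & more] (seq (:next w))]
      (recur (f (assoc w :next (vec more))))
      w)))

;; Stub steps for illustration (no real side effects):
(defn transact-bar! [w]
  (assoc w :bar-success true))

(defn prepare-bar-message [w]
  (assoc w :message (if (:bar-success w)
                      "foo succeeded and bar succeeded"
                      "foo succeeded but bar failed")))

(:message (run-next {:next [transact-bar! prepare-bar-message]}))
;; => "foo succeeded and bar succeeded"
```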

Anyway: again, maybe this isn’t that big of a deal since in most cases you probably can push all the side effects to the end. I just want to double check, am I understanding your approach? Is my example with prepare-foo etc above the way you would write it?

A few times in our codebase I came across such deeply nested if statements that depended on the results of impure functions. I experimented with a technique similar to yours. The step functions could add a function (or #'var) as :w/next to w. An orchestrate-like function would then call the :w/next function with the returned w as argument, which itself could assoc another :w/next. It was quite interesting since you could swap orchestrate for a debugger that would allow a developer to execute the program step by step. But in the end it felt like too much “magic”, a little bit like building an interpreter.

However, I just like to have something that captures the intermediate results, so that I can look at the data instead of simulating the program code in my brain. I think most people will feel uncomfortable at first when they pass around large maps. But the important part is to only add new entries and never modify existing ones. Otherwise it feels like global mutable state very quickly. And tools like portal make dealing with large maps a lot more comfortable.
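
One way the "only add, never overwrite" rule could be enforced mechanically is a small helper like the one below. `assoc-new` is a hypothetical name for this sketch, not something from the world lib.

```clojure
(defn assoc-new
  "Like assoc for a single key, but throws if k is already present in w."
  [w k v]
  (if (contains? w k)
    (throw (ex-info (str "Key already present: " k) {:key k :w w}))
    (assoc w k v)))

(assoc-new {:url "https://example.com"} :fixed-url "https://example.com/")
;; => {:url "https://example.com", :fixed-url "https://example.com/"}

;; (assoc-new {:url "https://example.com"} :url "other")
;; throws ExceptionInfo: Key already present: :url
```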

I’m writing from my mobile. But tomorrow or Monday I will experiment with the code examples you provided.


That makes a lot of sense. Thanks for all the explanations!


I believe this is a pattern usually described as “functional core, imperative shell”.

There are some articles in the Clojure community about this topic:

And even some libraries that attempt to generalize the concept:

As I understand it, the general problem is that, with increasingly complex use cases (e.g. conditional handling of side-effect results, conditional data-fetching, etc.), the interpreter (the function that takes the side-effect descriptions and executes them) and the side-effect-describing DSL need to become increasingly powerful to include conditions, asynchronicity, etc.


Nice–that fonda repo looks conceptually very similar to my proposal, but with async support (perhaps relatedly, I noticed the readme example is in cljs). Also independently I was thinking that “pipeline” would be a good term to use!

The two blog posts both recommend passing in side-effecting code as functions in the params which can then be mocked out during testing, which is a pretty significant difference. i.e. agreed functional-core-imperative-shell is the ideal we’re all trying to get towards; there are just differences in how exactly we go about that…

Anyway I was reading a few other articles on the topic, specifically “dependency rejection” which clarified what is IMO a benefit of the “pipeline” approach vs. using dependencies/mockable function params: the input you pass to the dependencies is “indirect output” (as he says) of your business logic. If you turn them into “direct output” (include them in the return value) then your tests can check they’re correct, whereas if you pass in a mock, your tests might not catch it if you accidentally screw up the mock’s input params.
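
As a concrete illustration of "direct output", here is a small sketch. `plan-subscriptions` is a hypothetical function, and the `:biff.side-effect/tx` key is just borrowed from my earlier example: because the transaction is part of the return value, a plain equality check covers it and no mock is involved.

```clojure
(require '[clojure.test :refer [deftest is]])

(defn plan-subscriptions
  "Pure: returns the transaction as data instead of submitting it."
  [uid feed-urls]
  {:biff.side-effect/tx
   (vec (for [url feed-urls]
          {:db/doc-type  :conn/rss
           :db.op/upsert {:conn/user    uid
                          :conn.rss/url url}}))})

(deftest tx-is-direct-output
  ;; If the business logic built the wrong document (wrong user, wrong
  ;; key, wrong url), this assertion fails.
  (is (= [{:db/doc-type  :conn/rss
           :db.op/upsert {:conn/user    1
                          :conn.rss/url "https://example.com/feed"}}]
         (:biff.side-effect/tx
          (plan-subscriptions 1 ["https://example.com/feed"])))))
```

With a mocked submit-tx that ignores its argument, the same mistake in the transaction document could go unnoticed unless the test also asserts on what the mock was called with.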

It did also occur to me that my example code in the blog post could be simplified a bit–there isn’t actually a need for the ::success state. Since the :biff.chain/tx function passes along its input value, the ::add-urls state can just be:

::add-urls
(let [feed-urls ...]
  (if (empty? feed-urls)
    ...
    {:biff.chain/queue    [:biff.chain/tx]
     :biff.chain.tx/input (vec
                           (for [url feed-urls]
                             {:db/doc-type :conn/rss
                              :db.op/upsert {:conn/user (:uid session)
                                             :conn.rss/url url}
                              :conn.rss/subscribed-at :db/now}))
     :status                     303
     :biff.response/route-name   :app.subscriptions/page
     :biff.response/route-params {:added-feeds (count feed-urls)}}))

i.e. for side effects that can be pushed to the end/where you don’t need the output, you don’t have to introduce additional states, which should help to keep functions in this style readable.

Overall I’m feeling pretty good about the pipeline approach–I’ll probably mess around with the API and then see how it goes when used throughout the rest of the app.

This is why I prefer pedestal and the interceptor model.


instead of passing in mocks of some sort during testing, you put the effectful code in some other function call and have some sort of parent orchestrator that runs the effects and passes the results into your (now pure) business logic code.

I write most of my backend code this way.

The top-level functions are orchestrating between pure and impure. They don’t make side-effects; they call other functions that do. They don’t have business logic, they call other functions that do. They only orchestrate between pure and impure.

Then my pure functions do all the business logic, and my impure functions do the side-effects.

The only functions that are a mix of pure/impure are those top-level orchestration functions. Those I might test by mocking the impure functions, to test the orchestration, or I might just have integration tests only.

If you have some pure logic that depends on impure results, it would be encoded in those top-level functions. Also, they are the only functions that are allowed to access globals (though you could make them into a Component and inject those if you cared).

(defn do-x
  [input]
  (let [queried-z (get-z (:something input) (:something-else input))
        data-z (commit db queried-z)]
    (if (should-y? data-z)
      (transform data-z)
      data-z)))
  • do-x references the global db stateful resource.
  • do-x orchestrates between pure and impure, but has no logic apart from the branching conditions.
  • do-x does not contain the rules themselves; the pure should-y? function implements the rules for when to transform the data.
  • get-z is a pure function: it returns a command that describes how to get z, but does not actually perform the side-effect.
  • commit uses that command to apply the side-effect and actually fetch the z data from the DB.
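
A sketch of how testing the orchestration by mocking the impure function could look, using `with-redefs`. The bodies of `get-z`, `commit`, `should-y?` and `transform` are placeholders invented for this example; only the shape of `do-x` is taken from above.

```clojure
(require '[clojure.test :refer [deftest is]])

(def db ::db) ; stand-in for the global stateful resource

;; Placeholder implementations (assumptions for the example):
(defn get-z [a b] {:query [:z a b]})        ; pure: describes the fetch
(defn commit [_db command]                  ; impure in real code
  (throw (ex-info "would hit the real DB" {:command command})))
(defn should-y? [data] (:needs-y data))     ; pure rule
(defn transform [data] (assoc data :transformed true)) ; pure

(defn do-x
  [input]
  (let [queried-z (get-z (:something input) (:something-else input))
        data-z (commit db queried-z)]
    (if (should-y? data-z)
      (transform data-z)
      data-z)))

(deftest do-x-orchestration
  ;; Only the impure function is replaced; the pure ones run for real.
  (with-redefs [commit (fn [_db command]
                         {:needs-y true :command command})]
    (is (= {:needs-y     true
            :command     {:query [:z 1 2]}
            :transformed true}
           (do-x {:something 1 :something-else 2}))))
  (with-redefs [commit (fn [_db _command] {:needs-y false})]
    (is (= {:needs-y false}
           (do-x {:something 1 :something-else 2})))))
```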

With your example function, I guess it would look like this:

;; Forward declaration: handle-add-rss-sub is defined at the bottom.
(declare handle-add-rss-sub)

(def rss-route
  ["/dev/subscriptions/add/rss"
   {:name :app.subscriptions.add/rss
    :post (fn [{{:keys [uid]} :session
                {:keys [url]} :params
                :as ctx}]
            (handle-add-rss-sub ctx url uid))}])

(defn extract-feed-urls-from
  [http-response url]
  (->> (lib.rss/parse-urls (assoc http-response :url url))
       (mapv :url)
       (take 20)
       vec))

(defn retrieve-feed-urls
  [url]
  {:ok :feed-urls-retrieved
   :url url
   :options {"User-Agent" "https://yakread.com"}
   :query #(extract-feed-urls-from % url)})

(defn commit-feed-urls-retrieved
  [{:keys [url options query]}]
  (-> url (http/get options) query))

(defn fix-and-retrieve-feed-urls
  [unfixed-url]
  (-> unfixed-url
      (lib.rss/fix-url)
      (retrieve-feed-urls)))

(defn add-rss-sub
  [feed-urls uid]
  (if (empty? feed-urls)
    {:error :invalid-rss-feed
     :msg "invalid-rss-feed"}
    {:ok :rss-sub-added
     :query (for [url feed-urls]
              {:db/doc-type :conn/rss
               :db.op/upsert {:conn/user uid
                              :conn.rss/url url}
               :conn.rss/subscribed-at :db/now})
     :added-count (count feed-urls)}))

(defn commit-rss-sub-added
  [ctx {:keys [query]}]
  (biff/submit-tx ctx query))

(defn handle-add-rss-sub
  [ctx url uid]
  (let [feed-urls-retrieved (fix-and-retrieve-feed-urls url)
        feed-urls (commit-feed-urls-retrieved feed-urls-retrieved)
        rss-sub-added (add-rss-sub feed-urls uid)]
    (if (:ok rss-sub-added)
      (do (commit-rss-sub-added ctx rss-sub-added)
          {:status 303
           :biff.response/route-name :app.subscriptions/page
           :biff.response/route-params {:added-feeds (:added-count rss-sub-added)}})
      {:status 303
       :biff.response/route-name :app.subscriptions.add/page
       :biff.response/route-params {:error (:msg rss-sub-added)}})))