When is it appropriate to do RDD instead of TDD?

REPL-Driven Development is touted as a superpower of Clojure and is often referenced by famous Clojurians like Stuart Halloway and, to some degree, Rich Hickey. It's usually brought up as an alternative to the verbosity, tedium, and false security of testing. However, if you were to skip testing in favor of RDD, you might see major improvements in development speed at the cost of losing communicability, regression tests, CI/CD, and tests-as-documentation, which are all reasons I lean heavily on TDD with my teams.

If you do RDD, when/why do you choose that over TDD? Are your projects small, or so independent you don’t need to worry about sharing code with others? Or do you have a cool way of blending RDD with TDD?

2 Likes

I blend it. While I am exploring what my function should be, I use the REPL, mainly via a (comment ...) form under the function I am developing. Then, when I start to see what it should be like, I put tests into the :test entry of the attr-map of the defn form and chase the edge cases out using TDD, using both the REPL and the test runner in my IDE. When I'm done with the function I consider moving some of the tests to a test namespace (and I often decide against that and just leave them all in there in the attr-map).

4 Likes

Thanks for your answer! I have some questions.

  • Is (comment) the same as #_?
  • I’ve not heard of that :test entry of attr-map. How does that work, and is it somehow observed by some IDEs (in my case I would hope for Cider)?

I like the iterative approach you’ve mentioned; definitely RDD into regression tests.

I use (comment ,,,) for lists of expressions that I can evaluate from my editor – aka “Rich Comment Forms” (coined by Stu Halloway, I believe, because Rich Hickey uses this technique to have “dev/test” expressions saved in a source file – you can find it in some part of the Clojure source itself).

A comment form can contain multiple pieces of code and overall it “evaluates” to nil (without executing any of the code). The #_ reader macro causes a single expression to be completely ignored.

[1 2 3 (comment 4) 5 #_6 7]
;;=> [1 2 3 nil 5 7]
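
One extra wrinkle, not from the post above but worth knowing: #_ can be stacked, with each occurrence discarding one more of the following forms.

[1 #_#_2 3 4]
;;=> [1 4]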

I use a mixture of RDD and TDD. I’ll use TDD when I have a requirements spec for a new feature that lends itself easily to writing tests, e.g., a new API endpoint for our REST service. I’ll write (failing) tests for all the error handling cases and several happy path cases, and implement code to make the tests pass, one or two tests at a time. I’ll use RDD most of the time (even when I’m also doing TDD) using comment forms to explore data structures and transforms, as I evolve code to the point it does what I need.

Some of those comment forms will also become actual tests so that we have protection against regressions.

Our code base runs about 95K lines right now. 21K of those are tests of various sorts (some UAT-style, some unit, some integration, some generative and/or property-based).

6 Likes

If you look at the docs for defn you’ll see that you can put a map between the doc string and the argument vector. And if you look at the implementation of the deftest macro, you’ll see that it puts the tests into the :test key of this map. Then that is where the test runner finds them. Which means you can put them there yourself w/o deftest, if you fancy. :smile:

It can look a bit like so:

(require '[clojure.test :refer [is]]) ; the `is` assertions below need this

(defn- divizable?
  "Is `n` divisible by `d`?"
  {:test (fn []
           (is (divizable? 49 7))
           (is (not (divizable? 48 7))))}
  [n d]
  (zero? (mod n d)))

(comment
  (divizable? 5 5)
  ;; => true
  (divizable? 1 0)
  ;; => Execution error (ArithmeticException) at fizzy/divizable? (anagram.clj:22).
  ;;    Divide by zero
  )

(defn fizz-buzz
  "FizzBuzz it"
  {:test (fn []
           (is (= 4
                  (fizz-buzz 4)))
           (is (= ["Buzz" 11 "Fizz" "FizzBuzz"]
                  (map fizz-buzz [10 11 12 15])))
           (is (= [1 2 "Fizz"]
                  (fizz-buzz 1 4)))
           (is (= "FizzBuzz"
                  (nth (fizz-buzz) (dec 90))))
           (is (= 100 (count (fizz-buzz)))))}
  ([n]
   (cond
     (divizable? n (* 3 5)) "FizzBuzz"
     (divizable? n 3) "Fizz"
     (divizable? n 5) "Buzz"
     :else n))
  ([start end]
   (map fizz-buzz (range start end)))
  ([]
   (fizz-buzz 1 101)))

(comment
  (fizz-buzz 4)
  ;; => 4
  (fizz-buzz 1 15)
  ;; => (1 2 "Fizz" 4 "Buzz" "Fizz" 7 8 "Fizz" "Buzz" 11 "Fizz" 13 14)
  (fizz-buzz))

Using Calva, I can have the cursor anywhere inside the defn form and Run Current Test, and it will run these tests. I'm pretty sure CIDER has a similar feature. In Calva the default keybinding lets me hold down the ctrl and alt keys and then press c, t to execute this. This makes it very quick to TDD my way towards an implementation. And since the test functions take no arguments, I can also put the cursor adjacent to an (is ...) form and Evaluate Current Sexpr/Form to run a particular assertion.

They are quite different, actually. (comment <anything>) evaluates to nil, while the reader totally skips forms tagged with #_. It matters sometimes.

Also, with Calva, the "Rich Comment Forms" that @seancorfield mentions have special support: all Evaluate Top Level … commands regard comment as creating a new top-level context. So I can put the cursor anywhere within the comment-enclosed forms and execute Evaluate Top Level Form (aka defun), which makes it super quick to run small experiments against my functions. (Again, I am pretty sure CIDER does similar things.) The result line comments in the example above were created using Evaluate Top Level Form (aka defun) to Comment, btw.

I also often leave some of the comment experiments in the code. And I often thank myself later for having done so. :smile:

5 Likes

Had never heard the acronym RDD before, I love it!

Regression protection and tests-as-documentation are why I use TDD. Well, actually, I never use TDD in the sense of writing my tests first. I will RDD, and then I will go and write some tests to protect against regressions and/or to help document a function.

The way I see it, when it comes to writing bug-free, well-factored code, RDD is like TDD on steroids. In the same time span, I will have subjected my code to way, way more testing than I would have in a typical TDD flow. Similarly, I can try different ways to structure and factor the code much more quickly; RDD really helps with exploration here, even more so than TDD, because the feedback is so much quicker.

And when I RDD, I'm not just doing unit testing; I'm doing functional, performance, security, integration, and all sorts of other types of testing and validation. All of that, very quickly.

TDD is good for all these things too, but RDD is just better.

Where RDD falls short, you've already nailed it. So that's where I'll complement it. No easy rule here; I always just use my judgement.

7 Likes

If you look at the docs for defn you'll see that you can put a map between the doc string and the argument vector. And if you look at the implementation of the deftest macro, you'll see that it puts the tests into the :test key of this map. Then that is where the test runner finds them. Which means you can put them there yourself w/o deftest, if you fancy.

FWIW, people can use with-test too.
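
For anyone curious, a minimal sketch of what that looks like (leap-year? is just an example function, not something from this thread):

(require '[clojure.test :refer [with-test is run-tests]])

;; with-test attaches the trailing assertions as the var's :test
;; metadata, so test runners find them just like the attr-map version.
(with-test
  (defn leap-year? [y]
    (or (zero? (mod y 400))
        (and (zero? (mod y 4))
             (pos? (mod y 100)))))
  (is (leap-year? 2000))
  (is (leap-year? 2024))
  (is (not (leap-year? 1900))))

A plain (run-tests) will then pick these up alongside any deftest in the namespace.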

1 Like

Do you have a video or a screencast of your RDD flow? I use RDD for experimenting with different design ideas, and I try to incorporate RDD more into my development flow, but so far I have found nothing that delivers faster feedback than TDD with a continuous test runner: just by saving the file I edited, I get feedback from dozens of tests for the unit I am currently developing, and from hundreds of tests telling me whether the unit fits into the rest of the system, all in less than a second. Since my tests are all self-contained with regard to state and setup, I find their feedback more reliable than that from a stateful REPL session, where I have to manually manage the state of my definitions, and because of that I get more false-positive and false-negative feedback when I do RDD.

1 Like

https://github.com/cognitect-labs/transcriptor ("Convert REPL interactions into example-based tests") - Stuart Halloway's library to complement the REPL-driven-development technique.
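
Going by the README, the flow is roughly: save a REPL session into a .repl file, add check! assertions (check! validates *1, the most recent result, against a spec), and replay the file as a test. A sketch, with a hypothetical file name:

(require '[cognitect.transcriptor :as xr :refer [check!]])

;; contents of repl/fizz-buzz.repl -- an ordinary REPL transcript:
;;   (fizz-buzz 15)
;;   (check! #{"FizzBuzz"})   ; specs the previous value, *1
;;   (fizz-buzz 4)
;;   (check! int?)

;; replay every .repl file under a directory:
(comment
  (doseq [f (xr/repl-files "repl")]
    (xr/run f)))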

1 Like

No, I don't have one, unfortunately.

Can I ask, what do you consider a unit? And the way you talk about it, it sounds like you're running integration tests in your TDD workflow as well? To test units together?

Normally, my functions are either pure, or all they do is I/O. So I'll write the function and, as I write it, I'll try calling it with various inputs or partially evaluating parts of it, confirming assumptions. As I get closer to the final form, I'll try a few edge cases I can think of, fix or iterate on any issues found through that, and finally go and add some tests for it.

Since it's a pure function, there isn't any point in the tests ever re-running unless I go and change it some more.

If it's a function that does I/O, I'll do basically the same thing, but the tests I write at the end will be integration tests instead. I split my unit tests and my integration tests, because I don't run them in the same environments.

The last type of function I write I call orchestrators. They're normally the topmost functions for any use case. All they do is orchestrate the order of calls and the flow of data between the other functions (the I/O ones and the pure ones).

I use a similar approach for those as well, though when I get to writing tests, I will sometimes write both integration and functional tests. The latter is what I call it when you mock state and I/O but still have the units inside your app call each other.

I can see a continuous test runner maybe being useful if I were doing a lot of refactoring and moving things around, but still, when you refactor, it's also normal for the tests to break. So I would probably only have the continuous test runner running the integration and functional tests, assuming I'm not moving the orchestrator functions themselves.

If you had a video of your TDD workflow, I’d be interested in seeing that as well.

2 Likes

Can I ask, what do you consider a unit?

By unit I refer to a part of the complete system. There is no restriction on size. Depending on the unit's level of abstraction, it can comprise multiple functions across multiple namespaces (often with a single function acting as an API) or a single function in a single namespace. Instead of "unit" I maybe should have used the better-known term "subject under test": https://github.com/testdouble/contributing-tests/wiki/Subject

And the way you talk about it, it sounds like you're running integration tests in your TDD workflow as well? To test units together?

Yes. And typically, this would lead to slower feedback cycles. To counter that, I have up to three test runners running in parallel with different test selectors (a sketch of what the tagged tests can look like follows the list):

  • ^:focus for the few tests that are predominantly relevant to the part I am developing or changing, i.e. unit and integration tests for the subject under test. Usually they complete faster than I can notice.
  • ^:unit for all unit tests of the complete system. They should also complete faster than I can notice.
  • ^:integration for all integration tests that are not super slow. Even without the slow tests, this set of tests completes with a noticeable delay.
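
A minimal sketch of what the tagged tests can look like (the test names and the functions under test are hypothetical; hooking the keywords up to a runner is tool-specific, e.g. Leiningen's :test-selectors map):

(require '[clojure.test :refer [deftest is]])

;; The metadata on the deftest var is what the selectors match on.
(deftest ^:focus ^:unit parses-empty-input
  (is (= [] (parse-lines ""))))                 ; parse-lines: hypothetical SUT

(deftest ^:integration round-trips-through-store
  (is (= {:id 1} (save-and-fetch! {:id 1}))))   ; hypothetical I/O test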

This way, I get the most relevant feedback close to instantly, and close-to-complete feedback with a small delay. I do not wait for that delayed feedback; instead, a system notification informs me if I broke something that I did not expect.

Normally, my functions are either pure, or all they do is I/O. So I'll write the function and, as I write it, I'll try calling it with various inputs or partially evaluating parts of it, confirming assumptions. As I get closer to the final form, I'll try a few edge cases I can think of, fix or iterate on any issues found through that, and finally go and add some tests for it.

Since it's a pure function, there isn't any point in the tests ever re-running unless I go and change it some more.

That is similar to how I work, although I switch pretty quickly from a REPL workflow to a TDD workflow, so I can hardly call it RDD. When I work in the REPL, I need to constantly re-evaluate previous inputs, because I cannot be sure that the current state of the function still works for the input data from a couple of seconds or minutes ago. Because I am too lazy to scroll through the REPL history or move the cursor to the comment block and re-evaluate manually, I let the test runner re-evaluate all my assumptions constantly. Depending on the output of a function, I would also rather let the test runner compare the output with my assumptions than do it manually by looking at it. Is cognitect-labs/transcriptor the canonical Clojure solution to the problems I have with RDD?

I can see a continuous test runner maybe being useful if I were doing a lot of refactoring and moving things around, but still, when you refactor, it's also normal for the tests to break.

This is not necessarily true; it depends entirely on the approach to refactoring and on the way the tests are written (see, for example, the demo below). But that would be a topic of its own.

If you had a video of your TDD workflow, I’d be interested in seeing that as well.

I have a demo in Clojure of the specific way I approach TDD at https://www.tddstpau.li/. Since it shows how I solved the Diamond Kata for the 20th time and is optimized to demonstrate how small TDD steps can be made, it is a rather contrived example, but it hopefully gets the idea across.

3 Likes

Yes, sometimes this is an issue, but for me, form and function come together at the same time. It's rare that the signature of my function and its semantics are settled before the functionality is working and all edge cases have been addressed.

When I go the extra mile and write tests right away, instead of just going through the REPL history, I sometimes face the opposite problem: my tests constantly break, and that's normal, because I'm also constantly changing the interface signature and semantics.

That's why I end up not doing that, and instead wait until I'm done writing the function, and then go and write the tests for it.

I can imagine other people approaching the design of the function and its implementation sequentially, but every time I tried to do it like that I failed. For me they are co-dependent: a change in the design can change my implementation, and an issue in the implementation can make me reconsider my design.

That is some madman TDD :yum:. So do you first identify the functions you intend to work on, tag their tests with ^:focus, and then go work on them?

Can I also ask what test runner you’re using for this?

I guess we differ here as well. This goes back to what I said before: since I modify the shape of the output a lot as I write the function itself, I find it faster to assert manually than to keep adapting new tests over and over.

I guess for me, when I used to do that, I realized I'd get lazier about the function design and factoring. I wouldn't try as many things in that space, because of the burden of modifying the tests over and over.

I don't think I agree with this. Move a function to a different namespace and your tests are broken; change the function name and they are broken. If you factor out parts of your function into smaller functions but keep the original function, that's fine; if you do so and get rid of the original function, your tests are broken again. Refactor any of the semantics, like choosing to throw instead of returning nil or vice versa: your tests are broken. Decide to take a map instead of positional args: tests are broken.

Basically, any non-backward-compatible refactor should break your tests; I mean, that's kind of the point of the tests in the first place.

The only tests that are immune are behavioral tests: tests that use the same interface given to users in order to exercise and assert behavior. Well, unless you go and decide to change that as well, though some people wouldn't count that as a refactor, assuming refactoring means any change that doesn't change behavior.

This looks pretty cool. You should let me know what test runner you're using; I might be interested in trying it out.

I think that, fundamentally, TDD and RDD are not mutually exclusive and can be complementary. I personally don't write my tests first, but I will sometimes start to introduce tests once I'm at the point where I feel confident about the interface structure and semantics, even if I'm not done implementing. Especially in scenarios like you described, where I see myself manually re-running exactly the same inputs over and over. I've never combined that with an on-save test runner; I normally just trigger tests through CIDER instead.

Watching your demo, I didn't see you eval anything, ever. Do you never feel the need to try out some of the pieces you choose to use on their own, just to validate assumptions about them? Say you can't remember which argument of the reducing function passed to reduce is the accumulator, or what happens when you map over nil?

2 Likes

Thanks for sharing, @lomin! I really enjoyed watching the video. I suspect it might be interesting for others to see as well, both how you work to always keep your code under test and your paredit usage. I suspect that a good way to learn to work with paredit could actually be to practice your motions …

2 Likes

This goes back to what I said before: since I modify the shape of the output a lot as I write the function itself, I find it faster to assert manually than to keep adapting new tests over and over.

I guess for me, when I used to do that, I realized I'd get lazier about the function design and factoring. I wouldn't try as many things in that space, because of the burden of modifying the tests over and over.

[…]

I think that, fundamentally, TDD and RDD are not mutually exclusive and can be complementary.

I agree, and I now have a better heuristic for the situations in which RDD might lead to shorter feedback cycles than TDD. Thanks.

Move a function to a different namespace and your tests are broken; change the function name and they are broken. If you factor out parts of your function into smaller functions but keep the original function, that's fine; if you do so and get rid of the original function, your tests are broken again. Refactor any of the semantics, like choosing to throw instead of returning nil or vice versa: your tests are broken. Decide to take a map instead of positional args: tests are broken.

That is not necessarily so. I, for example, never break tests, because I want to always be in a deployable state. If even a single test is red or broken, I cannot deploy.

If I want to move a function to a different namespace, there are two options:

  1. The function is an implementation detail and only tested implicitly by tests for other functions. Then the move is no problem and I do not break any tests.
  2. The function is tested explicitly. Then I first duplicate the function in the other namespace and change the tests to point at it. After that, I remove the function from the original namespace. No tests are broken during that process.

In the same way, renaming a function does not necessarily break tests either. With Cursive, the IDE can do it automatically and instantly for you.

You didn't say that, and I'm rambling a little, but relatedly: there is this opinion on the interwebs that you cannot make big refactorings in large systems written in a dynamically typed language. I don't know where this comes from, because we did it all the time and had no problems at all. Deciding on a map instead of positional args for a key component at the heart of a system was exactly part of one of these refactorings. Again, we did that iteratively and incrementally without ever breaking a single test, developing other parts of the system in parallel to that refactoring (no branches), and still deploying on every commit dozens of times a day. My intention is not to talk about how great we were, but to provide a counterexample to the claims that not breaking tests is impossible or, relatedly, that big refactorings in large systems written in a dynamically typed language are impossible. I do not intend to add anything more to that than this counterexample from my last Clojure project.

This looks pretty cool. You should let me know what test runner you are using. I might be interested in trying it out.

Thanks. I use test-refresh: https://github.com/jakemcc/test-refresh (it refreshes and reruns clojure.test tests in your project).
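
For anyone wanting to try it, a hedged sketch of the Leiningen wiring (coordinates and option names as I recall them from the README; check the repo before copying):

;; project.clj fragment
:plugins [[com.jakemcc/lein-test-refresh "0.25.0"]]
:test-refresh {:changes-only   true                       ; rerun only affected tests
               :notify-command ["notify-send" "tests"]}   ; system notification on failure

;; then run: lein test-refresh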

When looking at your demo, I didn’t see you eval anything ever. You never feel the need to try out some of the pieces you choose to use on their own, just to validate some assumptions about them, like say you can’t remember which arg of the reduce-fn passed to reduce is the accumulator? Or what happens if you map over nil?

I did not use the REPL in this demo because, one, I have solved the kata a dozen times already and, two, this is not a Clojure demo but a TDD demo (which just happens to be written in Clojure). In a more realistic setting, I use the REPL for all the use cases you mentioned.

2 Likes

When it is appropriate to do RDD: always.
Instead of TDD: never.

The thing is, if you want your code to be free of regression bugs, you need to use TDD (or BDD, which is what I prefer). With RDD, though, your cycle can be smaller: write a test that will fail, then switch to RDD (eval the test, see it fail, implement part of it, evaluate, see it fail, and so on, until evaluating the test passes).

Some test tools can help you here: nubank/matcher-combinators prints diffs and colors when an assertion fails (and also eases assertions on maps and vectors, especially when there's dynamic data like an auto-generated UUID, for example). Then RDD is even easier as a sub-process of TDD, because the matcher will even tell you what's wrong.
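
A small sketch of what that looks like in practice (the response maps here are made up; assumes matcher-combinators is on the classpath):

(require '[clojure.test :refer [deftest is]]
         '[matcher-combinators.test])    ; extends `is` with the match? assertion

(deftest created-user-shape
  (is (match? {:status 201
               :body   {:id   uuid?     ; matches any uuid, e.g. an auto-generated one
                        :name "Ada"}}
              {:status 201
               :body   {:id (random-uuid) :name "Ada" :roles #{:user}}})))

Map matching is subset-based by default, so the extra :roles key is fine, and a failing assertion prints a colorized diff.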

I mentioned BDD because I think the BDD cycle is easier: write a full integration test, see it fail, write smaller tests until you’re comfortable going to the implementation. But again, RDD is only a subprocess of everything for me :slight_smile:

2 Likes

I think that no one is suggesting to skip all testing. It’s example based testing that is problematic.

I’ll divert a bit now.

A necessary disclaimer: what follows is a rant and probably borderline provocation. It might revive a long-forgotten thread. If you have just eaten, it's better not to read it, because it might revive something else.

One of the things I've learned by using Clojure is that it's good to be opinionated. I also know that sometimes it's better to keep your opinion to yourself. And what stands for you stands for me too - but I don't want to divert so much as to talk about #metoo. It's not that I don't approve or anything … It'll very soon be clear, if it isn't already, that it's Anne Elk's way of expressing an opinion that I admire, so I might as well state it. It's Anne Elk's way of stating an opinion or theory that I admire. With that out of the way, I might as well say it. The opinion, that is; it is mine and not Anne Elk's.

When you spot nonsense, you say it's nonsense. Inevitably this involves a bit of finger-pointing, and it might hurt some feelings. It might even be well-intended or elaborate nonsense, but nonetheless nonsense is still nonsense - a nonsensical paraphrase of Shakespeare. And here's another - to write it or not to write it. That's where my knowledge of Shakespeare ends, and it's better to go back to the subject I don't know either.

TDD is nonsense. Actually, all xDDs are - TDD and any of its siblings: BDD, ATDD, and now also the newly coined RDD, to name a few. It's dangerous that TDD is regarded as a best practice for development when it could not be further from one.

There are so many Ds in xDD that at least one of them is bound to stand for danger. I believe they all do (even the x - after all, it is a driver). Saying that development is driven by x seems like saying that x is the solution to making development easy. There are extra points for whoever can come up with an x that's really easy to do, because xDD boils down to: do x and everything will be fine. It's funny, though, that xDD originated in the agile world, which values "Individuals and interactions over processes and tools".

Development is a simple, two-part process consisting of thinking (problem solving) and typing (code writing), where typing is always a consequence of thinking. Less thinking normally leads to more typing. Typing is easy; thinking is hard. The catch-22 of xDD is that the x that would make development easy has to be about thinking rather than typing, but making x about thinking makes xDD hard. When x is about typing, xDD is easy, but it drives typing rather than development. By driving typing, more code than necessary is written and therefore more bugs are introduced. So xDDs are BBD - buggy by design.

There's an ever-present urge to separate the thinking and the typing, and in this sense agile is actually the new waterfall. In waterfall the thinking is done by a designated group of intellectually superior individuals not wanting to get their fingers dirty and full of various viruses by touching the keyboard, and therefore they rightly delegate typing to - as they call them - a bunch of easily replaceable overpaid underachievers unable to follow the simple and clear instructions provided, always turning what would be the next big thing into another lost opportunity, leaving us with the only viable option of outsourcing the work to others next time. In agile the thinking is done by no one, because it hurts, for no one can really predict the future and we all actually start in the same boat on a calm sea, until lo and behold, totally unexpectedly and for no reason at all, all hell breaks loose and the sea starts behaving unpredictably, sending waves towards us and what not, and then we need to react quickly and swiftly if we are to survive, because we never predicted the water would turn so bad and the boat would leak - I mean, how could we? We don't do prediction exactly because water behaves so unpredictably - luckily we're not on a river, so it's unlikely we're heading towards a waterfall.

I believe that thinking and typing have to be done by the same person, with at least some of the thinking ideally done away from the keyboard. When you're at the keyboard trying to write production-ready code, you're in execution mode. Without up-front thinking that has helped you form some kind of plan, you simply cannot execute, and it's only by accident that you'll end up with production-ready code.

Obviously I'm not fond of TDD. I'm not even fond of RDD, unless its intended use is to help the developer explore and/or learn about unknown territory by writing throw-away code. What I normally do, though, is UTDD. It doesn't stand for ultimate test-driven development. Surprisingly, it doesn't even stand for up-front-thinking-driven development, or for up-front thinking delegated downstream. It simply stands for up-front thinking and then don't do.

The danger of TDD is that it leads to conversations about problems that are easily solvable, when it would be better to have conversations about hard problems (like design, thinking, and communication). And if there is already a generation of developers raised on TDD, then sooner or later the hard problems will disappear, until they are rediscovered and a new xDD is introduced.

3 Likes

I respectfully disagree with your rejection of TDD – respectfully, because your thoughts were thought-provoking and valuable. I have found TDD to be highly valuable for communicating work and progress, and for actually serving as the mode of conversation referred to in your remarks. I also have the sense that my TDD might be different from what you describe when mentioning disrespect for those who are getting their hands dirty; the most DD part of my TDD is the use of use-case (story) tests to outline desired behavior before we are in a position to write any unit tests, e.g., "users should be able to review previously submitted applications." Eventually many of our story tests get refactored into unit tests once we've settled on an implementation plan, but they can be the key facilitator of conversations.

You mention the desire for conversations about hard problems – I’d like to hear more about this. It sounds like it comes from your experience; can you say something more about just what sort of conversations you’ve seen xDD bypassing?

I'm glad you've found a process that works for you. It seems closer to ATDD (acceptance-test-driven development) than TDD. Not that it's particularly important in the context of what works for you, but when we talk about TDD, and you talk about your TDD, and I talk about my TDD, do we really talk about TDD? Is that conversation, or just talking? How to communicate effectively is one such hard problem, at least to me, and a better conversational topic among (professional) programmers than how to write code.

It's important to have time and place available for planning in your process. You clearly have that, and I'd say it's tremendous, because planning is in the thinking rather than the typing part of programming. I'd like to know more about the implementation planning you do. Who's involved in planning and who's involved in implementation? What happens if, during implementation, the plan is found to be nonsense? How do you prepare a plan? Do you follow certain steps, or is it improvisation? Could you also shed some light on refactoring story tests into unit tests?

2 Likes

I'm assuming that anything but a green light is bad feedback. Given bad feedback, how do you establish where the fault lies? Do you ever consider the possibility that the tests themselves are faulty? And why is the speed of feedback important, other than that combining speed with driving could be fun?

You were kind enough to provide code, and I'll be rude enough to scrutinize it. I wouldn't bother if the code didn't come from a source that could be regarded as an authority. The website is polished, but the code is not. It's not production-ready.

The problem I see is that the tests are coupled to the implementation. There are tests that don't call your SUT, so a change of implementation could result in a change of tests, making the code brittle. What would you say are the advantages of such an approach?

You mentioned that you've implemented the diamond kata many times. What's the goal with each new implementation? Do you try to remember previous implementations, or forget them with each go? I suppose you start a new project for each new implementation. Have you considered starting a new implementation in an existing project? I believe you could still do that in TDD fashion by following the RGR (red-green-refactor) mantra.

The only fair thing to do now is for me to also provide code that solves the diamond kata, so that you can return the compliment if you so wish. I don't claim it's ideal or that there's no other way the problem could be solved, but I believe it's production-ready (and I don't say that only because I think no one would ever run this in production, though that certainly helps). There is no video, because it would be boring to watch. The code is available at https://github.com/doubleelbow/diamond-kata.

2 Likes

It's important to have time and place available for planning in your process. You clearly have that, and I'd say it's tremendous, because planning is in the thinking rather than the typing part of programming. I'd like to know more about the implementation planning you do. Who's involved in planning and who's involved in implementation? What happens if, during implementation, the plan is found to be nonsense? How do you prepare a plan? Do you follow certain steps, or is it improvisation? Could you also shed some light on refactoring story tests into unit tests?

Yes, those are some of the reasons our testing praxis has been effective. Working at a university, my teams usually consist of me at the top (I have the chief technical design responsibilities) and student employees doing implementation; in this arrangement, I produce the use-case tests and they write the unit tests and associated functions. Our projects have two branches under the tests directory: unit and use-case/stories. When a project is beginning, there are no unit tests, and it's my responsibility to write out stories that fully specify program interaction from the perspective of the different roles (e.g. "applicants" and "admin"); these stories are highly verbose and start out looking like this:

(ns usecase.stories.admin
  "General process of administration of the system"
  (:require [clojure.test :refer [deftest is testing]]))

(deftest manage-users
  (testing "Admin can CRUD users"
    (is false)) ; deliberate placeholders: each story stays red
  (testing "Non-admin CANNOT CRUD users"
    (is false)) ; until the units that realize it are written
  (testing "Admin reset user passwords"
    (is false)))

In order to make these tests pass, functions and unit tests will need to be written. Note that CIDER's auto-testing does not work on these because, as story tests, they don't line up 1:1 with particular namespaces in the code. In some cases, the story tests are dissolved once they've been realized as units, since they have served their purpose of guiding development. Often, though, they remain as the more documentation-like part of our test base, since their original role was to describe [desired] usage.