When is it appropriate to do RDD instead of TDD?

REPL-Driven Development is touted as a super-power of Clojure and is often referenced by famous Clojurians like Stuart Halloway and, to some degree, Rich Hickey. It’s usually brought up as an alternative to the verbosity, tedium, and false security of testing. However, if you were to skip testing in favor of RDD, you might see major improvements in development speed at the cost of losing communicability, regression tests, CI/CD, and tests-as-documentation, which are all reasons I lean heavily on TDD with my teams.

If you do RDD, when/why do you choose that over TDD? Are your projects small, or so independent you don’t need to worry about sharing code with others? Or do you have a cool way of blending RDD with TDD?

1 Like

I blend it. While I am exploring what my function should be, I use the REPL, mainly via a (comment ...) form under the function I am developing. Then, when I start to see what it should look like, I put tests into the :test entry of the attr-map of the defn form and chase the edge cases out using TDD, using both the REPL and the test runner in my IDE. When I’m done with the function I consider moving some of the tests to a test namespace (and I often decide against that and just leave them all in there in the attr-map).

3 Likes

Thanks for your answer! I have some questions.

  • Is (comment) the same as #_?
  • I’ve not heard of that :test entry of attr-map. How does that work, and is it somehow observed by some IDEs (in my case I would hope for Cider)?

I like the iterative approach you’ve mentioned; definitely RDD into regression tests.

I use (comment ,,,) for lists of expressions that I can evaluate from my editor – aka “Rich Comment Forms” (coined by Stu Halloway, I believe, because Rich Hickey uses this technique to have “dev/test” expressions saved in a source file – you can find it in some part of the Clojure source itself).

A comment form can contain multiple pieces of code and overall it “evaluates” to nil (without executing any of the code). The #_ reader macro causes a single expression to be completely ignored.

[1 2 3 (comment 4) 5 #_6 7]
;;=> [1 2 3 nil 5 7]

I use a mixture of RDD and TDD. I’ll use TDD when I have a requirements spec for a new feature that lends itself easily to writing tests, e.g., a new API endpoint for our REST service. I’ll write (failing) tests for all the error handling cases and several happy path cases, and implement code to make the tests pass, one or two tests at a time. I’ll use RDD most of the time (even when I’m also doing TDD) using comment forms to explore data structures and transforms, as I evolve code to the point it does what I need.

Some of those comment forms will also become actual tests so that we have protection against regressions.

Our code base runs about 95K lines right now. 21K of those are tests of various sorts (some UAT-style, some unit, some integration, some generative and/or property-based).

2 Likes

If you look at the docs for defn you’ll see that you can put a map between the doc string and the argument vector. And if you look at the implementation of the deftest macro, you’ll see that it puts the tests into the :test key of this map. Then that is where the test runner finds them. Which means you can put them there yourself w/o deftest, if you fancy. :smile:

It can look a bit like so:

;; assumes (:require [clojure.test :refer [is]]) in the ns form
(defn- divizable?
  "Is `n` divisible by `d`?"
  {:test (fn []
           (is (divizable? 49 7))
           (is (not (divizable? 48 7))))}
  [n d]
  (zero? (mod n d)))

(comment
  (divizable? 5 5)
  ;; => true
  (divizable? 1 0)
  ;; => Execution error (ArithmeticException) at fizzy/divizable? (anagram.clj:22).
  ;;    Divide by zero
  )

(defn fizz-buzz
  "FizzBuzz it"
  {:test (fn []
           (is (= 4
                  (fizz-buzz 4)))
           (is (= ["Buzz" 11 "Fizz" "FizzBuzz"]
                  (map fizz-buzz [10 11 12 15])))
           (is (= [1 2 "Fizz"]
                  (fizz-buzz 1 4)))
           (is (= "FizzBuzz"
                  (nth (fizz-buzz) (dec 90))))
           (is (= 100 (count (fizz-buzz)))))}
  ([n]
   (cond
     (divizable? n (* 3 5)) "FizzBuzz"
     (divizable? n (* 3)) "Fizz"
     (divizable? n (* 5)) "Buzz"
     :else n))
  ([start end]
   (map fizz-buzz (range start end)))
  ([]
   (fizz-buzz 1 101)))

(comment
  (fizz-buzz 4)
  ;; => 4
  (fizz-buzz 1 15)
  ;; => (1 2 "Fizz" 4 "Buzz" "Fizz" 7 8 "Fizz" "Buzz" 11 "Fizz" 13 14)
  (fizz-buzz))

Using Calva, I can have the cursor anywhere inside the defn form and Run Current Test and it will run these tests. I’m pretty sure CIDER has a similar feature. In Calva the default keybinding allows me to hold the ctrl and alt keys down, and then press c, t to execute this. This makes it very quick to TDD my way towards an implementation. And since tests take no arguments I can also put the cursor adjacent to the (is ...) form and Evaluate Current Sexpr/Form to run a particular assertion.
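For readers without editor support for this, attr-map tests can also be run straight from the REPL with plain clojure.test. This is a self-contained sketch reconstructing the `divizable?` example above (the namespace name is incidental); `test-var` and `run-tests` look up the `:test` entry in the var’s metadata:

```clojure
(require '[clojure.test :as t :refer [is]])

;; Reconstruction of the divizable? example, with its test
;; stored in the defn attr-map:
(defn- divizable?
  "Is `n` divisible by `d`?"
  {:test (fn []
           (is (divizable? 49 7))
           (is (not (divizable? 48 7))))}
  [n d]
  (zero? (mod n d)))

;; The attr-map is merged into the var's metadata ...
(fn? (:test (meta #'divizable?)))  ; => true

;; ... which is exactly where clojure.test looks:
(t/test-var #'divizable?)  ; runs the two `is` assertions above
(t/run-tests)              ; runs every var with a :test entry in this ns
```
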

They are quite different, actually. (comment <anything>) evaluates to nil, while the reader totally skips forms tagged with #_. It matters sometimes.
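One place where the difference bites is inside data literals. A `comment` leaves a nil behind, while `#_` removes the form before evaluation ever happens:

```clojure
;; `comment` evaluates to nil, so a value is left behind:
[1 (comment 2) 3]      ;=> [1 nil 3]

;; `#_` makes the reader drop the form entirely:
[1 #_2 3]              ;=> [1 3]

;; Inside a map literal only #_ is safe: stacked #_#_ discards the
;; next two forms (the key AND the value). A nil from `comment` would
;; instead shift the pairs or leave an odd number of forms, which the
;; reader rejects.
{:a 1 #_#_:b 2 :c 3}   ;=> {:a 1, :c 3}
```
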

Also, with Calva, the ”Rich Comment Forms” that @seancorfield mentions have special support, in that all Evaluate Top Level … commands regard comment as creating a new top level context. So I can put the cursor anywhere within the comment-enclosed forms and execute Evaluate Top Level Form (aka defun). It is super quick to run small experiments against my functions. (Again, I am pretty sure CIDER does similar things.) The result line comments in the above example were created using Evaluate Top Level Form (aka defun) to Comment, btw.

I also often leave some of the comment experiments in the code. And I often thank myself later for having done so. :smile:

4 Likes

Had never heard the acronym RDD before, I love it!

Regression and tests-as-documentation are why I use TDD. Well, actually I don’t ever use TDD in the strict sense: I never write my tests first. I will RDD, and then I will go and write some tests to protect against regression and/or to help document a function.

The way I see it, when it comes to writing bug-free, well-factored code, RDD is like TDD on steroids. In the same time span, I will have subjected my code to way, way more testing than I would have in a typical TDD flow. Similarly, I can try different ways to structure and factor the code much more quickly, as RDD really helps with exploration here, even more so than TDD, because the feedback is so much quicker.

And when I RDD, I’m not just doing unit testing; I’m doing functional, performance, security, integration, and all sorts of other types of testing and validation. All of it very quickly.

TDD is good for all these things too, but RDD is just better.

Where RDD falls short, you already nailed it. So that’s when I’ll complement it. No easy rule here; I always just use my judgement.

5 Likes

If you look at the docs for defn you’ll see that you can put a map between the doc string and the argument vector. And if you look at the implementation of the deftest macro, you’ll see that it puts the tests into the :test key of this map. Then that is where the test runner finds them. Which means you can put them there yourself w/o deftest, if you fancy.

FWIW, people can use with-test too.
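For context, clojure.test’s with-test macro wraps a definition and stores the trailing forms under :test in the var’s metadata, much like the attr-map trick above. A minimal sketch (the `add` function is just a placeholder):

```clojure
(require '[clojure.test :as t :refer [is with-test]])

;; with-test attaches (fn [] <body>) as :test metadata on #'add:
(with-test
  (defn add [a b] (+ a b))
  (is (= 4 (add 2 2)))
  (is (= 0 (add -1 1))))

(t/test-var #'add)  ; runs both assertions, like any deftest
```
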

1 Like

Do you have a video or a screencast of your RDD flow? I use RDD for experimenting with different design ideas, and I try to incorporate RDD more into my development flow. But so far I have found nothing that delivers faster feedback than TDD with a continuous test runner: I get feedback from dozens of tests for the unit I am currently developing, and hundreds of tests telling me whether the unit fits into the rest of the system, all taking less than a second, just by saving the file I edited. Since my tests are all self-contained with regards to state and setup, I find their feedback more reliable than that from a stateful REPL session, where I have to manage the state of my definitions manually, and because of that I get more false-positive and false-negative feedback when I do RDD.

https://github.com/cognitect-labs/transcriptor “Convert REPL interactions into example-based tests” - Stuart Halloway’s library to complement the REPL-driven-development technique.

1 Like

No I don’t have one unfortunately.

Can I ask, what do you consider a unit? And the way you talk about it, it sounds you’re running integration tests in your TDD workflow as well? To test units together?

Normally, my functions are either pure, or all they do is I/O. So I’ll write the function, and as I write it, I’ll try calling it with various inputs or partially evaluating parts of it, confirming assumptions. When I get closer to the final form, I’ll try a few edge cases I can think of, fix or iterate on any issues found through that, and finally I’ll go and add some tests for it.

Since it’s a pure function, there’s not much point in the tests ever re-running unless I go and change it some more.

If it’s a function that does I/O, I’ll basically do the same thing, but the tests I write at the end will be integration tests instead. I split my unit tests and my integration tests, because I don’t run them in the same environments.

The last type of function I write I call orchestrators. They’re normally the topmost functions for any use case. All they do is orchestrate the order of calls and the flow of data between the other functions (the I/O ones and the pure ones).

I kind of use a similar approach for those as well. Though when I get to writing tests, I will sometimes write both integration and functional tests, which is basically what I call tests where you mock state and I/O but still have the units inside your app call each other.

I can see a continuous test runner maybe being useful if I were trying to do a lot of refactoring and moving things around. But still, when you refactor, it’s also normal for all the tests to break. So I would probably only have the continuous test runner running the integration and functional tests, assuming I’m not moving the orchestrator functions themselves.

If you had a video of your TDD workflow, I’d be interested in seeing that as well.

1 Like

Can I ask, what do you consider a unit?

By “unit” I refer to a part of the complete system. There is no restriction on size. Depending on the level of abstraction of the unit, it can contain multiple functions in multiple namespaces (often with a single function acting as an API) or a single function in a single namespace. Instead of “unit” I maybe should have used the better-known term “subject under test”: https://github.com/testdouble/contributing-tests/wiki/Subject

And the way you talk about it, it sounds you’re running integration tests in your TDD workflow as well? To test units together?

Yes. And typically, this would lead to slower feedback cycles. To counter that, I have up to three test runners running in parallel with different test selectors:

  • ^:focus for the few tests that are most relevant to the part I am developing or changing, i.e. the unit and integration tests for the subject under test. Usually, they complete faster than I can notice.
  • ^:unit for all unit tests of the complete system. They should also complete faster than I can notice.
  • ^:integration for all integration tests that are not super slow. Even without the slow tests, this set of tests completes with a noticeable delay.

In this way, I get the most relevant feedback close to instantly, and close-to-complete feedback with a small delay. I do not wait on that delayed feedback; instead, a system notification informs me if I broke something that I did not expect.
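Selectors like these are ordinary metadata on the test vars. A self-contained sketch (the namespace and test names are illustrative, and the project.clj fragment shows the standard Leiningen :test-selectors mechanism, not this poster’s exact config):

```clojure
(ns selectors-demo
  (:require [clojure.test :refer [deftest is]]))

;; Tags are plain metadata on the deftest name, which ends up
;; on the test var:
(deftest ^:focus parse-test
  (is (= 42 (Long/parseLong "42"))))

(deftest ^:integration slow-roundtrip-test
  (is true))

;; A Leiningen project.clj can then map selector names to predicates
;; over that metadata (standard :test-selectors feature):
;;   :test-selectors {:focus       :focus
;;                    :unit        (complement :integration)
;;                    :integration :integration}

;; The metadata is visible at the REPL, too:
(:focus (meta #'parse-test))  ;=> true
```
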

Normally, my functions are either pure, or all they do is I/O. So I’ll write the function and as I write it, I’ll try calling it with various inputs or partially evaluating parts of it, confirming assumptions, and when I get closer to the final form, I’ll try a few edge cases I can think of, fix or alternate any issues found through that, and finally I’ll go and add some tests for it.

Since it’s a pure function, there’s not any point in the tests ever re-running unless I go change it some more.

That is similar to how I work, albeit I switch pretty quickly from a REPL workflow to a TDD workflow, so I can hardly call it RDD. When I work in the REPL, I need to constantly reevaluate previous inputs, because I cannot be sure that the current state of the function still works for the input data from a couple of seconds or minutes ago. Because I am too lazy to scroll through the REPL history or move the cursor to the comment block and reevaluate manually, I let the test runner reevaluate all my assumptions constantly. Depending on the output of a function, I would also rather let the test runner compare the output against my assumptions than do it manually by looking at it. Is cognitect-labs/transcriptor the canonical Clojure solution to the problems I have with RDD?

I can see having a continuous test runner maybe useful if I were trying to do a lot of refactoring and moving things around, but still, when you refactor, it’s also normal for all the tests to break.

This is not necessarily true, as it depends completely on the approach to refactoring and on the way the tests are written. (See for example the demo below.)
But this would be a topic on its own.

If you had a video of your TDD workflow, I’d be interested in seeing that as well.

I have a demo in Clojure of the specific way I approach TDD at https://www.tddstpau.li/. Since it shows how I solved the Diamond Kata for the 20th time and is optimized to demonstrate how small TDD steps can be made, it is a rather contrived example, but it hopefully gets the idea across.

2 Likes

Yes, sometimes this is an issue, but for me, form and function come together at the same time. It’s rare that the signature of my function and its semantics are set in stone until the functionality is also working and all edge cases have been addressed.

When I go the extra mile and write tests right away, instead of just going through the REPL history, I sometimes face the opposite problem: my tests constantly break. And that’s normal, because I’m also constantly changing the interface signature and semantics.

That’s why I end up not doing that, and waiting until I’m done writing the function, and then I’ll go and write the tests for it.

I can imagine other people approaching the design of the function and its implementation sequentially, but every time I tried to do it like that I failed. For me they are co-dependent: a change in the design can change my implementation, and an issue in the implementation can make me reconsider my design.

That is some madman TDD :yum:. So do you first identify the functions you intend to work on, and change their tag to focus, and then go work on them?

Can I also ask what test runner you’re using for this?

I guess we differ here as well. This goes back to the same thing I said before: since I modify the shape of the output a lot as I write the function itself, I find it faster to assert manually than to keep adapting new tests over and over.

I guess for me, when I used to do that, I realized I’d get lazier about function design and factoring. I wouldn’t try as many things in that space, because of the burden of modifying the tests over and over.

I don’t think I agree with this. Move a function to a different namespace and your tests are broken, change the function name and they are broken. If you factor out parts of your function into smaller functions, but keep the original function that’s fine, if you do so and get rid of the original function your tests are broken again. Refactor any of the semantics, like choose to throw instead of returning nil, or vice versa, your tests are broken. Decide to take in a map instead of positional args, tests are broken.

Basically, any non backward compatible refactor should break your tests, I mean, that’s kind of the point of the tests in the first place.

The only tests that are immune are behavioral tests. Basically tests that use the same interface given to users in order to execute and assert behavior. Well, unless you go and decide to change that as well, though some people wouldn’t count that as a refactor, and assume refactoring is any change that doesn’t change behavior.

This looks pretty cool. You should let me know what test runner you are using. I might be interested in trying it out.

I think fundamentally, TDD and RDD are not mutually exclusive and can be complementary. I personally don’t write my tests first, but I will sometimes start to introduce tests once I’m at the point where I feel confident about the interface structure and semantics, even if I’m not done implementing. Especially in scenarios like you described, where I see myself just manually re-running exactly the same inputs over and over. I never combined that with an on-save test runner; I normally just trigger tests through Cider instead.

When looking at your demo, I didn’t see you eval anything ever. You never feel the need to try out some of the pieces you choose to use on their own, just to validate some assumptions about them, like say you can’t remember which arg of the reduce-fn passed to reduce is the accumulator? Or what happens if you map over nil?

2 Likes

Thanks for sharing, @lomin! I really enjoyed watching the video. I suspect it might be interesting for others to see as well, both how you work to always keep your code under test and your paredit usage. I suspect that a good way to learn to work with paredit could actually be to practice your motions …

2 Likes

This goes back to the same thing I said before: since I modify the shape of the output a lot as I write the function itself, I find it faster to assert manually than to keep adapting new tests over and over.

I guess for me, when I used to do that, I realized I’d get lazier about function design and factoring. I wouldn’t try as many things in that space, because of the burden of modifying the tests over and over.

[…]

I think fundamentally, TDD and RDD are not mutually exclusive and can be complementary.

I agree, and I now have a better heuristic for the situations in which RDD might lead to shorter feedback cycles than TDD. Thanks.

Move a function to a different namespace and your tests are broken, change the function name and they are broken. If you factor out parts of your function into smaller functions, but keep the original function that’s fine, if you do so and get rid of the original function your tests are broken again. Refactor any of the semantics, like choose to throw instead of returning nil, or vice versa, your tests are broken. Decide to take in a map instead of positional args, tests are broken.

That is not necessarily so. I, for example, never break tests, because I always want to be in a deployable state. If even a single test is red or broken, I cannot deploy.

If I want to move a function to a different namespace, there are two options:

  1. The function is an implementation detail and only tested implicitly by tests for other functions. Then the move is no problem and I do not break any tests.
  2. The function is tested explicitly. Then I duplicate the function in the other namespace first and then change the tests. After that, I remove the function from the original namespace. No tests are broken during that process.

In the same way, renaming a function also does not necessarily break tests. With Cursive, the IDE can do it automatically and instantly for you.
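The two-step move described above can be sketched in code. The namespaces here are hypothetical, and for brevity the “duplicate” step is shown as an alias (a def pointing at the old var) rather than a copied definition:

```clojure
(ns myapp.util)  ; hypothetical original home of the function

(defn divisible? [n d] (zero? (mod n d)))

;; Step 1: make the function available in its new home while the old
;; one still exists, so both names resolve and no test goes red.
;; (A real move would copy the defn; an alias via def is the minimal
;; version and loses metadata like :arglists.)
(ns myapp.math)
(def divisible? myapp.util/divisible?)

;; Step 2: point the tests at myapp.math/divisible?, see them green,
;; and only then delete the var from myapp.util. At no point in the
;; process is any test broken.
```
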

You didn’t say that, and I’m rambling a little, but relatedly: there is this opinion on the interwebs that you cannot make big refactorings in large systems written in a dynamically typed language. I don’t know where this is coming from, because we did it all the time and had no problems at all. Deciding on a map instead of positional args for a key component at the heart of a system was exactly a part of one of these refactorings. Again, we did that iteratively and incrementally without ever breaking a single test, developing other parts of the system in parallel to that refactoring (no branches), and still deploying on every commit, dozens of times a day. My intention is not to talk about how great we were, but to provide a counterexample to the claims that never breaking tests is impossible, or, relatedly, that big refactorings in large systems written in a dynamically typed language are impossible. I do not intend to add anything more to that than my counterexample from my last Clojure project.

This looks pretty cool. You should let me know what test runner you are using. I might be interested in trying it out.

Thanks. I use https://github.com/jakemcc/lein-test-refresh.

When looking at your demo, I didn’t see you eval anything ever. You never feel the need to try out some of the pieces you choose to use on their own, just to validate some assumptions about them, like say you can’t remember which arg of the reduce-fn passed to reduce is the accumulator? Or what happens if you map over nil?

I did not use the REPL in this demo because, one, I have solved the kata a dozen times already, and two, this is not a Clojure demo but a TDD demo (it just happens to be written in Clojure). In a more realistic setting, I use the REPL for all the use cases you mentioned.

2 Likes

When it is appropriate to do RDD: always.
Instead of TDD: never.

The thing is, if you want your code to be free of regression bugs, you need to use TDD (or BDD, which is what I prefer). With RDD your cycle can be smaller: write a test that will fail, then switch to RDD (eval the test, see it fail, implement a part, evaluate, see it fail, and so on, until evaluating the test passes).

Some test tools can help you here: nubank/matcher-combinators prints diffs and colors when an assertion fails (and also eases assertions over maps and vectors, especially when there’s dynamic data like an auto-generated uuid). Then RDD is even easier as a sub-process of TDD, because the matcher will even tell you what’s wrong.
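To keep this self-contained without pulling in the library, here is a tiny hand-rolled approximation of the kind of loose matching matcher-combinators’ match? enables: assert only on the keys you care about, and use a predicate (like uuid?) for generated values. The matches? helper and the response map are made up for illustration; the real library is far richer and also produces readable diffs.

```clojure
(require '[clojure.test :refer [is]])

;; Hypothetical stand-in for matcher-combinators' match?: each
;; expected value is either a predicate to apply to the actual value
;; or a plain value to compare with =.
(defn matches? [expected actual]
  (every? (fn [[k v]]
            (if (fn? v)
              (boolean (v (get actual k)))
              (= v (get actual k))))
          expected))

;; A response with a generated id we cannot predict:
(def response {:id         (java.util.UUID/randomUUID)
               :name       "Ada"
               :created-at 1700000000})

;; Assert only on the parts we control; the generated uuid just has
;; to satisfy uuid?. Extra keys like :created-at are ignored.
(is (matches? {:id uuid? :name "Ada"} response))
```
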

I mentioned BDD because I think the BDD cycle is easier: write a full integration test, see it fail, write smaller tests until you’re comfortable going to the implementation. But again, RDD is only a subprocess of everything for me :slight_smile:

2 Likes