When to embrace redundancy?

EDIT: This thread is about stability and the cost of stability in general system development, rather than redundant systems as the term is often used in the context of HA/distributed systems. I’m leaving the original title intact so that all the replies can be understood in context.

Redundancy is something we have a love-hate relationship with; we hate the boilerplate and redundancy of Java, but we need redundant fallbacks in our systems. I find this particularly vexing in Clojure, where we place a high premium on developer experience and code simplicity (hence dynamic/gradual typing). Yet I’m sure you can think of lots of personal encounters with this sort of dichotomy between being concise and being robust. If you are a project manager/designer, what is your method for deciding when to say “yes” or “no” to redundancy? Here are some cases I have encountered most recently:

  • Use of specs and asserts to validate inputs on internal APIs
  • Whether to provide SSR versions of SPA pages (e.g. if JavaScript is disabled)
  • Use of development databases on PC, in Docker, and/or on a development DB server
  • How many tests to write – use-case stories, full coverage unit tests, handler (route) tests, generative tests, etc

What are your personal methods for drawing the line on how much redundancy is too much?

Generally, I think you should be able to move with speed and confidence. If you’re not confident that your system works right, go slower.

A little bit of rewording, if you don’t mind: (stability, redundancy, predictability, immutability) are opposed to (dynamism, speed, flexibility, movement).

When we’re developing a system, figuring out the right speed at which to change it might be the most important thing we do. How much dynamism is required? How much dynamism can we handle while still delivering on correctness? That amount is going to increase with your experience – if you learn.

I’ve written about balancing speed and stability before (in Norwegian), but here are some bullet points:

  • Banning change doesn’t work, but we need stability to think. If everything moves under our feet at every moment in time, we’re lost. Think Datomic and immutability.
  • Systems with multiple moving parts have a larger requirement for a stable core. As your system grows more complex, the need for stability will rise. The more moving parts you have, the more you need stable ground.
  • As software developers we fight chaos by building tools. And tools don’t necessarily rigidify systems. (think Clojure spec; tool vs static typing; discipline and requirements)

Let’s fight chaos by building systems to create layers of stability.

Hope you feel that didn’t go too far off track!

Teodor

I like your observations, but could you give some more concrete use-cases? For example, when have you decided to make the more stable/redundant solution, and what triggered that decision?

Another side to this complex of issues is that redundancy requires stability (in the sense that the more unstable boundaries become, the more redundancy will bite you). If we only go by where we as developers want/don’t mind redundancy, it’s a dead end – verbosity can only have benefits where you don’t need to touch “all the things” should something change.

Translating to a practical bottom line, I’d only opt for redundancy where I can be sure of stable boundaries. In practice, though, stable boundaries might not be apparent from the get-go. Clojure’s design principles strongly encourage thinking about your codebase as a growing/living corpus of functionality, so it’s probably something to watch out for constantly: are there verbose/redundant aspects of the codebase which regularly get “out of sync”? Are there any aspects which could benefit from redundancy?

At this point I think it becomes apparent that we’ll need to think about the benefits we’re hoping redundancy will bring. What are these benefits? Is it developer confidence, or convenience? Is redundancy inherently more robust?

The devil is in the details!

This winter I’ve been working on a system for transforming structural engineering models from one format to another. In the beginning, I pursued reuse aggressively. This resulted in a codebase with many layers, but the layers weren’t obvious, so I ended up going around them. I later refactored into:

  1. Components on top – where each component provided a user function
  2. Single-responsibility utility functions below.

Disadvantages with this approach:

  • I duplicated code. I no longer tried to abstract away everything I could.

Advantages:

  • Adding functionality to the system became simpler, as I knew that I was going to create one top-level component for each “piece of functionality” I added
  • Clear separation between reusable code (“library”) and non-reusable code (“components, functionality”). Apply testing to the library.
  • Less “moving everything around all the time” in the codebase.
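The two-layer split described above could look roughly like the following. This is only a sketch in Python rather than Clojure, and every name in it (`parse_node`, `format_node`, `convert_nodes`, the line formats) is made up for illustration; it is not the actual code from the project mentioned.

```python
# Hypothetical names and formats throughout; a sketch of the layering only.

# --- "Library" layer: small, single-responsibility, reusable functions. ---
# This is the part that gets unit tests.

def parse_node(line: str) -> tuple[str, float, float]:
    """Parse one whitespace-separated 'id x y' line into (id, x, y)."""
    node_id, x, y = line.split()
    return node_id, float(x), float(y)


def format_node(node_id: str, x: float, y: float) -> str:
    """Render a node as 'id;x;y' with three decimals."""
    return f"{node_id};{x:.3f};{y:.3f}"


# --- "Component" layer: one top-level function per piece of functionality. ---
# Some duplication between components is accepted here instead of
# abstracting everything away.

def convert_nodes(text: str) -> str:
    """Convert a block of 'id x y' lines to the semicolon format."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    return "\n".join(format_node(*parse_node(ln)) for ln in lines)
```

Adding a new feature then means adding one more function next to `convert_nodes`, reusing library functions where they fit and duplicating a few lines where they do not.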

I think it would help here to talk about redundancy as a strategy to achieve fault-tolerant systems. Given a fault-free environment, one where nothing can go wrong, you wouldn’t need redundancy.

If you frame it like that, you can do a cost analysis. For example: what is the extra effort of adding a specific form of redundancy, what is the cost of the added complexity to maintain it and keep it in sync, and what value will it provide, i.e., what type of fault does it protect against, what would those faults cost if they were to occur, and how likely and how frequent are they?
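To make that framing concrete, here is a back-of-the-envelope sketch in Python. The field names, the `coverage` factor and all the numbers are assumptions made up for illustration; in practice the inputs are rough guesses, so treat the result as a structured gut check rather than a real measurement.

```python
from dataclasses import dataclass


@dataclass
class RedundancyOption:
    """Rough estimates for one form of redundancy (all values hypothetical)."""
    name: str
    build_cost: float        # one-off effort to add it, e.g. in person-days
    upkeep_per_year: float   # cost of maintaining it and keeping it in sync
    faults_per_year: float   # how often the guarded-against fault occurs
    cost_per_fault: float    # damage of one such fault if unprotected
    coverage: float          # fraction of those faults it actually absorbs (0..1)


def net_benefit(option: RedundancyOption, years: float) -> float:
    """Expected damage avoided minus total cost over the given horizon."""
    avoided = option.faults_per_year * option.cost_per_fault * option.coverage * years
    spent = option.build_cost + option.upkeep_per_year * years
    return avoided - spent


# Example with invented numbers: input validation on an internal API.
validation = RedundancyOption(
    name="input validation",
    build_cost=5,
    upkeep_per_year=1,
    faults_per_year=0.5,
    cost_per_fault=40,
    coverage=0.5,
)
print(net_benefit(validation, years=3))  # positive means it likely pays off
```

A negative result suggests the redundancy costs more than the faults it protects against over that horizon, which matches the intuition that rare, cheap faults are not worth guarding.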

Now, there’s no easy way to perform this analysis, so I’d recommend simply doing a gut check. Intuitively, you should be able to assess the cost of the redundancy pretty quickly, just by thinking about it a little. You can also go about it reactively: if an issue has occurred enough times, or has occurred and caused large problems, you can take on a project to add redundancy for that particular case. For other things, you can be more proactive, if in your experience you anticipate these problems happening more often and already know their cost will be high.

And I quote Wikipedia:

Providing fault-tolerant design for every component is normally not an option. Associated redundancy brings a number of penalties: increase in weight, size, power consumption, cost, as well as time to design, verify, and test. Therefore, a number of choices have to be examined to determine which components should be fault tolerant:[7]

    How critical is the component? In a car, the radio is not critical, so this component has less need for fault tolerance.
    How likely is the component to fail? Some components, like the drive shaft in a car, are not likely to fail, so no fault tolerance is needed.
    How expensive is it to make the component fault tolerant? Requiring a redundant car engine, for example, would likely be too expensive both economically and in terms of weight and space, to be considered.

For example:

  • Use of specs and asserts to validate inputs on internal APIs

I’ll assume nothing else is validating this, for example, that clients do not validate when making the call. Now, if the API performs dangerous side effects that cannot gracefully fail or be retried, such as moving money around or deleting records from a database, then adding validation to handle faulty input seems well worth it. You wouldn’t want to accidentally disburse too much money, or delete the wrong table. I’m not too sure that counts as redundancy though, but it is still a good strategy for tolerating faulty input.
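As a minimal sketch of that idea (in Python for illustration; in Clojure the same shape could be expressed with specs or asserts), the function below validates everything before it touches any state. The `transfer` function, the in-memory `ledger` dict, the `InvalidTransfer` error and the limit are all hypothetical.

```python
from decimal import Decimal


class InvalidTransfer(ValueError):
    """Raised when a transfer request fails validation."""


def validate_transfer(ledger, source, target, amount, limit=Decimal("10000")):
    """Check every input up front; raise before any side effect happens."""
    if not isinstance(amount, Decimal) or not amount.is_finite():
        raise InvalidTransfer("amount must be a finite Decimal")
    if amount <= 0:
        raise InvalidTransfer("amount must be positive")
    if amount > limit:
        raise InvalidTransfer("amount exceeds the per-transfer limit")
    if source == target:
        raise InvalidTransfer("source and target must differ")
    if source not in ledger or target not in ledger:
        raise InvalidTransfer("unknown account")
    if ledger[source] < amount:
        raise InvalidTransfer("insufficient funds")


def transfer(ledger, source, target, amount):
    """Move money between accounts in a dict-based ledger (hypothetical API)."""
    validate_transfer(ledger, source, target, amount)
    # Only reached with validated input, so the mutation below is safe to run.
    ledger[source] -= amount
    ledger[target] += amount
```

The point of the design is ordering: all checks run before the first mutation, so a rejected call leaves the ledger exactly as it was.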

  • Whether to provide SSR versions of SPA pages (e.g. if JavaScript is disabled)

Again, what does the JavaScript do? Is it required to allow the user to submit the order? Or does it only do a cool animation on the “Buy now” button? Given that, you can relatively quickly assess the value of building redundant SSR rendering.

  • Use of development databases on PC, in Docker, and/or on a development DB server

I’ll assume you mean as opposed to using the prod DB directly in development? The first risk here is security and privacy related. Giving devs access to real users’ data could be a privacy issue. Similarly, that access might open insecure channels into the data, which can be easily abused and make the users’ data easier to compromise. Another risk is that of accidentally corrupting the prod data, or bringing down the database. For this, my gut check says you always need a dev DB with fake user data, unless your production system is not for paying customers or for running a business. If you’re just hosting your own data for yourself, or a private game server which you use to play with your friends only, go ahead, but if this is handling paying users’ data, it’s a no-go in my book.

  • How many tests to write – use-case stories, full coverage unit tests, handler (route) tests, generative tests, etc

Also not sure these count as redundancy. In fact, I don’t think these would even count as strategies to implement fault tolerance. These seem more like strategies to achieve fault avoidance, in order to build systems less likely to have faults in the first place, rather than handling the faults gracefully when they do occur. For this, I will simply link to the amazingly wise grand-master programmer Testivus: https://testing.googleblog.com/2010/07/code-coverage-goal-80-and-no-less.html

Love the link @didibus – thanks!

How about you, @Webdev_Tory? Is there a specific situation that leads you to think about this right now?

Personally, I’m naturally pushed towards speed. But the times when I find it acceptable to slow down are when I accept other requirements. When you have no users in the system, it doesn’t really matter if it’s secure. But once your users are in there, you’ll need to take care of them. Suddenly, ensuring that you keep all data during a database migration becomes motivating.

In my case, I have projects with varying (but always modest) clientele, but with longevity concerns that span decades. I leverage student teams where I am the one teaching them Clojure. In one of the projects that provoked this, we are replacing a 12-year-old PHP program which is still integral to the work one team is responsible for; the program performs a variety of functions (from trouble tickets to hardware tracking to report generation) and we want our replacement to have no less longevity than the original did (although the original had zero tests). A side effect of relying heavily on student developers is that we have significant, regular turnover of our developer population, meaning that our codebase needs to exhibit a high degree of stability and clarity to prosper.

Interesting. I see your point, you’ll need some way to say “this is how we do things” and avoid breakage. Which might come naturally (for better or worse) to Java codebases but not so much with Clojure.

Have you found it challenging to do this with Clojure? Are you able to onboard students in a reasonable time frame, or are there challenges? For instance, have you had luck using Spec for this?

I’ve just started (read: yesterday) introducing spec to the teams, and I think it will be most useful for our re-frame front-end to keep the client-side DB consistent. But this is mostly a development help so they can see what the DB should be shaped like, rather than a stabilizer. So far our biggest pro-redundancy feature is making heavy use of test cases, which we use both to catch regressions and to guide their efforts (TDD). As for onboarding, I have found it far better to get beginning programmers instead of seniors or “experienced” folks with a lot to unlearn. The learning curve is definitely steep, but I get much better results than with my low-bar teams (e.g. our PHP team), whose code works at first but might come back to bite me a few years later.

The caveat here is that too much redundancy (“hey, here’s Spec”, “first, tests”) can be overwhelming to folks who are just learning the strange syntax of s-expressions, how to deal with Emacs, and the dramatic paradigm shift of first-class functions.

Yeah, I see your point. In a sense, that is Clojure’s strength (you actually can do all these things at the same time, with ease, with composable tools), but it is still difficult to learn all these things at the same time.

I feel that when you’re writing Clojure, you don’t really get a chance to do the dumb solution first, because the tools that let you build the dumb solution don’t really exist in the language. You can’t just have a long chain of actions where you assign all the stuff in between each time you need to – doing that would become tedious with nested ifs. The code would be horrible in any language, but in C#, Java, Python or JavaScript it could at least be a working first draft.

That’s a good observation about the lack of dumb tools; it is a side effect of the community preference for components instead of frameworks, and the tangible fact that Clojure, by the combination of language features and community in general, is definitely an advanced language (I feel a separate thread coming on for that). Tying this back in to the OP, though, it means that redundancy is never a default assumption in Clojure because, as in the Testivus post above, it all depends on the level of the programmer and the specific needs of the project. The question, then, is how to assess those needs and programmer levels (especially when it’s not only the author who is concerned, but any present and future collaborators)?

Would you be able to say if there are examples where those needs are addressed better in other ecosystems? Or more specifically, can you give an example of how a “more redundant” ecosystem can provide a better starting experience?

I’m asking because I don’t know the answer, and I think Clojure gives a good foundation.

Off topic: I think you’ve made a video or two documenting your org-mode workflow on YouTube. Thanks for those! They were good motivation for me to get further in my Emacs learning journey.

Glad you liked the Emacs videos; I’ve been meaning to update/add some more for several years now…

Good question about ecosystems with more redundancy. I think the easiest example is something statically typed (e.g. Java), where the typing itself is a sort of redundancy which Clojure pushes back against. Also in Java, all the boilerplate might be a type of redundancy (depending on how you look at it). There is also redundancy built in to many framework-based systems, of which Clojure is notably and intentionally spare. For example, there are built-in checks in the WordPress world for verifying compatibility with the WordPress version, and in general frameworks by nature need to be built with precautionary redundancy because they are trying to apply to general situations, particularly when there’s a notable community listening in. Things like React.js and Google Closure demonstrate this.

What makes Clojure interesting is the same philosophy that led to gradual typing, which says “let’s become more prescriptive, more redundant on an as-needed basis.” That means we have to decide when and how much is our “as needed.” In my case, it’s the need to ensure longevity and maintainability that factors into this decision, and I haven’t decided yet.

I like this topic. IMHO we should reword “redundancy” to “stability” or “quality code”, because “redundant systems” is mostly used in the context of clusters and HA systems. This thread is more about how we can build stable, quality systems.

Great point; I think I can leave the title as it is, so readers can have better understanding of the topic, but your observation about the usual context of redundancy is exactly right.

I stumbled over this quote:

55: LISP programmers know the value of everything and the cost of nothing.

From Epigrams on Programming by Alan Perlis, 1982 (source). I feel that this relates to the same pain that we touch on in this thread.

Personally, I think this is one of the most useful critiques of Lisp, and dynamic programming languages in general. Common Lisp in particular is hard to use because super-advanced features permeate codebases. If you don’t know how to use them, you’re in for a tough challenge. Racket compensates with a hyper-focus on learnability (see How to Design Programs). Clojure, in my estimation, is a little bit harder to get started with than Racket because of (a) the lack of an editor like DrRacket and (b) the necessity to consider host interop. However, Clojure gains immutable data structures, a rock-solid deployment story and solid editors for advanced users.


So what does this have to do with redundancy? Here’s what I think.

  1. In Clojure, we value simplicity over ease
  2. Redundancy is non-simple (by design, “redundancy requires multiplicity”)
  3. Ease requires redundancy.

To turn this back on topic, perhaps we should embrace redundancy when it provides ease of use, ease of learning and safety. See also Zach Tellman’s contrast of principled systems and flexible systems.