What are some simple use cases that demonstrate why immutable data collections are beneficial?


#1

I am looking for ways to explain to non-Clojure developers why immutability is beneficial in the context of data collections.

Can you share ideas for simple use cases where it is easy to grasp why immutability is an advantage?


#2

Hum…

There’s the obvious parallel and multi-threaded use case. When each parallel process works with its own copy, it can’t accidentally trip the others up, and thus you don’t need locks or coordination schemes.

There’s also something I’ve seen often: you pass an object to a method, and the method changes something about the object without you knowing or realizing it. Later, you use that object assuming it is as you left it. But you get subtle bugs because the object isn’t what you think it is; it’s been changed under your feet by the method.
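A small sketch of that second case, in Python since the audience is non-Clojure developers. The function names and the scores are made up for illustration; the only real APIs used are `list.sort` (which sorts in place) and `sorted` (which returns a new list).

```python
def top_three(scores):
    # Hypothetical helper: list.sort() works IN PLACE, so the caller's
    # list is silently reordered as a side effect.
    scores.sort(reverse=True)
    return scores[:3]


def top_three_pure(scores):
    # sorted() builds a new list and leaves its argument untouched.
    return sorted(scores, reverse=True)[:3]


arrival_order = [70, 95, 82, 60]      # mutable list, order matters to the caller
best = top_three(arrival_order)
# arrival_order has now been reordered; the original arrival order is lost.

frozen_order = (70, 95, 82, 60)       # a tuple has no in-place sort at all
best_again = top_three_pure(frozen_order)
```

Passing `frozen_order` to `top_three` would fail immediately with an `AttributeError`, which is the point: with an immutable collection the surprising mutation cannot happen quietly.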

I think those are the two main ones.

I’ve also seen cases where you have, say, a for loop or for-each loop which mutates the collection it is looping over, as it is looping. These are atrocious to read and understand even if they work without bugs. And most of the time they have bugs, because it’s so hard to reason about them.
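To make the loop case concrete, here is a minimal Python sketch (the data is invented). The intent in both versions is "drop the even numbers".

```python
source = (1, 2, 2, 3, 4, 4, 5)        # immutable input

# Buggy version: remove items from the list while iterating over it.
# Each removal shifts the remaining items left, so the loop skips elements.
buggy = list(source)
for n in buggy:
    if n % 2 == 0:
        buggy.remove(n)

# Immutable style: never touch the input, build a new collection instead.
odds = tuple(n for n in source if n % 2 != 0)
```

The buggy loop finishes without any error and still leaves some even numbers behind, while the second version has nothing to get wrong because `source` cannot change during the iteration.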

So basically, the big argument is that it makes for more correct software with fewer defects, as it eliminates whole classes of bugs, such as the ones in the examples I just gave.

In some circumstances you can also leverage the property for performance or recovery. For example, it can easily let you perform rollbacks or retries in the case of a failure. That’s what swap! on an atom does.

Say you successfully set endpoint on an object, but then fail to set port. The object is now in a corrupted, inconsistent state. With immutable data, this can’t happen. The full data structure is created in one go, and it either fully succeeds or fails. Basically, you don’t need the builder pattern for these scenarios. If the set was happening in a swap!, a failure would leave the atom holding its old value, and if another thread changed the atom in the meantime, swap! would automatically retry.
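The endpoint/port scenario can be sketched in Python with a frozen dataclass. This is only an illustration of the all-or-nothing idea, not a port of swap!; the `Config` class, its validation rule and the host names are assumptions made up for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Config:
    # Hypothetical immutable settings object.
    endpoint: str
    port: int

    def __post_init__(self):
        # Validation runs while the value is being constructed.
        if not (0 < self.port < 65536):
            raise ValueError(f"invalid port: {self.port}")


current = Config("old.example.com", 8080)

try:
    # replace() builds a complete new Config; it never edits `current`.
    current = replace(current, endpoint="new.example.com", port=-1)
except ValueError:
    # Construction failed, so the assignment never happened and
    # `current` still refers to the old, consistent value.
    pass

# A valid update yields a new value and leaves the old one intact.
updated = replace(current, endpoint="new.example.com", port=9090)
```

There is never a moment where a half-updated object (new endpoint, old or bad port) is visible to anyone.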

For performance, React is a good example. If you assume everything could have changed, diffing to find out what needs re-rendering requires a full deep search over everything. But if most things are immutable and they have the same reference as before, you know they haven’t changed and thus don’t need to be re-rendered.
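A rough Python sketch of the reference-check idea. This is not React’s actual algorithm; `changed_keys` and the shape of the state are assumptions for illustration, and the state dicts are treated as immutable by convention (each update builds a new dict).

```python
def changed_keys(prev, nxt):
    # Identity comparison only: no walking into the nested values.
    # Safe because the values are immutable, so "same object" implies
    # "same content".
    return [key for key in nxt if prev.get(key) is not nxt[key]]


prev_state = {
    "todos": ("buy milk", "write report"),
    "user": ("ann", "admin"),
}

# An "update" creates a new state; untouched parts are shared, not copied.
next_state = {**prev_state, "todos": prev_state["todos"] + ("call bob",)}

to_rerender = changed_keys(prev_state, next_state)
```

Only `"todos"` is reported as changed; `"user"` is the very same tuple object in both states, so it is skipped with a single pointer comparison no matter how large it is.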

Oh, and one more on the equality topic. A big advantage that is due to immutability is Clojure’s equality semantics: being able to compare two collections and say they are equal if their structure and content are equal. That is really nice. No need to write custom hash methods, just put things into sets or hash maps and it’ll just work. In other languages, you always have to define what it means for objects A and B to be equal, and there’s confusion around whether you should use =, ==, ===, .equals, etc. Clojure simplifies that quite a lot in most cases, due to immutable collections.
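Python happens to show the same link between immutability and value equality, which may help when explaining it to non-Clojure developers: its immutable collections (`tuple`, `frozenset`) are hashable and compare by content, while its mutable `list` cannot be used as a key at all. The values below are made up.

```python
# Two separately built immutable values with the same structure and content.
a = ("x", (1, 2), frozenset({"p", "q"}))
b = ("x", (1, 2), frozenset({"q", "p"}))

same = (a == b)                 # compared by content, no custom code needed

seen = {a}
b_is_known = b in seen          # works as a set member out of the box

cache = {a: "hit"}
looked_up = cache[b]            # and as a dict key

# A mutable list is rejected as a key: if it could change after insertion,
# its hash would no longer match its content.
try:
    bad = {[1, 2]: "nope"}
    list_key_allowed = True
except TypeError:
    list_key_allowed = False
```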


#3

I don’t know about taking immutability in isolation, but we can try. It is an irony that “Think Python” and “Effective Java” suggest immutability but, since almost everything in their world is mutable, they mostly teach discipline instead. So, about these developers you are speaking of –

Do they know to return only a “safe copy” of mutable members from a method? Do they thoughtfully choose when to use a shallow copy or a deep copy? Do they carefully write programs to be appropriately thread-safe, and indicate what is thread-safe and what is not? Do they judiciously apply these techniques, so as not to unduly burden the program with making totally unnecessary safe copies and so on? Do they sometimes fix bugs by adding a “safe copy” or protecting something from multi-threaded access?

If yes yes yes yes yes, then the use case is “not waste any more time on a losing battle”.


#4

I think I would build (or imagine building) a simple web interface. At the top it would say “Select your winnings” in big bold letters. Underneath would be 5 buttons. The first 4 would be labeled “$25”, “$50”, “$75”, and “$100”, respectively. The 5th button would have a label that would change to a random amount between $1 and $100 every 200-500 milliseconds, and whatever the value happened to be at the time you click, that’s the amount you win.

The first 4 values, of course, are immutable, and the 5th is mutable. Which would you click on?

This example illustrates that, with mutable variables, the value is bound to time: at different times it has different values, and you have to take time into account whenever you use it. If you have multiple variables, you have to coordinate time between all of them, so that if you compare A to B, you know at which point in time each one has the value you expect it to have. Otherwise the results can be unexpected.

The chief benefit of using immutable variables is that you remove the element of time from your calculations. You can still do calculations about time, but the values you work with are not bound to time in the sense that they can be different at different times. If you double the radius of a circle, the circumference doesn’t become bigger at some point in the future. The relationship between radius and circumference is time-independent, and thus simpler to work with than it would be if you had to worry about how long it took the circumference to change after you changed the radius.

It’s like the buttons on the “Select Your Winnings” page. It’s easy to pick the desired outcome when the button labels don’t change in the middle of trying to click on them, and much harder when they’re mutable and can change out from under your click at any random moment. And the same principle is surprisingly applicable to broad areas of computer programming. State is inherently time-bound, so you can’t eliminate time completely, but if you can make as much of your code time-independent as possible, it becomes both simpler and more reliable.


#5

Thank you @didibus, @Phill and @manutter51 for your helpful answers.
During my research, I reread this wonderful article, which explains the distinction that Clojure makes between identity and state.

In a nutshell:
An identity is a stable logical entity associated with a series of different values over time.

In classic OO, an identity is a state.
In Clojure, an identity has a state.