How to explain the difference between side effects and return value to a Clojure beginner?


#1

In the context of my Get Programming with Clojure book, I’d like to explain to Clojure beginners the distinction between side effects and return value.

I came up with the following diagram:

I have some questions related to the diagram:

  1. What should we call such a diagram?
  2. Is it something that is well known?
  3. Is it useful?
  4. How could we improve it?

#2

Hard to say without reading the text around this diagram, but if you really target beginners, you could add the words input, output and side effects to the arrows. It makes the diagram more self-contained and these words are generic enough to be easy to grok in my opinion. Also, I would use external input instead of external factors for the same reasons.

Cannot answer the other questions, I’m not a diagram expert.

Good luck!


#3

I like the diagram, it looks useful. I might call it something like “calling pure functions versus calling side-effecting functions”

Whether it’s well-known depends on the audience—it’s certainly “new” in some sense to people used to imperative side-effect-driven development, and it’s certainly new to 100% beginners just getting into programming.

When I teach this concept I use “somewhere else” to highlight the contrast between side effects and return values. Usually I do this with two or three examples:

  1. I call the fn with value x (or values x, y…), it gives me value y based on x, nothing else happens anywhere in the universe, everyone is happy
  2. I call the fn with value x, it gives me value y based on x, but it also changes something somewhere else (without telling me and which I have to “just know” about); sometimes this is necessary but take care with it
  3. I call the fn with value x, it gives me y based on x and some other stuff from somewhere else (which I have to “just know” about); sometimes this is necessary but take care with it

Of course there’s a fourth situation combining 2 and 3 which may or may not be useful or necessary to explicitly delineate. Since the diagram includes it, I would tend towards being explicit. That also means I would recommend naming that second dimension in the title of the section and diagram title: “pure fns, side effects, and external factors” (or “state”).
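To make the four situations concrete, here is a minimal sketch. All names are made up for illustration, and the `visits` atom simply stands in for “somewhere else”:

```clojure
;; Hypothetical example: `visits` plays the role of "somewhere else".
(def visits (atom 0))

;; 1. Pure: the result depends only on x, and nothing else changes.
(defn double-it [x]
  (* 2 x))

;; 2. Side effect: returns a value based on x AND changes `visits`.
(defn double-and-count! [x]
  (swap! visits inc)
  (* 2 x))

;; 3. External factor: the result depends on x AND on the current `visits`.
(defn add-visits [x]
  (+ x @visits))

;; 4. Both: reads `visits` to compute the result, then changes it.
(defn add-visits-and-count! [x]
  (let [result (+ x @visits)]
    (swap! visits inc)
    result))
```

Calling `(double-it 5)` twice always gives the same answer. Calling `(add-visits 5)` before and after `(double-and-count! 5)` may not, which is exactly the “just know” problem.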

I find it useful to state the pure fn approach as “the way”. I then reiterate “sometimes side effects are necessary” from multiple angles, which (somewhat contrarily) helps reinforce the pure-fn approach as the approach to strive for. I emphasize in turn how side effects are usually not necessary, how avoiding these other inputs & outputs makes the fn and system simpler to reason about and prevents surprises, and I point out some specific examples of when it is necessary, as well as how to contain/minimize/mitigate the problems of side-effecty programming.


#4

You can use a washing machine as a metaphor: the return value is the signal lamp that tells you the machine has finished its work, and the side effect is the washing of the clothes.


#5

Maybe I’m being pedantic, but wouldn’t you think of the signal lamp as a side effect too, like any kind of IO: writing to a file, sending data over the network, displaying something on the screen, etc.?


#6

I can recommend watching the What Could a Clojure Editor be Like? talk, by Rakhim Davletkaliyev.

He makes a surprisingly useful comparison with Feynman Diagrams.

(Here an electron and a positron collide, creating a photon + a quark and an anti-quark. Plus, as a side-effect, a gluon.)

He then moves on with this and develops a model for how Clojure functions can be represented visually, including side effects.

Not arguing that Feynman Diagrams are well known here. But I do like the explanatory power that Rakhim points at.


#7

I agree that the diagram is a good idea. I think the one you’ve posted is a good start.

If the audience is truly “beginner”, I might start with a diagram that only shows the in->out flow, to master the concept of inputs and outputs / arguments and return values, before introducing additional “stuff that can happen in the middle”. Also, a more dramatic example to drive home the point, like (send-email! ...), but I understand the appeal of a built-in core function.
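A rough sketch of that progression, assuming a made-up `send-email!` (it is not a real library function, and this stub only prints instead of sending anything):

```clojure
;; Step 1, in -> out only: builds a description of the email as plain data.
(defn welcome-email [user-name address]
  {:to      address
   :subject "Welcome!"
   :body    (str "Hello, " user-name "!")})

;; Step 2, the "dramatic" effect. Hypothetical stand-in for a real mail call:
;; it prints a line and returns :sent.
(defn send-email! [{:keys [to subject]}]
  (println "Sending" (pr-str subject) "to" to)
  :sent)

(send-email! (welcome-email "Ann" "ann@example.com"))
```

The first function can be called any number of times in the REPL without consequences; the second is the one a beginner should learn to be careful with.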


#8

@krisleech
As you said, seeing the return value (nil) of print in the REPL is itself a side effect; what we call the return value is the data that the text on the screen represents.

Beginners need to be able to observe the phenomenon directly in order to learn; abstraction is not easy to understand.
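Something a beginner can try directly at the REPL, as a small sketch (the var names are only for illustration):

```clojure
;; print shows text on the screen (side effect) and returns nil (return value).
(def returned (print "hello"))

;; with-out-str captures what would have been printed and returns it as a string.
(def printed (with-out-str (print "hello")))

;; str, by contrast, prints nothing and returns the string as data.
(def as-data (str "hel" "lo"))

(nil? returned)       ; the return value of print is nil
(= printed as-data)   ; the captured text and the returned data are the same string
```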


#9

Yes, true, because we could consider any change a side effect. For example, do we consider changing an atom a side effect? Or, for a beginner, is it better to look at more dramatic examples, like sending email as @mhuebert suggested?

There is some sort of saying that without side effects you just end up with a hot box.


#10

Thank you all for the feedback. I really appreciate it.
I’d like to use this diagram many times in the book, so I think it deserves a name.
Does anyone have an idea what we should call such a diagram?


#11

I find it helps to talk about the primary effect. Explain that the primary effect of a function is to map some input to some output; then, if the function does anything more, all those other effects are the side effects. Whatever they are, any effect that isn’t mapping input to output is a side effect.


#12

“in-out diagram” and “in-out side-effect diagram”?

(And then in some languages you might need an “in side-effect” diagram, i.e. if there is nothing that counts as a return value, but only the side effect.)

I’m avoiding “input” and “output” since then “input-output” could get turned into “I/O”, which has a different meaning.

But what do we call the influence on the function from something that’s not a proper input to it? (e.g. the value of a counter atom, or a database state, or a mouse click.)


#13

To me this is a data flow diagram. A function is a “process” in a data flow diagram, with at least one input and one output. A data flow diagram also has “data stores” (state), and each store has at least one data flow in. A process can be linked to another process or to a data store. When a process is linked to another process, the result is a composite process/function. A side effect is how a process changes the state of a data store.


#14

I love the name you suggested @rmcv : “data flow diagram”.

What about side effects like printing? Do you also consider them as a change of the state of a data store?


#15

That’s the problem with “in-out diagram” name: it doesn’t express the fact that the code might depend on external factors.


#16

For me, changing an atom is a change to the state of a program. Therefore it is definitely a side effect.


#17

What about side effects like printing? Do you also consider them as a change of the state of a data store?

Yes, printing is like an append-only database/data store.
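One way to see this, as a sketch only: here a `java.io.StringWriter` stands in for the screen so that its contents can be inspected, and `show!` is a made-up helper.

```clojure
;; Assumption: the StringWriter plays the role of the screen (the "data store").
(def screen (java.io.StringWriter.))

;; Hypothetical helper: prints to `screen` instead of the real standard output.
(defn show! [text]
  (binding [*out* screen]
    (print text)))

(show! "first;")
(def after-one (str screen))   ; contents after one print

(show! "second;")
(def after-two (str screen))   ; earlier text is still there, new text is appended
```

Each call to `show!` returns nil, yet the store has grown: that growth is the side effect.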


#18

I think it depends on which simplification is best for beginners, and on what point you are making with side effects.

Not all side effects are equal.

Also, why are you defining a side effect category? Is it because you are implying something about them? For instance, that they are a source of bugs and hidden complexity?