What's purity for?

I’ve been thinking about functional purity lately, and I know Clojure is an impure functional language. So it got me thinking…

What do we lose by not being strictly pure?

At a practical level, what is strict purity used for? Things like… allows compiler optimization X, protects from race condition Y, etc…

And I’m talking about cases where IO and state is required. Delaying IO such with the IO Monad, or managing state behind a state Monad to maintain an intermediate purity. What’s the advantages of this?

Thanks

Strict purity of an entire language and system? I don’t know, except perhaps things where one reads one or more files of input, and writes one or more files of output, and then terminates.

For Clojure and pretty much all programming languages, purity is good for easier reasoning and testing for the parts of your program that consist of pure functions.

Ya, I’m talking about strict purity. Like what is gained by encapsulating IO in an IO Monad, or by enforcing uniqueness types for IO, and other such things.

Beyond the ideological can we achieve a fully strict language, does it have any real concrete benefits?

It seems best to ask an audience of those convinced of the benefits of a language that makes side effects much more restricted, e.g. Haskell. You might find such people willing to discuss it on lambda-the-ultimate.org, or a Haskell Subreddit, perhaps.

Functional purity is a means to an end… in Haskell that end is is strict mathematical reasoning at the cost of some cognitive effort, In Clojure it is all about reducing the cognitive effort itself, so you can focus on the actual problem. Eric Normand has a nice categorization in his book “A theory of Functional Programming” that he’s currently writing.

He defines three fundamental building blocks of (functional) programming languages: data, functions and actions. Data is obviously the most simple thing, just values and their structure. Functions are, well, pure functions. Actions are functions where the outcome depends on one of two things: When they are run or how often they are run.

Actions (the impure part) have a tendency to “pollute” other parts of the program. If I have a function that e.g. reads a value from a database at any point, it automatically turns that function into an action, making it much harder to test and reason about. Thus, actions should be pushed “as far out” as possible in your program, so you can maximize the part of your program you can easily reason about.
But in the end, of course, every meaningful program needs to interact with the outside world. It’s just a matter of big the portion of the program is that is “protected” from the action’s polluting effect. Different languages offer different mechanisms for this, but there is no “entirely pure” language, just different techniques to constrain impurity.

I think the relationship between data and functions is also interesting. Data is the most simple thing in this model and it’s often best to push things down all the way to the data level. If you take the difference between Compojure and Reitit for example or the introduction of event-fx in Re-frame, there seems to be a tendency to use fewer functions and more data. Making the program more declarative often achieves the same goals: it makes it easier to test and to reason about.

Pure programs can be treated as isomorphic to proofs and thus you can know things about pure programs that cannot be known about impure ones. This is very interesting in several ways, including in the context of the Halting Problem. If you enjoy CS theory and would like to know more, check out the Curry–Howard correspondence.

1 Like

Purity within a scope can often enable optimizations and behavior that isn’t possible without it.

  • In databases (and Clojure STM), transactions can be rolled back and retried freely since they are pure. Violating this constraint by adding side-effects within your transaction means that you cannot roll back a transaction or safely retry.
  • Similarly in React.JS, if you apply the constraint that render functions are always pure, you can park a render tree and prioritize another queued render, and then resume computing that VDOM later. This is what enables React Suspense and the performance optimizations in the upcoming Concurrent Mode.
  • AOT and JIT compilers can both take advantage of purity to replace function calls with constants, cache function calls and objects, and many more.

In Clojure and other dynamic langs we do this by convention, and suffer the consequences when we invalidate it. Language constructs like Haskell’s IO monads and other languages typed effect systems are a way for both the developer and compiler to make strict assertions about what contexts these behaviors and optimizations can or should be applied in.

IMO it is important to separate the efficacy of purity - which is easy to measure in objective terms like performance, or enabling of certain behavior otherwise unavailable - vs. using the type system to track side effects, which is much harder to measure what it might be effective at and why.

1 Like

That part is basically why functional programming is better than imperative/oop. Functions of input/output and functional composition. I understand the concrete benefits of that, immutability of data and computation modeled as a flow of immutable data. This is great to model computations. What I’m asking about is modeling effects, and is pseudo-purity a la Haskell better for that as well? Or at least, what are the trade offs between it and the impure approach?

Right, and that’s the part I’m asking about. And not even from the type system, the runtime could do it as well. But this encapsulation of IO into thunks, what’s it good for?

So till now, no one has any clue of the concrete benefits of having an IO Monad and waiting till main to execute it.

I think @hankstenberg was the closest to provide a value add, in that it can possibly encourage minimizing the radius of functions that IO infects. But even that’s not obvious to me. I could have the same amount and just mark them all as IO. But maybe the awareness helps the programmer to design things a bit differently.

Knowing what functions are pure and not I can see maybe being leveraged for optimization. Like performing auto-memoization and function reordering, auto-parallelization, etc. Some of that can be achieved with just best intention as well within given contexts. Like with reducers for example.

I know at this point I could ask on a Haskell board, but the bias is strong there, and I don’t want to be seen as trolling.

My goal is understanding the trade offs between a pure language like Haskell and an impure one like OCaml or Clojure. Only the part related to pure/impure.

For example, you could write IO in a pure fashion in Clojure as well. Wrap all IO inside a delay and all functions using IO have them return a delay. Then derefs the chain of delays only in main. It wouldn’t be as ergonomic as Haskell’s IO Monad and its additional support for it, but you could do it. Why though? Would I benefit from such a design?

I am a little confused because you say in reply to my post that you’re asking about whether it’s useful to track it in the type system, but the majority of your post is talking about and leading to a question of the benefits of purity regardless of whether it’s encoded in the type system.

For example, you could write IO in a pure fashion in Clojure as well. Wrap all IO inside a delay and all functions using IO have them return a delay. Then derefs the chain of delays only in main. It wouldn’t be as ergonomic as Haskell’s IO Monad and its additional support for it, but you could do it. Why though? Would I benefit from such a design?

You wouldn’t get any benefits on its own. But like I enumerated in my first post, when you combine this with a runtime that can do this reordering / retrying / auto-parallelization like the STM or React.js fibers then you would see concrete benefits in maintaining a separation of pure and impure code.

For instance, the STM and React.js both have mechanisms for encoding side-effects in the context of a function that never actually causes the effect unless the transaction or render completes successfully. In a statically typed language you could encode this in an IO monad. in Clojure STM you are expected to only use “transaction safe” functions like ref-set and alter. In React.js you have hooks like useState and useEffect. All of these provide the ability to encode side-effects in a pure way which will occur when an external runtime chooses to, instead of when the function is executed.

In this way, you don’t need a separation of “actions” and “functions” at the code level but instead at the runtime. This is very useful when your data, functions and actions have dependencies on each other (often the case in any non-trivial app). Splicing your application logic across the lines of pure vs. impure isn’t very ergonomic.

This is why I think it’s useful to look at how useful the type system is at handling this separately, because the benefits of typed vs. untyped isn’t really apparent once you’re in the runtime; the benefits of a pure function are there no matter what.

I’m not really concerned about the static type system, but I don’t want to discard it because sometimes some things are only useful if you also have one, such as for sum types which really only start becoming useful when you have something to remind you to handle all cases.

I’m talking about purely functional languages vs impure functional languages.

The difference would be that in a pure language, all your functions must be pure, no exception. In an impure one, you’re allowed to have a mix of pure functions and impure ones (might as well call them procedures).

Now I understand why given some logic which could be implemented using mutation and use of non local data, such as you would in OOP or imperative language, that you’d want that to instead be implemented with pure functions such as done in FP.

But programs deal with inherently impure effects, like moving a file from one directory to another, sending an email, querying a database, etc.

Due to this truth, pure programming languages must find a way to handle these effects and still pretend to be observably pure.

Haskell does so by having the functions which perform effects such as IO not really perform the IO, but just remember the list of all effects and their order and wraps all that in a chain of IO actions including wrapping all functions that use the result of IO as well inside thunks. So now, your function that deletes a file? Well it doesn’t delete the file, it just returns a data-structure that says please delete this file for me.

What happens next is that the main function will take these data-structures and will perform the effects that they ask of it. So main will go and delete the file.

I can’t seem to come up with any practical benefits to this, apart from conceptual consistency in that you can kind of still pretend to be a purely functional language.

Doing that doesn’t require a static type system, or even something tracking which function do IO or not. All of it can be modeled with discipline in Clojure for example.

So I’m curious, are there any benefits that I’m failing to think of? Could the benefits only manifest itself when also combined with a static type system?

If I wrote Clojure code like that, would I gain anything?

Logical processes involving IO require special care, because you need to handle nondeterministic failures (network partitions, latencies…). You need to implement some kind of supervision mechanism to make sure unrecoverable failures are propagated somewhere. If your process is forked at some point, you have to make sure it’s gracefully shut down when another part of the process prematurely fails, such that you don’t leave any pending resource allocated down the chain.

Modelling IO as values can provide an elegant solution to this problem, because it makes sure all IO operations follow a strict hierarchy where errors can propagate up to the root, and cancellation can propagate down the leaves. This is a direct consequence of FP style, and unrelated to static-vs-dynamic concerns.

Alternative solutions exist in the imperative world, such as actor supervisors, or more recently structured concurrency. I’m not really an expert on those techniques, but I don’t quite like the additional ceremony they require, and I would happily be challenged on pure-vs-impure concurrency techniques for nontrivial use cases.

Interesting. Can you talk more about this. What would be an example of this?

I wrote about it here, you’ll find a concrete example. See also the blog posts from Nathaniel J Smith, esp. this one for a good explanation of the pitfalls associated with traditional imperative concurrency.

Maintaining a hierarchy of running processes has several benefits :

  • when a process fails it can propagate the exception to its parent
  • when a parent process detects a failing child it can make an appropriate decision such as cancelling the siblings
  • the lifetime of child processes are enclosed by their parent, so resources allocated by children are guaranteed to be freed when the parent terminates

Now, when a process is forked with an imperative construct (e.g with future, go), it doesn’t know which one is its parent. You have to program defensively, and implement your own error handling and cancellation support. You may end up building a supervisor hierarchy by hand.

In contrast, in functional style, logical processes are values. If a process is defined as a composition of other processes, the parent-child relationship is made explicit by construction. When the process is ultimately spawned, the process data structure is naturally materialized into a supervisor hierarchy.

All of this is awesome. I had previously looked at missionary and read that article about nurseries but had never fully put things together in my head.

One question, do you believe this is true even without the presence of concurrency?

In that, without concurrency, the call stack seems to be enough to keep a tree topology of the execution and supervisors are automatically parent functions, and normal try/catch can be used to handle failures, and cancellation isn’t required.

Another question, implying the former question is true, that concurrency brings in the value of pure IO. Do you think that the call stack is the problem? Could you envision a language with support for concurrent call stack? And is that not what stackful coroutines bring to the table?

And specific to missionary. Couldn’t every function be a m/?. And couldn’t m/sp be just the default. As if you had

(defn? foo [name]
  (println "Hello ")
  (println name)

;; became

(defn foo [name]
  (m/sp (m/? (println "Hello "))
        (m/? (println name))))

So basically all IO is modeled purely. And in non concurrent context, I’d just always want everything to be sequential processes waiting for the prior task to be done until starting the next one. At this point, my code would look exactly as normal Clojure code apart from using a different defn? and fn? which encapsulates tasks and therefore perform purely until they reach main? which wouldn’t have m/sp but just m/?

And only when I need concurrency would I introduce m/join and others?

Or am I misunderstanding something?

Yes. Without concurrency the tree degenerates into a list, so the natural thing to do is to transfer control flow to the IO operation. Cancellation is still required but that’s well supported on the JVM via Thread/interrupt.

Augmenting the thread abstraction with supervision features is pretty much the value proposition of structured concurrency, and project loom’s fibers seem to follow that path. The basic idea is to maintain the supervisor hierarchy in dynamic scope, such that when you spawn a child process, the parent will keep track of it, catch failures, join it on termination, cancel it on sibling failure, etc.

There’s no technical limitation to implement something like that in clojure(script) right now. It would require to implement a flavor of future that’s able to run in a managed context giving access to some kind of supervisor. Optionally, async/await-like syntax can be provided with a macro. However, one criticism made to this approach is that IO-performing functions contaminate their callers, your functions now have a color. That’s where stackful coroutines (continuations, in project loom) come into play. If at any point in space and time you can capture the stack state and reach the scheduler to tell your intent to maybe continue later, all of that in user space, then the sky is the limit (not only for IO).

Now, is all of this a good idea ? Time will tell. Project loom is still experimental stuff, it will probably take a couple of years to be production-ready. It will take another undetermined amount of time for the industry to understand the problem and for third-party libraries to switch.

Regarding your question about missionary’s design. I don’t think making IO transparent is a good target. IO involves networking, networks are unreliable. Knowing what to do in face of an unreliable thing is a business problem, it has to be first-class, not hidden under the rug. That’s additional complexity, but it’s essential, not accidental and we have to face it. That’s why I disagree with Bob Nystrom on this point, and I see no problem having red and blue functions.

I was thinking about that, but what could trigger the cancellation if we’re in a non concurrent state? There shouldn’t be anything that could cause an interrupt. The process itself could decide to abandon its work. Unless you mean an actual kill signal?

I believe this is what Rust did with their official Futures and async/await behavior.

I think your point is that any function making use of another function should know the risks in making that call? Such that if I call something which under the hood makes use of IO, I need to be prepared to handle its failure scenarios? I also need to be aware that it probably can’t be memoized, possibly has a result dependent on the environment, that I shouldn’t just willy nilly retry it, etc.

I think that’s a fine goal. I did felt what I was suggesting fit that goal perfectly though. I’m not saying to overload defn, but to have a defn? . That to me is even more explicit about the fact the function is side-effecting. It could even add some meta to the fn so it can be introspected as such. Otherwise how do I know some function I’m using makes use of missionary under the hood?

The rest just felt like added convenience. m/sp feels like the most common use case.

Anyway, I’m not here to discuss too much about your API choices. I’m more interested in understanding. And the part I wasn’t sure is if what I’m proposing makes sense technically, in that it could work that way. Or is there some issue that I haven’t understood?

Specifically, I’m not sure I’m totally getting why m/? must be used when calling m/sleep for example? But it doesn’t have too when calling println?

Thanks for the discussion by the way, I’m learning a lot

In theory, you’re right, no concurrency means nothing can interrupt your process. In practice, you’ll always eventually reach some kind of concurrency, at the very least you want to shutdown gracefully on SIGTERM.

My knowledge about rust is close to nil.

Because the underlying representation of tasks doesn’t rely on threads. ? means the rest of the computation depends on a result that won’t be immediately available, and it’s what allows the sp macro to rewrite code in a non-blocking way. That’s why it’s needed for sleep and not for println (technically, println may block on the JVM, but we can assume the cost is negligible). This is very similar to how go and <! >! work together.

I’m not sure to understand what your suggestion is and what syntactic style you want to achieve, but as it slightly deviates from your original question I’d happily answer any missionary-specific question on the #missionary slack channel.

I’ll ask more missionary specific questions on the slack.

For now I have to say concurrency is a great point of making IO functions pure. By building the chain of IO and their dependencies, specifying what must wait for something and what doesn’t need too. You can control the execution of it so that things happen concurrently which can, and sequentially which cannot. And you can even introduce in between each action logic to automatically cancel siblings on error, or propagate errors to the parents.

I think that’s the most convincing use case I’ve got yet.