Signals vs Streams

dustingetz · March 5, 2023, 8:16pm

I actually wrote this for HN but thought it might make a good discussion here. Tons of stuff didn’t make the cut, such as discussion of discrete/continuous mental models, how to convert between discrete/continuous stages and the relation to derivatives/integrals in Calculus, how to have pipelines of interleaved discrete/continuous stages and how to deal with backpressure semantics at the boundaries. The Clojure/Script library that implements all of these primitives is the illuminating Missionary.

geokon-gh · March 6, 2023, 3:17pm

I like the abstraction of splitting between signals and streams. It’s helpful, especially when it comes to the split between side-effectful, non-repeatable streams and pure, repeatable signals. Not really speaking from too much experience, but from what I understand it kinda feels like the design is sidestepping scheduling. And it’s in a way where I don’t see how the user is supposed to handle it. It could be I’m thinking on the wrong level of abstraction somehow.

I’ll try to give a concrete example. In my UI I have a complex weather map that takes a few seconds to compute/generate. The user inputs some parameters and then asks for the map to be displayed.

When I look at it through the lens given to me by Missionary:

With signals you’d freeze up your UI when you ask for the map to be displayed
With streams you’d freeze up the UI when all the parameters are input (and the map is generated in the background)

You’re right that you don’t wanna render stuff like stale views, so I can see the value in laziness, but I think you often want something sort of like a not-so-lazy signal that runs in a “nice” low-priority mode. You often don’t want to wait around till you’re called to start computing things. (Granted, my example leaves it a bit unclear what you’d display in the meantime while things are being computed in the background.)

You explain how using a stream instead would be suboptimal, and it totally makes sense, but how would we compute values derived from signals in a more eager way in this framework?

"Signals can skip duplicate values." Is this referring to the memoization? Does this just remember the last computed value or is that configurable? You’d probably want just the last value 95% of the time, but if your UI has say an on/off toggle you’d probably want to remember two values.
"Signal laziness is what enables this “work skipping”;" Is this saying that you call for some value to be computed, then before it’s finished an underlying signal changes, so the dependent “job” is prematurely terminated and just the relevant parts of the DAG are recomputed with the fresh signal? (This would be really cool! Doing something like this manually is a nightmare.)
How does Missionary handle multithreading? Your pure signals and the DAG seem like a boon for autoparallelization. Each bubble on the graph is effectively a task (makes me think of something you’d put together ad hoc with core.async).

PS: thanks for your insights on reddit

dustingetz · March 6, 2023, 4:13pm

Does your “map rendering” example generalize to “call a blocking function which takes a long time to compute”, i.e. recursive Fibonacci? In Missionary we would move the blocking computation to a thread pool with (m/via m/blk (fib x)), or in Electric with (e/offload #(fib x)), which does the same thing. m/blk is a java.util.concurrent.Executor, so you can customize this. For web browsers, you’d need to move the computation to a web worker.
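A minimal JVM sketch of the offloading idea, assuming Missionary is on the classpath and required as `m`. The `fib` function and the names `fib-task` and `result` are illustrative stand-ins for the slow map rendering, not part of any library. `m/via` is used here in its macro form, which takes an executor followed by a body expression.

```clojure
(require '[missionary.core :as m])

;; Deliberately slow, blocking function (stand-in for the map rendering).
(defn fib [n]
  (if (< n 2)
    n
    (+ (fib (- n 1)) (fib (- n 2)))))

;; A task is only a description of work; nothing runs until it is started.
;; m/blk is the executor intended for blocking work.
(def fib-task (m/via m/blk (fib 25)))

;; On the JVM, m/? used outside a process block blocks the calling thread
;; until the task completes. In a UI you would run the task from within
;; your flow or process instead of blocking the UI thread like this.
(def result (m/? fib-task))
```

The point of the sketch is only that the computation itself runs on a thread from `m/blk`, not on whichever thread drives the reactive graph.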

Does this answer your first question?

dustingetz · March 6, 2023, 4:25pm

Yes, Missionary (and transitively Electric) propagates cancellation notifications when a signal switches (and interrupts the thread in the right place, which surfaces as an InterruptedException in blocking code). This is not really related to work-skipping; this is process supervision (like Erlang).
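The primitive underneath this supervision can be seen on a single task. The sketch below assumes Missionary is required as `m`; the names `outcome`, `cancel!` and `result` are illustrative. Running a task means calling it with a success callback and a failure callback, and the call returns a zero-argument function that cancels the running process.

```clojure
(require '[missionary.core :as m])

;; Collects whichever callback fires first.
(def outcome (promise))

;; Start a long sleep; starting a task returns its cancel function.
(def cancel!
  ((m/sleep 60000 :done)
   (fn [v] (deliver outcome [:success v]))
   (fn [e] (deliver outcome [:failure e]))))

;; Cancel it right away. A cancelled sleep fails instead of waiting
;; out its delay, so the failure callback is the one that fires.
(cancel!)

;; Bounded wait, in case the callback is delivered asynchronously.
(def result (deref outcome 1000 :timeout))
```

When a signal switches, the same kind of cancellation is what gets propagated to the processes belonging to the abandoned branch; for work started with `m/via`, cancellation interrupts the evaluating thread.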

Does this just remember the last computed value? You’d probably want just the last value 95% of the time, but if your UI has say an on/off toggle you’d probably want to remember two values.

Missionary and Electric are not “history sensitive”, which means you’ll lose memo buffers when a conditional node “switches the DAG” (imagine a railroad switch). An atom in userland will trivially mitigate this edge case, which is uncommon in the type of applications Electric is designed for. See Breaking Down FRP (2014) from Jane Street for some discussion of this tradeoff (note the blog post is very old and contains misconceptions about continuous time signals, but the discussion of history is good).
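One possible reading of the “atom in userland” mitigation, sketched in plain Clojure with no Missionary calls. All names here (`expensive-view`, `!history`, `view-for`, `!calls`) are hypothetical and exist only for illustration: the atom keeps the last result per toggle position outside the DAG, so switching back to a previous position does not recompute it.

```clojure
;; Counts how often the expensive computation actually runs.
(def !calls (atom 0))

;; Hypothetical stand-in for an expensive per-mode computation.
(defn expensive-view [mode]
  (swap! !calls inc)
  (str "view for " (name mode)))

;; Userland history: the last result for each toggle position.
(def !history (atom {}))

(defn view-for [mode]
  (if-let [entry (find @!history mode)]
    (val entry)                          ; reuse the remembered result
    (let [v (expensive-view mode)]
      (swap! !history assoc mode v)      ; remember it for the next switch
      v)))

(view-for :on)
(view-for :off)
(view-for :on)   ; served from !history, expensive-view is not called again
```

This trades memory for recomputation and leaves invalidation to the application, which is part of the tradeoff the post describes.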

geokon-gh · March 10, 2023, 8:59am

Thank you for trying to explain things. I’m sorry I hadn’t replied earlier - I just wanted to start trying it on concrete examples/problems before saying more.

I’m still trying to massage things. As I understand, the documentation is a bit in flux - so I’m trying to piece it together. From what I gathered there was a bit of a design change with issue #70. I just want to confirm that the whole “Task” and “Flow” thing from the README (and wiki) is older terminology. Now “Flow” has been separated into a “Signals” and “Streams”. Is my understanding correct?

I’m still trying to grok when you’d even want a task though. It seems to be a lower level concept at this point. In the full FRP model it doesn’t look particularly necessary (unless you’re passing around a thunk for some reason)

dustingetz · March 13, 2023, 8:49pm

The upcoming missionary design changes have very minor impact on API, it’s more about the internals.

Now “Flow” has been separated into a “Signals” and “Streams”. Is my understanding correct?

No, there hasn’t been a terminology change, allow me to clarify.

There is a subtle difference between a continuous flow and a signal. The difference is that signals and streams are stateful: they memoize their results to achieve observable sharing. Flows are values that describe a pipeline (whether discrete or continuous).

Because flows are values, they are stateless and referentially transparent, which means the same flow values can be reused many times. (JS promises can be used only once!)
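A small sketch of what “flows are values” means in practice, assuming Missionary is required as `m` and running on the JVM. The names `numbers`, `sum-task`, `first-run` and `second-run` are illustrative. The same flow and the same task are run twice, and each run is an independent process.

```clojure
(require '[missionary.core :as m])

;; A discrete flow: a reusable description, not a running process.
(def numbers (m/seed [1 2 3]))

;; A task that reduces the flow. Also just a value; nothing has run yet.
(def sum-task (m/reduce + 0 numbers))

;; Each m/? starts a fresh run of the same task over the same flow.
(def first-run (m/? sum-task))
(def second-run (m/? sum-task))
```

Both runs produce the same sum, because nothing was consumed or cached by the first one.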

Flows can be arranged into DAGs, but for efficient reactive updates, we need to memoize each shared node in the DAG by allocating state. This state is the difference between a continuous flow and a signal (also a discrete flow and a stream). Without the state, you don’t actually have a DAG and no work will be skipped.

m/signal! and m/stream! are current missionary operators that you use to say “this exact point in the flow is shared” (i.e. a memoized node in the DAG). Memoization is not automatic, because Missionary is designed to give total control to the programmer over every aspect of the computation. (In Electric Clojure, memoization is automatic, because Electric is designed to be easy, and when you hit an edge case you just drop down to missionary.)
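The sketch below follows the shape of the reactor example in the Missionary README, assuming a Missionary version in which `m/reactor`, `m/signal!` and `m/stream!` are available (the post notes that these operators are due to change). The atoms `!input` and `!seen` and the names `main` and `dispose!` are illustrative. `>x` is marked as a shared node and consumed twice by `>y`, forming a diamond.

```clojure
(require '[missionary.core :as m])

(def !input (atom 1))
(def !seen (atom []))   ; records each value observed downstream

(def main
  (m/reactor
    (let [;; shared, memoized node reflecting the atom's state
          >x (m/signal! (m/watch !input))
          ;; derived node that reads >x twice (diamond shape)
          >y (m/signal! (m/latest + >x >x))]
      ;; discrete effect performed on successive values of >y
      (m/stream! (m/ap (swap! !seen conj (m/?< >y)))))))

;; Booting the reactor returns a function that disposes of it.
(def dispose! (main (fn [_]) (fn [_])))

(swap! !input inc)   ; one input change propagates through the DAG

(dispose!)
```

Without the `m/signal!` around the watch, each use of `>x` would be a separate run of the underlying flow rather than one shared node.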

A future version of Missionary is dropping the ! suffix from these operators because, technically, they will not be effectful anymore. But that’s an implementation detail.

PS here is a Missionary concept map that we are working on.

HolyJak · July 22, 2023, 7:21pm

Thank you for the article! Could you be so kind and clarify what you mean by

Signals have an impedance mismatch with isolated discrete effects (without a corresponding undo operation), because backpressure will discard events and corrupt the system state.