I’ve just pushed a major release of missionary. The previous version was a simple functional effect system; this new version adds support for streaming. It’s still experimental and there are a few minor breaking changes, but I plan to switch to a production maintenance process in the near future.
Processes as values
Functional programming gives structure to concurrent processes for free. Functional composition enforces a hierarchical topology, which means exceptions can naturally bubble up and cancellation can naturally bubble down. This has been the core idea of missionary from day one. It turned out to work very well for single-value producers, so multiple-value producers follow the same idea.
It is not a silver bullet, though. The model binds you to a hierarchical structure and shows its limits when you need graph topologies. I still haven’t found a practical way to model graph dataflow structures in a functional style. For instance, Rx falls back to an imperative style (processors), and akka-streams has a functional graph API, but it’s arguably complicated and limited to static topologies. I’m still experimenting on this topic.
The flow protocol
flow is the unified protocol modelling multiple-value producers in missionary.
The original FRP formulations (Elliott, Hudak) introduced the idea of continuous time, and that idea sounds right to me. Unfortunately it has been largely misunderstood, which led to a lot of confusion between library authors claiming to implement FRP ideas and the original authors defending their model. As far as I know, none of the currently popular streaming engines properly implements lazy sampling of continuous values.
The duality between discrete events and continuous values is very clear nonetheless. A continuous signal has a notion of current value. It is a stateful identity, typically something you would represent with a reference type in Clojure (git branches, immutable databases). A discrete stream has a value only when an event happens, and this value is ephemeral: its purpose is to be aggregated into something more persistent.
So here’s my take on it: discrete events and continuous values can be unified under the same protocol. It really boils down to different transfer strategies: discrete events are backpressured, and continuous values are lazily sampled.
- A discrete producer can’t discard events, so it must propagate backpressure when the consumer is too slow.
- A continuous producer can discard old values if the consumer is too slow, because the new value invalidates the old one.
- A discrete consumer pulls events eagerly, because they all matter anyway.
- A continuous consumer pulls values lazily, because only the latest matters.
The flow protocol I came up with is actually quite simple and allows each of these strategies. Conceptually, a flow is like a java.lang.Iterable, except availability and termination are notified asynchronously. It’s a factory function taking two zero-argument callbacks, a notifier and a terminator, and returning an iterator that is callable (for cancellation) and derefable (for iteration). The producer calls the notifier when it’s ready to emit, then the consumer derefs the iterator when it’s ready to consume, transferring the value.
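To make the shape of the protocol concrete, here is a minimal sketch in plain Clojure, written without missionary. The names seq-flow and collect are hypothetical helpers invented for this illustration, not part of the library, and the sketch assumes a synchronous producer and a very simplified cancellation. It only shows the handshake described above: the producer announces a value through the notifier, and the value is transferred when the consumer derefs the iterator.

```clojure
(defn seq-flow
  "Hypothetical helper: a discrete flow emitting the items of coll in order."
  [coll]
  (fn [notify! terminate!]
    (let [remaining (atom (seq coll))
          ;; announce the next value, or the end of the flow
          signal!   (fn [] (if @remaining (notify!) (terminate!)))]
      (signal!)
      (reify
        clojure.lang.IFn
        (invoke [_]
          ;; cancellation (simplified assumption): drop what was not transferred yet
          (reset! remaining nil))
        clojure.lang.IDeref
        (deref [_]
          ;; transfer: hand over one value, then announce what comes next
          (let [[x & more] @remaining]
            (reset! remaining more)
            (signal!)
            x))))))

(defn collect
  "Hypothetical eager consumer for synchronous flows: one deref per notification."
  [flow]
  (let [ready? (atom false)
        it     (flow (fn [] (reset! ready? true)) ; notifier
                     (fn [] nil))]                ; terminator, nothing to release here
    (loop [acc []]
      (if @ready?
        (do (reset! ready? false)
            (recur (conj acc @it)))
        acc))))

(collect (seq-flow [1 2 3])) ;; => [1 2 3]
```

Note that the producer never announces a second value before the first one has been transferred. That is how backpressure shows up in this sketch: the consumer sets the pace simply by choosing when to deref.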
The major difference from the protocols usually found in popular streaming systems is that a producer can inform the consumer of the availability of a value without transferring it. This property allows lazy sampling, without compromising backpressure propagation for discrete events.
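Lazy sampling can be sketched the same way, again in plain Clojure and without missionary. The helper watch-flow below is hypothetical and its cancellation is simplified. It wraps an atom as a continuous flow: the notifier only says that a value is available, and the value itself is read when the consumer derefs, so intermediate states are never transferred.

```clojure
(defn watch-flow
  "Hypothetical helper: a continuous flow of the current state of atom a."
  [a]
  (fn [notify! terminate!]
    (let [pending? (atom true)   ; true while a notification awaits its deref
          k        (Object.)]    ; unique watch key
      (add-watch a k
                 (fn [_ _ _ _]
                   ;; notify only if the previous notification was consumed;
                   ;; otherwise the new state silently replaces the old one
                   (when (compare-and-set! pending? false true)
                     (notify!))))
      (notify!)                  ; the initial state is available immediately
      (reify
        clojure.lang.IFn
        (invoke [_]
          ;; cancellation (simplified assumption): stop watching, then terminate
          (remove-watch a k)
          (terminate!))
        clojure.lang.IDeref
        (deref [_]
          ;; sampling: read whatever the latest state is right now
          (reset! pending? false)
          @a)))))

(def state (atom 0))
(def notifications (atom 0))
(def sampler ((watch-flow state)
              (fn [] (swap! notifications inc))
              (fn [] nil)))

(dotimes [_ 3] (swap! state inc)) ; three updates while the consumer is idle
(def first-sample @sampler)       ; 3, the intermediate states 1 and 2 were skipped
(swap! state inc)
(def second-sample @sampler)      ; 4
(sampler)                         ; cancel
```

In this run the notifier fires only twice (once at construction, once after the first sample), no matter how many updates happened in between.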
Reactive Streams is basically a standardization of Rx’s internal protocol. It’s rather complicated, it doesn’t support lazy sampling, and it doesn’t support graceful termination (you can cancel a subscription, but the spec doesn’t allow post-cancel communication so you don’t have any way to know when allocated resources will actually be released). However, it exists, it’s now part of the JDK, and more and more libraries support it so it’s good to support it as well for interoperability.
The previous version had sp, a coroutine-based macro for task definition. It turns out that its flow equivalent relies on a nearly forgotten idea I found in SICP, section 4.3. The design is slightly different, but there’s still this powerful idea of operators able to fork the current evaluation and run it an arbitrary number of times with different return values.
It turns out to be highly expressive, and I think it deserves more momentum.
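For readers unfamiliar with that section of SICP, the idea of forking an evaluation can be illustrated with a small continuation-passing sketch in plain Clojure. This is not missionary's API: fork and pairs-summing-to are hypothetical names, and the sketch is synchronous and collects results into a sequence instead of producing a flow. It only shows the rest of a computation being run once per alternative, each time with a different return value.

```clojure
(defn fork
  "Hypothetical operator: runs continuation k once per choice and
  concatenates the results of every branch."
  [choices k]
  (mapcat k choices))

(defn pairs-summing-to
  "All pairs [a b] with 1 <= a <= b < n and a + b = n."
  [n]
  (fork (range 1 n)
        (fn [a]                       ; the evaluation forks for each a
          (fork (range a n)
                (fn [b]               ; and again for each b
                  (if (= n (+ a b))
                    [[a b]]           ; this branch succeeds with one result
                    []))))))          ; this branch yields nothing

(pairs-summing-to 6) ;; => ([1 5] [2 4] [3 3])
```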