Hello!
Introduction
There’s a problem that’s been on my mind for a while. I’ve been toying with a Monte Carlo system. With the Monte Carlo method, you sample a bunch of random variables, calculate a result for each simulation, and can then say something about the probability of the results and how they relate to each other.
To do that, I’d like to sample a sequence of maps. Let’s say we sample :x and :y between 0 and 1. Sampling 3 values might get us something like
[{:x 0.434 :y 0.342}
 {:x 0.936 :y 0.126}
 {:x 0.028 :y 0.123}]
Problem statement
I want to be able to generate the same sample (a sequence of variable bindings) from a single seed, and I want the computation to be parallelizable.
- Deterministic: The same sequence will be produced from the same seed and sample size.
- Parallel: The sampling process should be possible to split across cores, where each core gets the global seed and the indexes it should produce (for instance, core 3 may be asked to generate values 200-399).
Ideally, I would like to request the j’th item from the sequence in a fast way.
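One way to make "request the j’th item" concrete is to derive a fresh generator from the pair (seed, index), so that no item depends on the items before it. The sketch below is only an illustration under assumptions: scramble, sample-at and sample-range are hypothetical names, and using the first output of java.util.SplittableRandom as a hash is a convenience, not a statistically vetted scheme.

```clojure
(import 'java.util.SplittableRandom)

(defn scramble
  "Hypothetical helper: uses the first output of a SplittableRandom
  seeded with x as a cheap 64-bit hash of x."
  ^long [^long x]
  (.nextLong (SplittableRandom. x)))

(defn sample-at
  "Returns the j'th item of the sample for `seed` without computing
  items 0..j-1. The seed is scrambled first so that nearby seeds do
  not yield shifted copies of the same sequence."
  [seed j]
  (let [base (long (scramble (long seed)))
        ;; unchecked-add wraps on overflow instead of throwing
        rng  (SplittableRandom. (scramble (unchecked-add base (long j))))]
    {:x (.nextDouble rng)
     :y (.nextDouble rng)}))

(defn sample-range
  "Items start (inclusive) to end (exclusive), e.g. what one core
  would produce when handed the global seed and its index range."
  [seed start end]
  (mapv #(sample-at seed %) (range start end)))
```

With this shape, a core that is asked for values 200-399 would call (sample-range seed 200 400), and the pieces concatenate to the same sequence regardless of how the index ranges are divided.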
I feel like I’m missing language to describe what I’m looking for here, and that there might be literature describing what I’m looking for. I’m also looking for work making this accessible in real world implementations.
There’s also test.check. I haven’t looked at the implementation, but with test.check you get deterministic generators. After each run, you are left with a seed, and you can use that seed to generate the same bindings. I’ll take “Go read test.check source” as a reply, but I’d prefer if that came with some reasoning.
Clojure context
I’m able to produce a deterministic, serial solution and a nondeterministic, parallel solution. Reprinting here for some context:
(ns th.scratch.deterministic-parallel)

;; Can we have deterministic, parallel randomness?

;; Deterministic serial randomness:
(defn sample-serial-deterministic [n seed]
  (let [rnd (java.util.Random. seed)]
    (repeatedly n
                (fn []
                  {:x (.nextDouble rnd)
                   :y (.nextDouble rnd)}))))

(let [seed 987324]
  (assert (= (sample-serial-deterministic 5 seed)
             (sample-serial-deterministic 5 seed))))

;; Nondeterministic parallel randomness
(defn sample-parallel-nondeterministic [n]
  (pmap (fn [_]
          (let [rnd (java.util.Random.)]
            {:x (.nextDouble rnd)
             :y (.nextDouble rnd)}))
        (range n)))

(assert (not= (sample-parallel-nondeterministic 5)
              (sample-parallel-nondeterministic 5)))
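For comparison, here is a minimal sketch of one possible middle ground, assuming java.util.SplittableRandom (JDK 8+) is acceptable as the generator. A root generator is split serially, once per chunk and in a fixed order, and only the per-chunk sampling runs in parallel. The function name and the chunk-size parameter are assumptions introduced for illustration. Note that the output depends on chunk-size as well as on the seed and n, so chunk-size has to be held fixed to reproduce a sample.

```clojure
(import 'java.util.SplittableRandom)

(defn sample-parallel-deterministic
  "Hypothetical sketch: deterministic for a fixed n, seed and chunk-size."
  [n seed chunk-size]
  (let [root  (SplittableRandom. (long seed))
        ;; size of each chunk; the last one may be shorter
        sizes (map count (partition-all chunk-size (range n)))
        ;; mapv is eager and sequential, so the splits always happen
        ;; in the same order, before any parallel work starts
        rngs  (mapv (fn [_] (.split root)) sizes)]
    (->> (map vector rngs sizes)
         ;; each chunk owns its generator, so threads never share state
         (pmap (fn [[^SplittableRandom rng size]]
                 (vec (repeatedly size
                                  (fn []
                                    {:x (.nextDouble rng)
                                     :y (.nextDouble rng)})))))
         (apply concat)
         vec)))

(let [seed 987324]
  (assert (= (sample-parallel-deterministic 10 seed 3)
             (sample-parallel-deterministic 10 seed 3))))
```

The trade-off in this sketch is that reaching chunk k still requires k serial splits, which is cheap but not constant-time access to an arbitrary index.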
I’m hoping to get some feedback on this! I’m putting this thread next to the watercooler, since I’m looking for more of a discussion than help coding up a solution.
Teodor