Spec and post condition on lazySeq

Hello,
I’m trying to learn spec, and during my experiments I ran into a problem that I hope someone can help me solve. Here it is:

I want to use :pre and :post conditions on a function that receives a list of strings and returns a list of capitalized strings (not very useful, but good for illustration). So I wrote this:

(require '[clojure.spec.alpha :as s]
         '[clojure.string :as str])

(s/def :exp/username string?)
(s/def :exp/players (s/coll-of :exp/username
                               :kind list?))

(defn capitalize [players]
  {:pre  [(s/valid? :exp/players players)]
   :post [(s/valid? :exp/players %)]}
  (map str/capitalize players))

(defn start []
  (capitalize '("bob" "alice" "tom")))

With the above example, the :post validation fails. I assume this is because capitalize actually returns a LazySeq and not a PersistentList, so the list? predicate returns false.

Is that correct?

Assuming it is, I tried to use doall because I understood it would consume the entire sequence and so, maybe, return a PersistentList, but from what I could see it doesn’t change the type of the sequence.

In the end I forced the type conversion by applying list:

(defn capitalize [players]
  {:pre  [(s/valid? :exp/players  players)]
   :post [(s/valid? :exp/players  %)]}
  (apply list (map str/capitalize players)))  ;; <--

And then it works… but I have doubts:

  • is this a good way to proceed?
  • is it a good idea to validate a lazy sequence in the first place? I assume there is no other way than to consume the entire sequence, which cancels the benefit of laziness
  • should I use seq? instead of list?

Any clarification and advice on this subject will be very welcome.
Thanks in advance
ciao
:sunglasses:

coll-of will default to the more general coll? if you don’t supply an option for :kind:

(s/def :exp/players   (s/coll-of :exp/username))

which works based off an interface that all clojure collections (and typically user defined ones) implement:

user> (source coll?)
(defn coll?
  "Returns true if x implements IPersistentCollection"
  {:added "1.0"
   :static true}
  [x] (instance? clojure.lang.IPersistentCollection x))

The Clojure sequence library functions (like take, map, filter, etc.), when provided a collection, will project the input onto a seq if possible (by invoking seq on it) and then do their work through the seq abstraction, yielding a seq in return. It is idiomatic to then “pour” the result into a concrete type if you need one; otherwise you can just deal with seqs. [Note: if you don’t provide a collection argument, the semantics change and most of these same functions yield a transducer instead of a lazy sequence, which is another topic…]
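As a small sketch of what “pouring” looks like in practice (the name `incremented` is only illustrative):

```clojure
;; map yields a lazy seq, whatever the input collection type was.
(def incremented (map inc [1 2 3]))

(seq? incremented)           ;; => true
(vector? incremented)        ;; => false

;; "Pour" the seq into the concrete type you need:
(into [] incremented)        ;; => [2 3 4]
(vec incremented)            ;; => [2 3 4]
(into #{} incremented)       ;; => a set containing 2, 3 and 4

;; Or build the concrete type directly, skipping the intermediate seq:
(mapv inc [1 2 3])           ;; => [2 3 4]
(into [] (map inc) [1 2 3])  ;; => [2 3 4], here (map inc) is a transducer
```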

Semantically, the difference between lazy sequences and persistent lists is minor, since you can typically substitute one for the other in most operations without noticing a difference (e.g. lists are seqable?), except for count. Operationally, lazy seqs are potentially unbounded sequences of values that are realized and cached on demand, while lists are fully realized and avoid some overhead (e.g. count on a persistent list is constant time, while you have to traverse a lazy seq to count it).
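A quick way to see both points at the REPL (the names `a-list` and `a-lazy-seq` are made up for this sketch):

```clojure
(def a-list     '(1 2 3))
(def a-lazy-seq (map inc [0 1 2]))

;; Interchangeable as values:
(= a-list a-lazy-seq)                   ;; => true
(= (first a-list) (first a-lazy-seq))   ;; => true

;; Different operationally: counted? reports whether count is constant time.
(counted? a-list)                       ;; => true
(counted? a-lazy-seq)                   ;; => false
(count a-lazy-seq)                      ;; => 3, found by walking the seq
```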

doall merely traverses a seq and forces its elements to be realized (and cached), returning the head of the original seq (so it’s a seq->seq function). This is why the list? predicate still failed after doall and why the coercion in the second version worked (lazy seqs do not implement IPersistentList, but lists do).
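You can check this directly; here is a minimal sketch using the data from the question (the names `lazy`, `forced` and `as-list` are just for illustration):

```clojure
(require '[clojure.string :as str])

(def lazy    (map str/capitalize '("bob" "alice" "tom")))
(def forced  (doall lazy))       ; realizes the elements, returns the same object
(def as-list (apply list lazy))  ; builds a new persistent list

(class lazy)              ;; => clojure.lang.LazySeq
(identical? lazy forced)  ;; => true
(list? forced)            ;; => false
(list? as-list)           ;; => true
(= lazy as-list)          ;; => true, same contents, different concrete type
```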

It is typical to target the more general abstractions unless you have an operational need for a specific concrete type (e.g. guaranteed bounds on complexity for certain operations). It may be important to ensure that the result is a vector rather than a list or seq if you are leveraging operations that are efficient on vectors, like nth and count, since the core libraries also implement those operations generically on sequences, and almost anything can be viewed as seqable. For something like this example, unless there is an operational need for a list, I would just go with the flow and expect coll? if you want to allow any Clojure collection type (which includes seqs); seq? if you specifically want sequences (lazy seqs, and also lists, which implement ISeq) and want to preclude things like vectors, maps, and sets that merely have seq projections; or seqable? if you want anything that can be coerced via seq.
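To compare the predicates side by side, here is a rough sketch. `classify` is a hypothetical helper written for this post, and seqable? assumes Clojure 1.9 or later. Note that a list satisfies seq? too:

```clojure
;; Returns [coll? seq? seqable? list?] for a value.
(defn classify [x]
  (mapv #(% x) [coll? seq? seqable? list?]))

(classify '(1 2))            ;; => [true true true true]
(classify (map inc [1 2]))   ;; => [true true true false]
(classify [1 2])             ;; => [true false true false]
(classify {:a 1})            ;; => [true false true false]
(classify #{1})              ;; => [true false true false]
(classify "ab")              ;; => [false false true false]
```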

I would probably just go with coll? or elide the :kind argument altogether (same result).
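Applied to the function from the question, that would look roughly like this (a sketch reusing the original names; only the :kind option is dropped):

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.string :as str])

(s/def :exp/username string?)
(s/def :exp/players (s/coll-of :exp/username))  ; no :kind, so any coll? passes

(defn capitalize [players]
  {:pre  [(s/valid? :exp/players players)]
   :post [(s/valid? :exp/players %)]}
  (map str/capitalize players))  ; returning a lazy seq now satisfies :post

(capitalize '("bob" "alice" "tom"))  ;; => ("Bob" "Alice" "Tom")
```

The conditions still catch bad contents: something like (capitalize [1 2]) fails the :pre check with an AssertionError.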

Thanks for this clear explanation.

Regarding the spec, correct me if I’m wrong, but this spec:

(s/def :exp/players   (s/coll-of string?))

when checked with s/valid?, will need to entirely consume the lazy sequence in order to validate the string? predicate on every item, right?
On the other hand, the following spec will not consume anything and will just check the type:

(s/def :exp/players   coll?)

If this is correct, it means that spec does more than check pre and post conditions on functions: it may also affect execution (for example, by forcing the evaluation of a lazy seq).

:sunglasses:

coll-of will necessarily traverse the collection to ensure every item validates against string? in order to match the spec. coll? is just going to dispatch on the type. So yes, coll? will not force the sequence to be realized, but it is less useful than s/coll-of because it doesn’t tell you anything about the contents. That is the primary utility of spec: expressing arbitrarily complex specifications about the type and shape of the input (instead of just type checking).

user=> (def xs (map (fn [x] (println [:hello x]) x) (range  10)))
#'user/xs
user=> (coll? xs)  ;no side effect from printing....
true
user=> (first xs)  ;;lazy seqs are chunked in 32 elements, so we actually force <=32 entries to realize and get printing as a side effect.
[:hello 0]
[:hello 1]
[:hello 2]
[:hello 3]
[:hello 4]
[:hello 5]
[:hello 6]
[:hello 7]
[:hello 8]
[:hello 9]
0
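The same experiment can be run against the two specs. This is only a sketch: it counts realized elements with an atom instead of printing, and the names `realized-count` and `tracked` are made up:

```clojure
(require '[clojure.spec.alpha :as s])

(def realized-count (atom 0))
(def tracked (map (fn [x] (swap! realized-count inc) x) (range 10)))

;; A bare predicate only looks at the type: nothing is realized.
(def ok-by-type  (s/valid? coll? tracked))                ; true
(def after-coll? @realized-count)                         ; 0

;; coll-of has to look at every element: all ten are realized.
(def ok-by-items   (s/valid? (s/coll-of number?) tracked)) ; true
(def after-coll-of @realized-count)                        ; 10
```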

it means that using spec is not only performing pre and post conditions on functions, it may also affect the execution (like for example force evaluation of a lazy seq)

Yeah…

user=> (s/valid? (s/coll-of number? ) (range Long/MAX_VALUE))
;;waiting for out of memory error....

This is not just true of spec, but of any Clojure function that is eager: count, reduce, dorun, doall, into, every?, some, vec, set, etc. will all partially or fully realize a sequence since they traverse its elements, and if the head of the sequence is retained somewhere, the realized portions will be retained too.
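Here is a sketch of “partially or fully”, again counting realized elements with an atom (the names `calls` and `nums` are illustrative). How many elements a short-circuiting function touches depends on chunking, so treat the intermediate number as typical rather than guaranteed:

```clojure
(def calls (atom 0))
(def nums  (map (fn [x] (swap! calls inc) x) (range 100)))

;; some stops at the first match, but the seq is realized a chunk at a time.
(def found      (some #(> % 5) nums))   ; true
(def after-some @calls)                 ; fewer than 100, typically 32

;; count has to walk everything; already realized elements are cached.
(def total       (count nums))          ; 100
(def after-count @calls)                ; 100
```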

OTOH, you have no way to validate the contents of a sequence without realizing some or all of it. If you are not holding unbounded, infinite, or huge sequences in memory, you should be okay. Even then, you can use clojure.spec.alpha/every to perform non-exhaustive checking of the input.

user=> (s/valid? (s/every number? ) (range Long/MAX_VALUE))
true  ;;returns almost instantly, since s/*coll-check-limit* is 101
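The trade-off is that s/every can miss a bad element that sits beyond the sampled portion. A sketch of that, where `mostly-numbers` is made-up data with an invalid element at index 200:

```clojure
(require '[clojure.spec.alpha :as s])

(def mostly-numbers (concat (range 200) [:oops]))

;; Only the first s/*coll-check-limit* elements of a seq are checked.
(s/valid? (s/every number?) mostly-numbers)     ;; => true, :oops is never reached

;; coll-of is exhaustive, so it finds the bad element.
(s/valid? (s/coll-of number?) mostly-numbers)   ;; => false

;; The limit is a dynamic var and can be raised where needed.
(binding [s/*coll-check-limit* 500]
  (s/valid? (s/every number?) mostly-numbers))  ;; => false
```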

Aside from potential space considerations (seqs that are retained and realized may exceed memory, when the caller expected earlier realized-but-no-longer-used portions to be garbage collected during processing) the side-effect of realizing seqs isn’t that big of a deal since they are immutable. The implicit realization and caching is a benign effect and effectively unobserved (from the perspective of getting values from the seq). The value of the seq is unaffected (hashes the same, yields same equality and comparisons).
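A small sketch of that last point; realized? reports whether a lazy seq has been forced yet, and the names are illustrative:

```clojure
(def lazy-nums (map inc (range 5)))

(def realized-before (realized? lazy-nums))            ; false, nothing forced yet
(def equal-to-list   (= lazy-nums '(1 2 3 4 5)))       ; true, comparing forces it
(def realized-after  (realized? lazy-nums))            ; true
(def same-hash       (= (hash lazy-nums)
                        (hash '(1 2 3 4 5))))          ; true
```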

Thanks again, now all is clear for me :+1:

:sunglasses: