Clojure Spec: Instrumenting functions

clojure_spec

#1

Hi!

As I am making my way through spec, I’ve run into a question: how do you instrument functions?

Basically, as far as I know, there are two ways:

  1. fdef
(spec/fdef get-user
  :args (spec/cat :logic-config ::logic-config :user-name ::user-name)
  :ret ::user)

(defn get-user
  "Get user `USER-NAME` from `LOGIC-CONFIG`"
  [logic-config user-name]
  (get-in logic-config [:users user-name]))
(stest/instrument `get-user)
  2. :pre/:post
(defn get-user
  "Get user `USER-NAME` from `LOGIC-CONFIG`"
  [logic-config user-name]
  {:pre [(spec/valid? ::user-name user-name)
         (spec/valid? ::logic-config logic-config)]
   :post [(spec/valid? ::user %)]}
  (get-in logic-config [:users user-name]))

And here are my concerns:

  1. fdef has way better error messages.
  2. :pre/:post is easier to use regularly, requires less code.
  3. Writing fdef + stest/instrument for every function seems a bit like overkill, especially compared to a simple :pre/:post, assuming that a spec for every argument is already there and fdef just maps those specs to the function in a separate place.
  4. However, giving up on spec checking in very basic functions because fdef is inconvenient to use seems wrong - it’s better to use :pre/:post than nothing. But then one ends up with an inconsistent mix of fdef and :pre/:post (and with inconsistent error messages).
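
To make the error-message difference in points 1 and 4 concrete, here is a minimal sketch that triggers both kinds of failure and captures what gets thrown. The specs (`::user`, `::users`, `::logic-config`) and the names `get-user-fdef`, `get-user-pre` and `failure` are hypothetical stand-ins, not the definitions from the original project.

```clojure
(require '[clojure.spec.alpha :as spec]
         '[clojure.spec.test.alpha :as stest])

;; Hypothetical specs, assumed for illustration only.
(spec/def ::user-name string?)
(spec/def ::user map?)
(spec/def ::users (spec/map-of ::user-name ::user))
(spec/def ::logic-config (spec/keys :req-un [::users]))

;; Variant 1: contract declared separately with fdef.
(defn get-user-fdef [logic-config user-name]
  (get-in logic-config [:users user-name]))

(spec/fdef get-user-fdef
  :args (spec/cat :logic-config ::logic-config :user-name ::user-name)
  :ret ::user)

;; Variant 2: contract embedded as a :pre condition.
(defn get-user-pre [logic-config user-name]
  {:pre [(spec/valid? ::logic-config logic-config)
         (spec/valid? ::user-name user-name)]}
  (get-in logic-config [:users user-name]))

(stest/instrument `get-user-fdef)

(defn failure
  "Calls f and returns a description of whatever it throws, or nil."
  [f]
  (try
    (f)
    nil
    (catch Throwable t
      {:class (class t) :data (ex-data t)})))

;; Both calls pass a number where a string user name is expected.
(def fdef-failure (failure #(get-user-fdef {:users {}} 42)))
(def pre-failure  (failure #(get-user-pre {:users {}} 42)))
```

The instrumented call throws a `clojure.lang.ExceptionInfo` whose `ex-data` carries the spec problems, while the `:pre` version throws a plain `AssertionError` with no attached data, which is why the fdef messages read so much better.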

It seems like all those issues can be addressed by a rather simple macro over defn that just incorporates fdef - but I’d rather avoid creating my own language.

What is your approach? Do you stick only to fdef? Or pre/post? Maybe you instrument only public functions and assume that’s enough? Or fdef for defn, and :pre/:post in defn-?

Is there any recommended approach? Or is it all because of the alpha in clojure.spec.alpha, and that’s just how it is right now?

Thanks,
Slawek

PS: I made a small edit due to my misunderstanding.


#2

Shameless plug: https://github.com/j-cr/speck

Also notice you don’t need to call instrument for every function, you just need to call it once to instrument all the vars. And you probably want to use orchestra instead of vanilla stest.
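
A minimal sketch of the "call it once" point, using only vanilla `clojure.spec.test.alpha` (orchestra is not shown). The function `double-it` and its spec are hypothetical examples.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest])

;; Hypothetical function with an fdef.
(defn double-it [n] (* 2 n))

(s/fdef double-it
  :args (s/cat :n number?)
  :ret number?)

;; With no arguments, instrument covers every instrumentable var that is
;; currently loaded and returns the symbols it instrumented.
(def instrumented (stest/instrument))

;; The call below now fails the :args spec instead of failing inside `*`.
(def outcome
  (try
    (double-it "x")
    (catch clojure.lang.ExceptionInfo _ :rejected-by-spec)))
```

Calling `(stest/unstrument)` with no arguments reverses this for all vars. Note that vanilla instrument only checks `:args`, not `:ret` or `:fn`.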


#3

Considering spec is still alpha, I am afraid to invest in a wrapper library like speck. I don’t know what the plans for the future are, but I assume that spec will be more tightly integrated with Clojure core.

What I am more curious about is people’s approach to spec in their projects when using only the tools available in clojure.*. In other words: what’s the current “good practice” for function specs in pure Clojure (without plugins)?


#4

Speck is specifically not a wrapper over defn, it’s just metadata on normally defned functions.


#5

That’s true, of course. But still, this is something outside the standard and may become obsolete soon (or not - only the Clojure authors know).


#6

Here’s my understanding of the best practices from the perspective of the core team.

You shouldn’t spec all your functions. Only spec the pure ones that are hard to test, and you want a good generative test run on them. And spec the ones that manipulate your entities, and you want good documentation on the input and what must be included and all that.

For the above, use fdef. For the pure ones, set up generative unit tests. For the others, you only want to instrument them in development, not in production.

If there are invariants that you want to validate in production, then you can check those inside pre/post predicates, or even sprinkle asserts/validates within your functions, wherever you need to check the invariant.

At least, that’s my impression of the core team’s vision on how to use spec.


#7

Speck doesn’t use any of the clojure.spec internals (I believe Alex Miller has stated that alpha == those may change at any time), so if for some reason speck would break, then all your fdefs will break too; after all, speck is just that - a thin wrapper around fdef. You still write the same specs you would write with vanilla fdef.

As for the original question, I would say you definitely should prefer fdef to :pre/:post. If you find fdefs too heavy to read, consider placing them not right next to the defns you’re specing, but e.g. at the beginning of your file. So first you define the interface, then the implementation. Also, this way you probably end up specing only the public (and maybe some important/complex) functions, because using fdef for documentation purposes, i.e. to describe the dataflow of values through all of your functions, produces just too much noise. I think the main strength of fdef is exactly that it allows you to decouple (both in location and in time) the contract from the implementation, and I believe that’s what it was designed for in the first place.


#8

It’s hard for me to agree - because you take spec and generators as one - while I feel like it’s perfectly fine to use spec only without generators (I am new here, but it seems like in some cases writing generators will cost too much effort to be actually useful).

Exactly - just like declaration vs definition in C.

So my current rule-of-thumb for spec is:

  1. Focus fdef on functions where input comes from, for example: the public interface of a library, functions taking input from the user, fetching data from the Internet, or reading it from the hard drive. Optionally, spec internal functions where you suspect programmer error or confusion. Also, fdef places of critical actions/processing, or with side effects, where invalid input data may result in inconsistent state.
  2. Avoid using :pre/:post, unless you use speck, or until some time in the future when defn includes spec support in metadata.
  3. If the input point is specd, assume data is valid. No need to check twice; write your functions only for the correct data. Still write tests to ensure that functions work as expected. You can use generators to mock input.

What do you think?


#9

I think that sounds good.

I’d note, however, that I completely agree with the post by didibus that you quoted - one of the main points of spec is its ability to generate the generators (how meta, lol), so, yeah. If your spec is specific enough, then you don’t need to write any custom generators, it just works. On the other hand, sometimes I just declare specs as any? and use them solely for documentation (with the prospect of maybe making them more specific later), but as I’ve said, with vanilla fdef it’s too much noise.

Also, one important point that I think you might be missing (or maybe not) judging from the first item in your list: fdef is instrumenting functions, your app’s behavior should NOT change whether instrumenting is turned on or not. So you can’t verify that user input is valid with fdef. If you’re a library author, and your “user” is another programmer, then yes, you can validate e.g. inputs to your macro with fdef; but that would happen at DEV time, not RUN time. Don’t mix the logic of your app (including data validation) with the instrumentation. Of course, you can (try to) use the same entity (i.e. data) specs for both instrumentation and user input validation.
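
As a rough illustration of keeping user-input validation in the app logic rather than in instrumentation, here is a sketch that reuses a data spec through `s/valid?` and `s/explain-str`. The spec `::user-name` and the function `parse-user-name` are hypothetical.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical spec for a user-supplied value.
(s/def ::user-name
  (s/and string? #(re-matches #"[a-z0-9_]{3,16}" %)))

(defn parse-user-name
  "Validates raw input explicitly and returns a result map instead of
  throwing, so the caller decides how to report the problem."
  [raw]
  (if (s/valid? ::user-name raw)
    {:ok raw}
    {:error (s/explain-str ::user-name raw)}))
```

This code behaves the same whether or not anything is instrumented, and the same `::user-name` spec can still appear in an fdef for development-time checks.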


#10

I wish pre and post supported custom messages. Sigh, https://dev.clojure.org/jira/browse/CLJ-1817 for your voting interest.


#11

I had no idea!

And everything begins to make sense now. Especially the term instrument finally seems properly related to testing.

I have an idea of my updated rule-of-thumb for spec, but I’d like to hear from you first.

What functions to instrument then?

Now it seems like I should only instrument critical functions to prevent major failures and use :pre/:post in less-but-still-important places in order to limit the magnitude of destruction if something goes wrong. But still, this should not be used for end-user (like GUI) data verification - in such cases explicit error handling together with meaningful error messages should be involved.

This seems like there will be only a few instrumented functions, and maybe a few more with :pre/:post.


#12

What functions to instrument then?

Well I think didibus put it nicely (pure ones that benefit from generative tests + anything you want a good doc on). Also if you’re writing anything that will be used by another dev (or yourself in a couple months) and has a somewhat clear boundary - try to spec the “public api” of that module. And as I’ve said, if you’re using something like speck, then you can add (maybe stub) specs to just about everything.

Also, let’s say you’ve found that one of your functions has a bug, but you’re not sure where exactly down the callstack things go wrong. Like, I don’t know, maybe some :count field goes negative for example. Well, you can just fire up a debugger, fix the bug on this particular code path and call it a day (and then a week later stumble upon another incarnation of the same bug and start all over again). Or you can write a bunch of example-based unit tests for functions down the callstack (possibly lots of them), so the bug is properly documented and regression testing is happening, etc etc, right? OR you can write a single integration test for the top-level function PLUS add some specs to the functions down the callstack, and voilà, when you run the integration test it will tell you exactly where things went wrong.

So you get (almost) all of the benefits of individual unit tests without having to actually write them, plus as a bonus you get nicer docs, a declarative dataflow description and free generative tests for some of those functions! Well, you get the idea. Of course, there are some caveats (good specs aren’t always easy to write, sometimes you still want the particular failing example in your tests (though often you can generate/shrink to that and then add it to the unit tests), etc), and of course it’s not a silver bullet, but it’s certainly an improvement worth trying.

For end-user data verification, you can try to use s/valid? with your own logic of handling a mismatch. :pre/:post or, better yet, s/assert are probably a good fit for your models, e.g. when you want to be sure that bad data don’t go into the db.
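
A small sketch of the `s/assert` idea for guarding a model before it reaches storage. The specs and `save-record!` are hypothetical, and an atom stands in for a real database.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::count nat-int?)
(s/def ::record (s/keys :req-un [::count]))

;; s/assert does nothing unless runtime checking is switched on.
(s/check-asserts true)

(defn save-record!
  "Hypothetical persistence function; `db` is an atom standing in for a
  real database."
  [db record]
  (s/assert ::record record) ; throws ex-info when the record is invalid
  (swap! db conj record))

(def db (atom []))

(save-record! db {:count 1})

;; A negative :count is rejected before it is stored.
(def rejected
  (try
    (save-record! db {:count -1})
    (catch clojure.lang.ExceptionInfo _ :rejected)))
```

Unlike `:pre`/`:post`, `s/assert` can be toggled at runtime with `s/check-asserts`, and its exception includes the spec explanation.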

I guess those are the general ideas, maybe others would add something else?


#13

It’s best to understand what spec gives you.

  1. A concise and formal declarative DSL for describing data in structure and value.

This can serve as documentation for your data, which both people, as well as machines can understand.

  2. Automated data validation from a declared spec.

Given some data, and a spec, Spec can tell you if the data is valid as per the given spec. You can use this inside your app when you need validation. This includes user input. The functions for this are in the main spec namespace: valid?, assert, explain, etc.

The official guide covers this well https://clojure.org/guides/spec#_using_spec_for_validation

  3. Automated data conformance from a declared spec.

Given data which can be of many shapes, and a spec for it, spec can tell you which one of the many possible shapes the given data conforms to. This can be useful for interpreting data and parsing data.
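
For instance, a minimal sketch with a hypothetical `::user-ref` spec that accepts two shapes:

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical spec: a user reference is either a numeric id or a name.
(s/def ::user-ref (s/or :id pos-int? :name string?))

(def by-id   (s/conform ::user-ref 42))       ; => [:id 42]
(def by-name (s/conform ::user-ref "slawek")) ; => [:name "slawek"]
(def neither (s/conform ::user-ref :nope))    ; => :clojure.spec.alpha/invalid
```

The tag in the conformed value tells the caller which branch matched, so it can dispatch on `:id` versus `:name` without re-inspecting the data.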

  4. Automated data generation from a declared spec.

Given a spec, you can generate valid data for it. This can be done in a way which grows and shrinks, so that the generated data covers the full spectrum of the distribution of possible values for the spec. This can be useful for testing and understanding a given spec by being able to see example data for it.
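
A short sketch of generation, assuming org.clojure/test.check is on the classpath (spec loads it lazily for generators). The specs are hypothetical.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as gen])

;; Hypothetical specs built only from predicates with built-in generators.
(s/def ::count nat-int?)
(s/def ::user-name string?)
(s/def ::user (s/keys :req-un [::user-name ::count]))

;; One random value that satisfies ::user.
(def sample (gen/generate (s/gen ::user)))

;; Five [generated-value conformed-value] pairs for ::count.
(def pairs (s/exercise ::count 5))
```

The generated values differ on every run, so this is mostly useful at the REPL for seeing what a spec actually admits.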

  5. A formal and declarative DSL for describing function’s input, output and the relationship between them.

Spec allows you to describe not just data, but also functions, using the same declarative formal DSL.

This can serve for enhanced documentation. Or to validate input, output or the relationship between input and output. Or to conform input or output of a function. Or to generate example input or output for tests or to better understand a function through examples.

  6. Running generative testing on your specified functions.

Given a function, and a spec, it can try X number of generated inputs from the input spec on the given function, and assert that the returned output is valid according to the output spec, and that the relationship between input and output holds true.

This can be used for performing broad coverage unit tests.
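
A minimal sketch of such a generative run, again assuming org.clojure/test.check is available. The function `clamp` and its spec are hypothetical examples chosen so that `:args`, `:ret` and `:fn` are all exercised.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest])

;; Hypothetical pure function: restrict n to the range [lo, hi].
(defn clamp [lo hi n]
  (max lo (min hi n)))

(s/fdef clamp
  ;; Inputs: three ints, with the extra constraint lo <= hi.
  :args (s/and (s/cat :lo int? :hi int? :n int?)
               #(<= (:lo %) (:hi %)))
  ;; Output: an int.
  :ret int?
  ;; Relationship: the result always lies within [lo, hi].
  :fn #(<= (-> % :args :lo) (:ret %) (-> % :args :hi)))

;; Run 50 generated calls against the spec.
(def results
  (stest/check `clamp {:clojure.spec.test.check/opts {:num-tests 50}}))

;; Prints each result and returns a map of counts by outcome.
(def summary (stest/summarize-results results))
```

A failing property would show up as a `:check-failed` count in the summary, along with a shrunk counterexample in the corresponding result.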

  7. Automated instrumentation of your specified functions, so that they are extended to validate their input prior to executing their body.

Given a symbol or symbols pointing to vars containing functions, it can instrument the functions pointed to so that they validate their inputs prior to executing their body. You can also have spec instrument all specified functions, instead of specific ones.

This is useful for development, to make sure you’re composing your functions in valid ways, and can also be useful in some tests.

With all this, I hope you get a better idea of the different features offered by spec and their primary use cases.

With that said, you wondered what fns to instrument?

You should instrument all the ones that are worth having a spec for, but only at development and testing time.

You can still instrument things in production, but it’s not recommended, for two reasons. First, it sounds like you haven’t tested your code enough if you believe you still have bugs in the way you composed your functions together that you want to validate against. Second, it will have a considerable performance impact.

There are some functions, though, for which you don’t control the input range. User input or data from a file or DB are such examples. For those, use the spec validation functions like valid?, explain, assert, etc. Why not use instrument for these? Because that validation logic isn’t for testing, it is actual logic which must be part of your end program, and you want it to be explicit in your code. During a CR, people should see that you are validating these and handling failures to validate in a meaningful way. Instrument is an implicit way to add validation, one that is hidden, and where you cannot handle the failures in a customized way appropriate for the use case. It’s meant for testing while in development.

That said, you can still do it if you want. I know some people instrument in prod also. If you don’t mind the performance hit, and you know you’ve tested your code properly, and you’ve added explicit validation in places where you take external input, like user input, then you can leave instrumentation on in prod also.


#14

Thank you for all your replies! Now I have at least an idea what to instrument.

That’s hopefully my last question. Am I supposed to perform automatic generative checks of instrumented functions (that is, based on fdef) in the test namespace, so that lein test will also perform something similar to c.spec.test.alpha/check?

If so, how am I supposed to do this?

My best try was something like:

(stest/summarize-results (stest/check `sut/translate {:clojure.spec.test.check/opts {:num-tests 15}}))

but it failed:

ERROR in (auto-generator-test) (FutureTask.java:122)
Uncaught exception, not in assertion.
expected: nil
  actual: java.util.concurrent.ExecutionException: java.lang.ClassCastException: clojure.lang.AFunction$1 cannot be cast to clojure.lang.MultiFn, compiling:(clojure/test/check/clojure_test.cljc:95:1)
 at java.util.concurrent.FutureTask.report (FutureTask.java:122)
    java.util.concurrent.FutureTask.get (FutureTask.java:192)
    clojure.core$deref_future.invokeStatic (core.clj:2292)
    clojure.core$future_call$reify__8097.deref (core.clj:6894)
    clojure.core$deref.invokeStatic (core.clj:2312)
    clojure.core$deref.invoke (core.clj:2298)
    clojure.core$map$fn__5587.invoke (core.clj:2747)

#15

You did include [org.clojure/test.check "0.10.0-alpha3"] in your dev dependencies, right? I’m not sure where this exception comes from, it works for me…

Honestly, the whole spec/test integration is a mess though. You have to do something like this to simply run a test:

(is
 (not
  (:check-failed
   (s.test/summarize-results
    (s.test/check
     (s.test/enumerate-namespace 'your.ns)
     {:clojure.spec.test.check/opts {:num-tests 15}})))))

This is horrible, right? I don’t understand why they refuse to add any spec/test integration, this should work out of the box.

See this gist for a workaround though: https://gist.github.com/kennyjwilli/8bf30478b8a2762d2d09baabc17e2f10#gistcomment-2585384

Oh, actually there’s a link to the issue about your exception there too: https://github.com/technomancy/leiningen/issues/2173


#16

I agree, spec+check and clojure.test integration is a mess. There is test.chuck, though, which provides some helpful integration, but we still have nothing for spec.test/check.


#17

I second the recommendation for test.chuck - the test.check integration with clojure.test is excellent.

FWIW, Expound can print the results from check, which may be useful in understanding the output: https://github.com/bhb/expound#printing-results-for-check