Clojure Spec: Instrumenting functions

didibus · August 12, 2018, 8:00pm

It’s best to understand what spec gives you.

A concise and formal declarative DSL for describing data in structure and value.

This can serve as documentation for your data, which both people and machines can understand.

Automated data validation from a declared spec.

Given some data and a spec, spec can tell you whether the data is valid according to that spec. You can use this inside your app wherever you need validation, including for user input. The functions for this are in the main spec namespace: valid?, assert, explain, etc.

The official guide covers this well: https://clojure.org/guides/spec#_using_spec_for_validation
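As a minimal sketch of what that looks like, assuming Clojure 1.9+ with clojure.spec.alpha on the classpath (the ::user spec and its keys are made up for illustration):

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical specs, for illustration only: a user map with a name and an age.
(s/def ::name string?)
(s/def ::age (s/and int? #(<= 0 % 150)))
(s/def ::user (s/keys :req [::name ::age]))

(s/valid? ::user {::name "Ann" ::age 30})   ; => true
(s/valid? ::user {::name "Ann" ::age -1})   ; => false

;; explain-str returns a human-readable description of why validation failed.
(s/explain-str ::user {::name "Ann" ::age -1})
```

valid? only gives you a boolean, so you would typically pair it with explain, explain-str or explain-data when you need to report what went wrong.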

Automated data conformance from a declared spec.

Given data which can take one of many shapes, and a spec for it, spec can tell you which of the possible shapes the given data conforms to. This can be useful for interpreting and parsing data.
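For example, here is a small sketch using s/or, where the ::id spec and its branch names are invented for illustration. Conforming tags the value with the branch it matched:

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical spec: an id is either a number or a string.
(s/def ::id (s/or :numeric int? :textual string?))

(s/conform ::id 42)      ; => [:numeric 42]
(s/conform ::id "abc")   ; => [:textual "abc"]
(s/conform ::id :nope)   ; => :clojure.spec.alpha/invalid
```

You can then dispatch on the tag instead of re-checking the shape yourself.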

Automated data generation from a declared spec.

Given a spec, you can generate valid data for it. This can be done in a way which grows and shrinks, so that the generated data covers the full spectrum of the distribution of possible values for the spec. This can be useful for testing, and for understanding a given spec by seeing example data for it.
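A quick sketch of generation, assuming org.clojure/test.check is on your classpath (spec's generators need it at runtime) and using a made-up ::score spec:

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as gen])

;; Hypothetical spec: an integer from 0 to 100 (the upper bound of int-in is exclusive).
(s/def ::score (s/int-in 0 101))

(gen/generate (s/gen ::score))   ; one random valid value
(gen/sample (s/gen ::score) 5)   ; five samples, which tend to start small
(s/exercise ::score 3)           ; pairs of [generated-value conformed-value]
```

The values are random, so what you see at the REPL will differ from run to run.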

A formal and declarative DSL for describing a function’s input, output, and the relationship between them.

Spec allows you to describe not just data, but also functions, using the same declarative formal DSL.

This can serve as enhanced documentation. It can also be used to validate a function’s input, output, or the relationship between them; to conform its input or output; or to generate example input or output, for tests or to better understand the function through examples.
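Here is a sketch of a function spec, along the lines of the ranged-rand example in the official guide. :args describes the input, :ret the output, and :fn the relationship between the two:

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.repl :refer [doc]])

(defn ranged-rand
  "Returns a random integer in the range start <= n < end."
  [start end]
  (+ start (long (rand (- end start)))))

(s/fdef ranged-rand
  ;; Input: two ints, where start is strictly below end.
  :args (s/and (s/cat :start int? :end int?)
               #(< (:start %) (:end %)))
  ;; Output: an int.
  :ret int?
  ;; Relationship: the result lies in [start, end).
  :fn (s/and #(>= (:ret %) (-> % :args :start))
             #(< (:ret %) (-> % :args :end))))

;; doc prints the docstring followed by the registered spec.
(doc ranged-rand)
```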

Running generative testing on your specified functions.

Given a function and a spec, spec can run a configurable number of generated inputs from the input spec through the function, and assert that each returned output is valid according to the output spec and that the relationship between input and output holds true.

This can be used for performing broad coverage unit tests.
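A rough sketch of such a test, assuming org.clojure/test.check is available. The clamp function and its spec are hypothetical, written just for this example:

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest])

(defn clamp
  "Hypothetical example fn: restricts x to the range [lo, hi]."
  [lo hi x]
  (-> x (max lo) (min hi)))

(s/fdef clamp
  :args (s/and (s/cat :lo int? :hi int? :x int?)
               #(<= (:lo %) (:hi %)))
  :ret int?
  :fn #(<= (-> % :args :lo) (:ret %) (-> % :args :hi)))

;; Run 100 generated argument lists through clamp, checking :ret and :fn
;; each time, then print and return a summary of the results.
(stest/summarize-results
  (stest/check `clamp {:clojure.spec.test.check/opts {:num-tests 100}}))
```

When a check fails, the result map contains a :failure entry along with a shrunk input that reproduces it.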

Automated instrumentation of your specified functions, so that they are extended to validate their input prior to executing their body.

Given a symbol or symbols pointing to vars containing functions, it can instrument the functions pointed to so that they validate their inputs prior to executing their body. You can also have spec instrument all specified functions, instead of specific ones.

This is useful for development, to make sure you’re composing your functions in valid ways, and can also be useful in some tests.
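As a sketch, with a hypothetical percent function. Note that instrument only checks the :args part of the spec, not :ret or :fn:

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest])

(defn percent
  "Hypothetical example fn: part as a percentage of whole."
  [part whole]
  (* 100.0 (/ part whole)))

(s/fdef percent
  :args (s/cat :part number?
               :whole (s/and number? (complement zero?))))

;; Instrument one fn by symbol; (stest/instrument) with no args
;; instruments every spec'd fn that is currently loaded.
(stest/instrument `percent)

(percent 1 4)   ; => 25.0
;; (percent 1 0) now throws an ex-info describing the spec failure
;; before the body runs, rather than an ArithmeticException from it.

;; Restore the original, unchecked function.
(stest/unstrument `percent)
```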

With all this, I hope you get a better idea of the different features offered by spec and their primary use cases.

With that said, you wondered what fns to instrument?

You should instrument all the ones that are worth having a spec for, but only at development and testing time.

You can still instrument things in production, but it’s not recommended, for two reasons. First, it suggests you haven’t tested your code enough, and that you believe there are still bugs in the way you composed your functions together which you want to catch through validation. Second, it will have a considerable performance impact.

There are some functions, though, for which you don’t control the input range. User input, or data from a file or a DB, are examples. For those, use the spec validation functions like valid?, explain, assert, etc. Why not use instrument for these? Because that validation logic isn’t for testing; it is actual logic which must be part of your end program, and you want it to be explicit in your code. During a code review, people should see that you are validating these inputs and handling validation failures in a meaningful way. Instrument is an implicit way to add validation: it is hidden, and you cannot handle the failures in a customized way appropriate for the use case. It’s meant for testing while in development.
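A sketch of what explicit validation at a boundary might look like. The handle-signup function, the ::signup spec and the response maps are all hypothetical:

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical spec for external input; the regex is deliberately loose.
(s/def ::email (s/and string? #(re-matches #".+@.+" %)))
(s/def ::signup (s/keys :req-un [::email]))

(defn handle-signup
  "Hypothetical handler: validates external input explicitly and returns
  a caller-friendly error instead of throwing."
  [params]
  (if (s/valid? ::signup params)
    {:status 200 :body "ok"}
    {:status 400 :body (s/explain-str ::signup params)}))

(handle-signup {:email "ann@example.com"})   ; => {:status 200, :body "ok"}
(:status (handle-signup {:email "nope"}))    ; => 400
```

The validation and the failure handling are both visible in the function body, which is the point: a reviewer can see exactly what happens to bad input.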

That said, you can still do it if you want. I know some people instrument in prod also. If you don’t mind the performance hit, and you know you’ve tested your code properly, and you’ve added explicit validation in places where you take external input, like user input, then you can leave instrumentation on in prod also.