Spec-tools 0.7.0

We just released spec-tools 0.7.0. Spec-tools is a utility library on top of clojure.spec, adding some things we have needed that are not part of the core library. Mostly a playground for ideas, both good and bad :wink: These include:

  • extendable specs and spec metadata (via Spec Records)
  • spec value transformations (e.g. coercion)
  • spec visitors
  • data-specs
  • JSON Schema generation
  • Swagger2 Schema generation

The new version introduces spec-driven, two-way transformations and contains some bug fixes.

More details in a post: https://www.metosin.fi/blog/spec-transformers/

An example of round-tripping an inst? value from JSON and back:

(require '[spec-tools.core :as st])

(as-> "2014-02-18T18:25:37Z" $
      (doto $ prn)
      (st/decode inst? $ st/json-transformer)
      (doto $ prn)
      (st/encode inst? $ st/json-transformer)
      (doto $ prn)
      (st/decode inst? $ st/json-transformer)
      (prn $))
; "2014-02-18T18:25:37Z"
; #inst "2014-02-18T18:25:37.000-00:00"
; "2014-02-18T18:25:37.000+0000"
; #inst "2014-02-18T18:25:37.000-00:00"
10 Likes

Reposting here from Reddit, since the article said comments should be here.

First, I want to say I really like the work done on spec-tools. And even though I’m going to challenge it here, it’s not because I think it’s a bad idea, but to be sure it’s a good idea. Also, I like that the coercion is no longer called conform, because that was really confusing, since conform isn’t supposed to coerce values.

Transforming specced values is not in scope of clojure.spec, but it should [be].

Did I miss the arguments for this?

I don’t see why you’d want to turn coercion and structure-to-structure transformation into some complicated declarative spec mini-DSL, instead of simply using a normal function to convert it.

Say I get some JSON: just write a JSON->MySpec fn, and you’re done. If you let anything “automatically” handle the transform, you risk being surprised by the result and causing hidden bugs.
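As a minimal sketch of what I mean (the ::user spec and the json->user fn are made up for illustration):

(require '[clojure.spec.alpha :as s])

(s/def ::id int?)
(s/def ::created inst?)
(s/def ::user (s/keys :req-un [::id ::created]))

;; an explicit, hand-written conversion from parsed JSON to ::user
(defn json->user [{:keys [id created]}]
  {:id      (long id)
   :created (java.util.Date/from (java.time.Instant/parse created))})

(s/valid? ::user (json->user {:id 1 :created "2014-02-18T18:25:37Z"}))
; => true

The conversion is explicit, so there are no surprises about what the data becomes.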

What am I missing?

As another way to look at it, weakly typed languages like JavaScript, which do automatic implicit coercion, are generally known to be more error-prone. That’s why Clojure and ClojureScript are both strongly typed. So all coercion must be explicit. Why would this not hold true at the boundaries? And thus for spec?

Also, I want to add that conform is not for coercing or transforming. It’s for parsing. What conform does is return an AST of the data where the data elements are tagged so you can more easily interpret the data. The data values are untouched, never coerced or transformed.
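A quick REPL sketch to show what I mean (the ::port spec is just an example):

(require '[clojure.spec.alpha :as s])

(s/def ::port (s/or :num int? :str string?))

;; conform tags which branch matched; the value itself is untouched
(s/conform ::port 8080)   ; => [:num 8080]
(s/conform ::port "8080") ; => [:str "8080"]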

3 Likes

This blog post about plumatic/schema’s type coercion features was pretty interesting on this topic. I imagine the motivation is similar with spec. The upshot of it is: your schema already contains all the information you need to do a coercion, so why effectively duplicate the schema in a conversion function?
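For instance, something like this (a small sketch using schema.coerce; the Data schema is made up, and matcher behavior may vary between Schema versions):

(require '[schema.core :as schema]
         '[schema.coerce :as coerce])

(def Data {:id schema/Int, :tags #{schema/Keyword}})

;; the coercion function is derived entirely from the schema itself
(def parse-data (coerce/coercer Data coerce/json-coercion-matcher))

(parse-data {:id 1, :tags ["a" "b"]})
; => {:id 1, :tags #{:a :b}}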

1 Like

I agree with your summary, and think it’s interesting to compare to the spec philosophy. The article you linked summarizes the plumatic/schema approach like this (all emphasis mine):

Spending time writing the boilerplate conversions became exceedingly inefficient, so we decided to do something about it. If schemas have all the information needed to do what we want, then we should be able to solve this problem once and for all and do away with the boilerplate. And that’s what we’re delivering out of the box in Schema 0.2.0: a completely automated, safe way to coerce JSON and query params using just your existing Schemas, with no additional code required.

This is driven by this explicit statement of that tool’s philosophy:

Schema’s design goal: enabling a single declarative definition of your data’s shape that drives everything you want to do with your data, without writing a single line of traversal code.

The thing is, however, that this is explicitly not spec’s design goal. The spec-tools article that this thread is about mentions this:

Transforming specced values is not in scope of clojure.spec, but it should [be].

Correct me if I’m wrong, but it seems from this statement that spec-tools’s creators think that the spec philosophy should be like schema’s: you define your desired data format once and that definition is used for everything.

But the spec team has been vocal that this is not their aim. The clearest statement of their argument against coercion that I’ve found is in a mailing list thread that the OP article links to, where Alex Miller says:

spec-tools combines specs for your desired output with a coercion function, making the spec of the actual data implicit.

Note the difference between describing what data is and describing what the data should become.

This next part is me going out on a limb a bit. What helps me make sense of Alex’s statement, and of the spec philosophy in general, is to swap out the name “spec” for the term that comes from its prior art: contracts. A spec is, philosophically speaking, a contract for what the data must be. It’s an agreement between data source and data consumer. If the data does not satisfy the contract then something is wrong: the agreement has been broken, you should flag this violation and find out what went wrong. You don’t automatically massage it into some more desired form, because that introduces inherently slippery boundaries about what the data needs to be.

I agree with @didibus that automatic coercion is like bringing in JavaScript’s WAT-worthy type semantics, which is just about as far from a data contract as programming languages get.

Another aspect that occurs to me is that the problems spec is intended to solve do not include data transformation. It’s easy to see why: Clojure does not lack for efficient and expressive data transformation capabilities. That’s Clojure’s bread and butter; spec doesn’t need to “fix” data transformation. So why bundle that into spec, if doing so weakens or at least distracts from spec’s intended uses?

That’s why I find solutions like Sean Corfield’s approach so compelling:

My recommendation is to have a strictly non-coercive spec for the target data “type” / shape you want, and to have a second spec that combines the coercion you want with that spec. That way you have a way to tell if your uncoerced data conforms to the spec, as well as a way to do coercion in s/conform. They are – and should be – two separate specs and two separate operations. They represent different layers of abstraction inside your application (so “of course” they should be two separate specs, one built on top of the other).
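To make that concrete, here’s my own minimal sketch of the two-layer idea (not Sean’s code; the spec names are hypothetical):

(require '[clojure.spec.alpha :as s])

;; layer 1: a strictly non-coercive spec for the target shape
(s/def ::age pos-int?)

;; layer 2: a second spec combining coercion with the first one
(s/def ::age-param
  (s/and (s/conformer (fn [x]
                        (try (Long/parseLong x)
                             (catch Exception _ ::s/invalid))))
         ::age))

(s/valid? ::age 42)            ; => true
(s/conform ::age-param "42")   ; => 42
(s/conform ::age-param "nope") ; => :clojure.spec.alpha/invalid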

(Personally, I see a role for pure-Clojure transformation functions between those layers of specs.) The spec-tools article has this to say about this approach:

Runtime transformations are out of scope of clojure.spec, and the current best practice is to use normal Clojure functions first to transform the data from external formats into correct format and then validate it using the Spec. This is really a bad idea, as the structure of the data needs to be copied from specs into custom transformation functions.

To me, this seems unconvincing in the face of Sean’s point, which is that there is no “the structure of the data” because the data has multiple structures and needs to be treated as such.

3 Likes

@jmlsf excellent post you linked. Schema excels at coercion, which is based on a protocol-based walker, enabling both fast transformation and validation in one step. To have a similar generic walker for spec, there is CLJ-2251. The core team is not accepting patches for now.

Why transformations? Having transformations derived from structure is already the norm in most languages and solutions. If your Clojure application contains just a few exposed models/specs that don’t change, and you have only one external format to support (e.g. JSON), it might be OK to write custom functions for the transformations. If you have a large app with tens - or hundreds - of models, and they are evolving, it’s a big and error-prone effort to develop and maintain the custom transformations for all models (and all formats) manually.

Looking outside the box (and to the potential new Clojure users!) - how do Java developers do Object-to-JSON transformations? They most likely use Jackson Databind behind the scenes, like we do, but as the Java classes already have the type information, Jackson will serialize and deserialize deeply nested objects automatically, based on the registered type transformers. Refactoring is easy, as you just change the classes and the transformations follow. A good intro to Jackson is found here. Jackson was initially released in 2009, so this is nothing new.

Currently, I believe the only complete way to walk the specs is s/conform*. It’s not meant for coercion, but then again, what is? For simple things, we could just parse the spec’s s/form (like spec-tools does for other purposes, and spec-coerce does for transformations), but making that complete (e.g. for the regex specs) is another story. Will s/form be the only public API? Will there be a spec walker in core? If so, when?
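For reference, the s/form route looks roughly like this (spec names are just examples; the keyword namespaces follow the defining namespace):

(require '[clojure.spec.alpha :as s])

(s/def ::id int?)
(s/def ::user (s/keys :req-un [::id]))

;; s/form returns the spec as plain data, which a library can then parse
(s/form ::user)
; => (clojure.spec.alpha/keys :req-un [:user/id])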

I love Clojure and would like to see that clojure.spec becomes the one data specification tool we need in the future, both for development time and for runtime.

3 Likes

Responding largely to @dave.liepmann: For me, this is just a pragmatic issue, not a philosophical one. Before I figured out how to do coercions with spec, I had functions that walked various parts of my data in ad hoc ways, doing simple things like converting date strings into moment.js objects and converting number keywords into numbers. With spec, you can say (in 3-4 lines of code), “run moment on this piece of data if you are expecting a Moment type”. You can do that just once for your entire code base. That’s powerful and concise.
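Roughly like this - a sketch of how I understand spec-tools’ type-transformer from the blog post; the :moment type, the moment? predicate, and the moment.js interop are my own assumptions:

(require '[spec-tools.core :as st])

(defn moment? [x] (.isMoment js/moment x))

;; a spec record carrying a :type hint for transformers to dispatch on
(def moment-spec (st/spec {:spec moment?, :type :moment}))

(def moment-transformer
  (st/type-transformer
    {:name :moment
     ;; run moment on the raw value whenever a :moment type is expected
     :decoders {:moment (fn [_ s] (js/moment s))}}))

(st/decode moment-spec "2014-02-18T18:25:37Z" moment-transformer)
; => a Moment instance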

I fail to see how this leads down the path of WAT-worthy JavaScript type coercions, which, after all, only cause one to exclaim “WAT” because they are unexpected and often not what you want. Here, I already know I want the conversions, and I’m going to have to make them one way or another, so the only question is: which library makes it easy for me to do that?

Sean Corfield’s approach, if I understand it correctly, would be to write two specs: one for the incoming JSON and one for the transformed JSON. And then I think you are suggesting that I would have to write a pure Clojure function to transform between the two, a function which would inevitably re-encode much of the structure of the data in yet a third place.

Why not do this? Because it’s a lot of work and doesn’t actually solve a problem I have. Runtime type checking is about establishing a limited set of invariants in your code at strategic points, in a way that is cost-effective for my limited time and mental capacity. I don’t need to establish invariants on the pre-coercion data, then coerce, then run virtually identical invariants again. And I definitely don’t need to do that if it requires a huge amount of work. This is why the schema approach works well for me, and why, so far, I feel the spec authors are trying to solve different problems than mine.

1 Like

Just to add my 2 cents.

I use spec-tools a lot, specifically for coercion.

It is really nice just having to specify your data structure once, and then knowing that when you deal with your data it is in the correct format.

This works especially well when you get to more complex specs, using, for example, multi-spec.

I don’t see the similarity to JavaScript implicit conversion at all - at the end of the day your data ends up in a well-defined, validated state when you coerce it. Nothing implicit about it.

The only difference from using explicit coercion functions is that you are mostly specifying your coercion with data instead of functions, which I personally find better.

However, I see the point for separating coercion from validation, as several times I’ve needed data to be coerced without having it validated.

To me the main issue is that spec seems to be trying to do too much, but then also too little.

What I mean by this is that spec provides us with a great data definition language, which is awesome.

But then it says we can use that data definition for a few specific use cases - validation, generation, and testing - but nothing else.

That limits this great data definition language to a few limited uses, and I don’t understand why.

The more I use it, the more I realize that what I would have liked spec to be is simply a well-defined data definition language, which generates easily queryable and transformable data definitions. And then all the other functionality - like validation and generation - would be added on as separate libraries.

As it is, spec is awesome and I really like the key based specification. However, I am worried that by trying to do too much, it has actually limited its usefulness.

4 Likes
