Tagged JSON with Jsonista (vs Transit)

Continuing the discussion from Slack: Jsonista 0.3.0 ships with support for tagged JSON, which allows lossless encoding of data using tagged values, like Transit, but with much better performance.

It looks like this:

(require '[jsonista.core :as j])
(require '[jsonista.tagged :as jt])
(import '(clojure.lang Keyword PersistentHashSet))

(def mapper
  (j/object-mapper
    {:encode-key-fn true
     :decode-key-fn true
     :modules [(jt/module
                 {:handlers
                  {Keyword {:tag "!kw"
                            :encode jt/encode-keyword
                            :decode keyword}
                   PersistentHashSet {:tag "!set"
                                      :encode jt/encode-collection
                                      :decode set}}})]}))

(-> {:system/status #{:status/good}}
    (j/write-value-as-string mapper)
    (doto prn)
    (j/read-value mapper))
; prints "{\"system/status\":[\"!set\",[[\"!kw\",\"status/good\"]]]}"
; => {:system/status #{:status/good}}
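
To make the wire format concrete, here is a minimal sketch of decoding it by hand, assuming the JSON has already been parsed into plain vectors, maps and strings by any JSON parser. This is not how jsonista implements decoding; `decoders` and `untag` are hypothetical names used only for illustration.

```clojure
(require '[clojure.walk :as walk])

;; Hypothetical tag table mirroring the two handlers in the mapper above.
(def decoders
  {"!kw"  keyword
   "!set" set})

(defn untag
  "Walks already-parsed JSON bottom-up and replaces every two-element
  vector [tag value] whose tag is a key in `decoders` with the decoded value."
  [data]
  (walk/postwalk
    (fn [x]
      (if (and (vector? x)
               (not (map-entry? x))   ; map entries also satisfy vector?
               (= 2 (count x))
               (contains? decoders (first x)))
        ((decoders (first x)) (second x))
        x))
    data))

(untag {"system/status" ["!set" [["!kw" "status/good"]]]})
;; => {"system/status" #{:status/good}}
```

Note that this sketch leaves map keys as strings and cannot tell a tagged value apart from an ordinary two-element array that happens to start with the string "!kw"; a real implementation would need a rule for that case.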

The question is: what should happen next? Many people seem to want “a faster transit” built on top of Jsonista, which would require:

  1. adding support for ClojureScript
  2. implementing tag handlers for all basic Clojure/Java types
  3. adding support for tagged keys (currently only values can be tagged)

It would be tempting to just do those, but:

  • would it be better to contribute perf improvements to current Transit? Transit2?
  • are there any non-Clojure JSON tagging systems out there that we could implement here, to be more compatible with the rest of the world?

PS. @borkdude pointed out that JSON Schema draft 7 has support for non-JSON data.


A human-readable version of Transit would be great. Transit’s caching quickly makes HTTP responses difficult to read, for example in Chrome DevTools.

Logging systems which expect JSON lines on stdout may also be able to handle the Jsonista format. It would be great to extract the EDN data back from these JSON log entries (during debugging etc.).
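
A minimal sketch of that round trip, assuming the same mapper as in the first post and log entries written as one JSON document per line. The `entries` data and the `read-log` helper are made up for illustration.

```clojure
(require '[jsonista.core :as j]
         '[jsonista.tagged :as jt]
         '[clojure.string :as str])
(import '(clojure.lang Keyword PersistentHashSet))

;; Same mapper as in the first post, repeated so the snippet stands alone.
(def mapper
  (j/object-mapper
    {:encode-key-fn true
     :decode-key-fn true
     :modules [(jt/module
                 {:handlers
                  {Keyword {:tag "!kw"
                            :encode jt/encode-keyword
                            :decode keyword}
                   PersistentHashSet {:tag "!set"
                                      :encode jt/encode-collection
                                      :decode set}}})]}))

;; Hypothetical log entries.
(def entries
  [{:event :user/login  :roles #{:role/admin}}
   {:event :user/logout :roles #{:role/guest}}])

;; What a JSON-lines log sink would receive: one JSON object per line.
(def log-output
  (str/join "\n" (map #(j/write-value-as-string % mapper) entries)))

(defn read-log
  "Parses each line of `s` back into Clojure data using the tagged mapper."
  [s]
  (mapv #(j/read-value % mapper) (str/split-lines s)))

;; Should give back the original entries, keywords and sets included.
(read-log log-output)
```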

Either alternative option sounds much better. My biggest issue with Transit is language support (Swift currently, and Elixir a while back), and a competing library would be counterproductive in that regard. But performance hasn’t been an issue so far, so maybe I’m not the target audience.

(Btw, is there a Transit 2, or did you mean that creating one would be necessary in order to contribute?)

@maxweber the json-verbose Transit encoding is pretty readable (it’s Transit without caching).
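
For comparison, a small sketch that writes the same data with both Transit JSON formats using transit-clj (`cognitect.transit`). The `transit-str` helper and the sample `data` are made up for illustration.

```clojure
(require '[cognitect.transit :as transit])
(import '(java.io ByteArrayOutputStream))

(defn transit-str
  "Encodes `data` as a Transit string in the given format
  (:json or :json-verbose)."
  [fmt data]
  (let [out (ByteArrayOutputStream. 4096)]
    (transit/write (transit/writer out fmt) data)
    (.toString out "UTF-8")))

;; Repeated keys are where the caching of the :json format shows up.
(def data
  [{:system/status :status/good}
   {:system/status :status/bad}])

(println (transit-str :json data))          ; compact form, repeated keys replaced by cache codes
(println (transit-str :json-verbose data))  ; every key and keyword written out in full
```

Printing both makes the trade-off visible: the verbose form is larger but can be read without tracking the cache.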

Today I used Transit from Go for the first time, using https://github.com/russolsen/transit.
It worked fine. I’m using it to communicate from babashka to a sqlite lite binary written in Go.

However, this has me slightly worried:

NOTE: Transit is intended primarily as a wire protocol for transferring data between applications. If storing Transit data durably, readers and writers are expected to use the same version of Transit and you are responsible for migrating/transforming/re-storing that data when and if the transit format changes.

Source: https://github.com/cognitect/transit-java

If this means that upgrading Transit in either babashka or the Go binary will cause compatibility issues, then maybe I should have written my own subset of Transit using a tagged JSON approach.
However, this would mean I would have to reinvent a parser for the custom tagged JSON in every language that pods can be written in, which is exactly why I just went with Transit.

That said, the spec hasn’t changed since 2014 so maybe it’s not really a big issue.

If I recall, a motivation for Transit (relative to edn) was to achieve a good compromise on speed between Clojure servers and ClojureScript clients. In that client-server partnership, the more limiting party was the browser.

If I understood properly, jsonista measured performance only on the Clojure side. Since Transit in JavaScript uses the browser’s JSON reader, it might not afford precisely the same headroom for speedup as the Clojure end.

To wade in a little deeper, I wonder, if jsonista’s Jackson-Databind approach is faster than whatever way Transit encodes JSON, why not swap out just that part of Transit-in-Clojure?
