Improving error messages in Clojure (as a library)?

I don’t mean to come across as all bitter about it or anything, but with the release of the most recent Clojure survey I went back through the previous results, and it was galling to see error messages come across as the most frustrating thing about Clojure for three years in a row, and mentioned in every survey stretching all the way back to 2011 (though you occasionally need to dip into the raw data to find it). IMO the addition of specs for the core libraries in 1.9 has made error message output easier for machines to parse but harder for human beings to read, so we don’t seem to be going in the right direction here.

Agreed, core specs can only help with type errors in args and there are many other types of errors that need to be explained clearly.

In the case of spec errors, Expound will support examples soon(ish):

(require '[clojure.core.specs.alpha])
(require '[clojure.spec.alpha :as s])
(require '[clojure.spec.test.alpha :as stest])
(require '[expound.alpha :as expound])
(set! s/*explain-out* (expound/custom-printer {:print-specs? false})) ; The "relevant specs" are very long, so let's omit them

(s/fdef foo :args (s/cat :x string?))
(defn foo [x]
  x)
  
(stest/instrument)

(foo :hello)
;;clojure.lang.ExceptionInfo: Call to #'user/foo did not conform to spec:
;;                            form-init8427266847984625578.clj:1
;;
;;                            -- Spec failed --------------------
;;
;;                            Function arguments
;;
;;                              (:hello)
;;                               ^^^^^^
;;
;;                            should satisfy
;;
;;                              string?
;;
;;
;;
;;                            -- Example ------------------------
;;                            (<f> "hello")
;;                            -------------------------
;;                            Detected 1 error

The important part is (<f> "hello"). Note that the example shows <f> instead of foo because Spec doesn’t currently include the function name in the explain-data.

If anyone is looking for a low-investment way to contribute to error message handling, one avenue is to find unhandled ClojureScript errors produced in maria.cloud and report them in our error handling wiki. Just break some code and add a heading and example of it to the wiki.

We work on the errors in bursts so it might be a while to get a particular message looked at. The errors will be ClojureScript-specific, but many specifics and most patterns are shared across Clojure dialects.

To be fair, making errors easier for computers to read is the right first step on the way to making them easier for people to read. If they have a predictable format we have a much better chance of writing code to present them well, plus we get other unrelated benefits around logging and other automated processing.
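
As a rough illustration of that point, here is a minimal sketch of turning spec's machine-readable explain-data into a sentence per problem. The ::port spec and the describe-problems helper are hypothetical names invented for this example; only s/def, s/and and s/explain-data come from clojure.spec.alpha.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical spec, defined only for this illustration.
(s/def ::port (s/and int? #(< 0 % 65536)))

(defn describe-problems
  "Hypothetical helper: turns each problem in the explain-data for `value`
  into a one-line message. Returns [] when the value is valid, because
  s/explain-data returns nil in that case."
  [spec value]
  (mapv (fn [{:keys [val pred in]}]
          (str "Value " (pr-str val)
               (when (seq in) (str " at " (pr-str in)))
               " should satisfy " (pr-str pred)))
        (:clojure.spec.alpha/problems (s/explain-data spec value))))

;; A string where an integer port is expected yields one message.
(describe-problems ::port "8080")
```

Because every problem is a plain map with predictable keys, the same data could just as easily feed a logger or an editor plugin instead of a string.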

Would it make sense if Spec helped a little more than it does right now to produce human readable output?
E.g. when a predicate returns non-true and it’s not a simple type check, it would be nice to know exactly what the semantics of that predicate are.

I’m not sure if I’m the only one - I love Spec, but when I see one of those error messages, my head and my eyes hurt. :sob:

The issues with Clojure error messages are numerous and as I’ve said before, I think it’s important to try to tease them apart. Some of the things I have thought about for a long time:

  • Message context
    • Invalid values often propagate through the outer clojure.core layer into the JVM layer before they blow up. That code is often either very generic or lacking outer context. My experiments with clojure.core function specs significantly improve this - the error is caught at the first function layer and an error can be thrown in the user’s call with location information, the bad predicate, the value, etc.
    • Macros obscuring context - given one or more layers of macro expansion an error sometimes occurs in expanded code, which exists neither in the user’s code nor in the macros
    • Missing language context - I see a class of complaints that are really about not having a good mental model for how some part of Clojure works. That’s not the user’s fault. Probably a relatively small set of these could be made gentler and make a big difference (like trying to invoke things that aren’t functions). In any case, having a means to integrate additional information would be useful. I don’t imagine that information would be “in core” but we could provide the hooks.
  • Conveyance
    • There is only one exception type that conveys error location (file/line/col) right now: Compiler$CompilerException. Forcing all location wrapping through this single inner class is confusing and at odds with the better support we now have in things like ex-data. Due to this, CompilerException is often used to wrap better errors, merely to provide location information.
    • There needs to be a distinction made between “your invocation is invalid Clojure code”, “your invocation produced an exception in your code”, “Clojure compiler error”, “Clojure runtime error”, etc. Maybe not exactly those, but a handful of broad categories. At the point of throw or rethrow, the general category is known and Clojure should help more here (so that tools can help more on the receiving end).
    • Code has a context (local bindings etc) and we know a lot of it. Conveying more of that could be put to great use.
  • Reporting
    • Due to the CompilerException wrapping above, there is often a chain being sent purely for location conveyance. Tools generally don’t unwrap these exceptions, but they should. Cursive for example shows you only the top error, the CompilerException, which has the location information, but usually a less useful error message and a useless stack trace. It is almost always more useful to look at the root cause exception - the message and stack are almost always what you actually want (but missing the location info!)
    • Unnecessary stack traces - some (maybe most) of the categories above don’t actually need a stack trace at all. If you had the right info for a language error or spec error, you can just tell the user the thing they did and why it’s wrong (and maybe how to fix it).
    • Unnecessary frames - when stack traces are emitted, there is often a lot of noise and many people have taken stabs at simplifying. I find I hate most of the “simplified” stack trace printers but I have hope that there are built-in improvements to be made.
  • Spec
    • We are well aware that the core spec macro errors are intimidating and in some cases worse than prior behavior. (However, they also catch and fail on many previously accepted invalid inputs.) The spec macros like ns and defn have a) wide fanout (like the ns clauses), b) complicated structure (destructuring) and/or c) recursion (destructuring). In many ways, these represent the hardest possible cases for automatic error reporting. I think there are many many ways these messages can be greatly improved for these hard cases. I think people miss that the “easy” cases are really pretty good a lot of the time (since we don’t have the simpler core function specs included yet).
    • expound has done a great job building on the explain-data to go really far in making spec messages super friendly. Our goals in the core language differ somewhat from expound and while I do expect the default printer to get much better, I don’t expect core to end up where expound is now. In particular, there seems to be a lot of pressure for customizing errors and I actually think there is way more room to make the non-custom cases really good, and there is a ton of benefit in that.
    • I think there are still vast opportunities for tools to leverage specs and explain-data visually in novel ways beyond what expound can do. I think we can do more in spec in the future to provide some help here.
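
To make the Reporting point about unwrapping concrete, here is a minimal sketch of how a tool might pair the root cause's message with the wrapper's location data. The innermost-cause and summarize helpers are hypothetical, and an ex-info stands in for the CompilerException wrapper so the snippet does not depend on a particular Clojure version.

```clojure
(defn innermost-cause
  "Hypothetical helper: follows .getCause until it reaches the innermost exception."
  [^Throwable t]
  (if-let [cause (.getCause t)]
    (recur cause)
    t))

(defn summarize
  "Hypothetical helper: combines the root cause's type and message with
  whatever data the outermost wrapper carries (for example, a location)."
  [^Throwable t]
  (let [root (innermost-cause t)]
    {:message       (.getMessage root)
     :type          (class root)
     :wrapper-data  (ex-data t)}))

;; Stand-in for a location-carrying wrapper around a more useful error.
(def wrapped
  (ex-info "Wrapper that only adds location"
           {:file "example.clj" :line 1}
           (IllegalArgumentException. "Key must be integer")))

(summarize wrapped)
```

The design choice here is simply to stop treating the top of the chain as "the" error: the wrapper contributes where, the root cause contributes what.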

I’ve got some other writeups on this stuff and I probably forgot some things but that’s a good start.

@alexmiller Thanks for the excellent, detailed analysis.

My assumption is that specs will be the foundation for better error messages. But we haven’t been able to really confirm this since most of core remains un-specced. It may be true that spec errors (or libraries that build upon spec like Expound) still fail to generate good errors in common cases and more changes or features are necessary.

These specs are especially important for beginners since they won’t know how to spec their own functions initially. A larger set of specs would make a REPL like https://gist.github.com/bhb/2686b023d074ac052dbc21f12f324f18 much more useful.

There have been a few 3rd-party attempts to build a set of specs for core, but as you’ve noted, these are incomplete and buggy. Do you have any sense of when a broader set of specs may be released? From my perspective, I’d learn much more about error messages from an update to core.specs.alpha than from any particular spec bug fix (or even a major reimplementation of spec).

Thanks for all your hard work on spec and elsewhere!

spec is targeted most at describing information structures. In general, it works best when you use it to check concrete things about information. When you try to describe higher order or more abstract functions it gets harder and less useful to write specs. Many functions in core are relentlessly polymorphic and abstract (by design), so are actually not the sweet spot for spec. This is a tension that we have talked a lot about. I am not sure yet where this will lead, but until it’s resolved I don’t think there will be much progress on spec’ing core.

What if the focus was on speccing invalid input, instead of valid?

If it’s too hard to come up with the exact predicates that validate a core function, it could still be useful to have it more as a precondition check for known invalid input.

That should limit the false positives, but still provide value to beginners potentially. Or help in obvious typos or fat finger scenarios.
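
A minimal sketch of what that could look like, assuming a hand-written wrapper rather than anything in core: the ::not-obviously-unseqable spec and checked-first function are hypothetical names, and the deny list of types is only an example.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical "deny list" spec: it only flags values that can never be
;; treated as a sequence, and lets everything else through unchecked.
(s/def ::not-obviously-unseqable
  #(not (or (number? %) (keyword? %) (boolean? %))))

(defn checked-first
  "Hypothetical wrapper around first that rejects known-bad input early
  with a message aimed at the caller."
  [coll]
  (when-not (s/valid? ::not-obviously-unseqable coll)
    (throw (ex-info (str "first expects a collection, got " (pr-str coll))
                    {:value coll})))
  (first coll))

(checked-first [1 2 3])   ; valid input passes straight through to first
;; (checked-first :oops)  ; would throw the ex-info above instead of a JVM-level error
```

Since the spec never claims to describe all valid input, it cannot reject a legitimate but unusual argument, which is the false-positive concern raised above.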

Perhaps ironically I’m currently using spec to power the “definition side” of a validation library for a client project. Using s/or and similar to sort of classify/tag/label data is so nice to use internally. Also, I’ve yet to run into a situation where I’d need full spec error data leaving the library, I just use those to build readable exceptions for the configuration itself. Quite liking spec for that!

Thanks for the update! This is immensely helpful to know. Since specs for core aren’t likely coming soon, I’ll think about other approaches to help beginners navigate errors that occur when using core functions.

It seems awesome!

Do you know if it is possible to collect the error message output into the CIDER stacktrace window in Emacs?

The new 1.10.0-alpha7 (and associated new releases of spec.alpha and core.specs.alpha) make a bunch of changes related to error messages. I would be curious to see if people have tried it yet and if so, what your experience has been.

Some highlights (not exhaustive):

  • Error categorization - we spent a lot of time on categorizing errors that occur during different “phases” - read, macroexpand, compile, eval, or print. Errors from the first 3 phases are syntax errors (found before your code is evaluated) and reported that way.
  • Errors during macroexpansion previously reported the compiler stack trace rather than the location in the user’s call where the error occurred. That has been changed.
  • Made many changes in reducing and reordering how info is printed in spec problem lines
  • Renamed many core spec paths and other changes to make core spec errors more informative
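
For anyone who wants to poke at the phase categorization from code, here is a minimal sketch. It assumes a Clojure 1.10 release in which clojure.main/ex-triage is available (it may not be present in alpha7 itself); error-phase is a hypothetical helper written for this example.

```clojure
(require '[clojure.main :as main])

(defn error-phase
  "Hypothetical helper: reads and evaluates the form in `form-str` and
  returns the :clojure.error/phase that ex-triage reports for any
  exception, or :no-error when evaluation succeeds."
  [form-str]
  (try
    (eval (read-string form-str))
    :no-error
    (catch Throwable t
      (-> t Throwable->map main/ex-triage :clojure.error/phase))))

(error-phase "(let [x])")                   ; malformed let, rejected before it runs
(error-phase "(throw (ex-info \"boom\" {}))") ; exception raised while the code runs
(error-phase "(+ 1 2)")                     ; no exception at all
```

Comparing the keywords returned for the first two calls is a quick way to see which category a given mistake falls into.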

Some of the “category” differences can be seen here:

Great work! I especially like the renaming of the core spec predicates, which makes it clear what’s wrong. I haven’t tested it yet though.

If you (or anyone) encounter an interesting case, I’d be happy to learn more about it. There are more changes we’ve considered and I’m sure more things we haven’t thought about yet.
