What about Clojure can be improved? [Not CLJS]

TristeFigure · October 22, 2019, 5:24am

Allow variadic arguments to be on the left when destructuring.

(let [[more & b a] [:c :b :a]]
  [more b a]) ;; => [[:c] :b :a]

Of course there is something wrong with this syntax since [a & b] is ambiguous. Maybe:

(let [[more <& b a]...

I’m not convinced.

(let [[more ⅋ b a]...

There is also

(let [[more ‹ b a]...

(Alt + W on my azerty mac keyboard)

didibus · October 22, 2019, 4:12pm

Not quite destructuring, but if you really need to do this, you can use Spec’s regex OPs to do it:

(require '[clojure.spec.alpha :as s])

(s/conform
 (s/cat :more (s/+ any?) :b any? :c any?)
 [1 2 3 4 5])

;; => {:more [1 2 3], :b 4, :c 5}

TristeFigure · October 22, 2019, 9:14pm

Ruby supports it (2.3.7p456).

irb(main):006:0> ->(a, *more) { [a, more] }.call(1, 2, 3)
=> [1, [2, 3]]
irb(main):007:0> ->(*more, c) { [more, c] }.call(1, 2, 3)
=> [[1, 2], 3]

Also Ruby doesn’t have the dilemma of “flat” vs regular maps for named parameters ((func :key value) vs (func {:key value}))

They are interchangeable.

irb(main):002:0> ->(hash) { hash }.call({abc: :xyz})
=> {:abc=>:xyz}
irb(main):003:0> ->(hash) { hash }.call(abc: :xyz)
=> {:abc=>:xyz}

This works particularly well in tail position with other regular arguments

irb(main):004:0> ->(x, hash) { [x, hash] }.call(1, abc: :xyz)
=> [1, {:abc=>:xyz}]
irb(main):005:0> ->(x, hash) { [x, hash] }.call(1, {abc: :xyz})
=> [1, {:abc=>:xyz}]

But doesn’t work with arrays

irb(main):008:0> ->(array) { array }.call([1, 2, 3])
=> [1, 2, 3]
irb(main):009:0> ->(array) { array }.call(1, 2, 3)
ArgumentError: wrong number of arguments (given 3, expected 1)
	from (irb):9:in `block in irb_binding'
	from (irb):9
	from /usr/bin/irb:11:in `<main>'
irb(main):010:0>

What else can I add ?

Ruby has deep destructuring capabilities

irb(main):005:0> ->((a, (b, *more))) { [a, b, more] }.call ([1, [2, 3]])
=> [1, 2, [3]]

and supports default values

irb(main):006:0> ->(a=1, b=(1 + 1)) { [a, b] }.call(0)
=> [0, 2]

joinr · October 23, 2019, 6:22am

Pretty sure we already have that

user> (let [[x y & [a b c & xs]] [1 2 3 4 5 6 7]] (list x y a b c xs))
(1 2 3 4 5 (6 7))

Am I missing something?

didibus · October 23, 2019, 7:21am

I don’t actually know Ruby so I have two questions.

To be honest, I don’t really see the point of left side variadic destructuring. Can you give some example usage pattern for it? I only use right side variadic if the function can take an unknown number of additional args, so that makes sense on the right, but not sure what having it on the left would be useful for?
When you say Ruby treats “flat” and normal maps the same, at first I felt that would actually be a problem: what if I don’t want to treat it as a map when flat? But looking at your code, it seems like the “flat” map is actually one argument, since you don’t have a comma between the key and value. Is that not true? Or are commas in Ruby between arguments optional?

For your other examples, if I understand correctly, Clojure has that as well. You can do deeply nested destructuring and have default values as well. Is anything about it lacking from Clojure that Ruby has?
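
As a rough illustration of the Clojure side (a sketch only; the sample data and the names `nested`, `endpoint`, `host` and `port` are made up for the example), nested sequential destructuring mirrors the Ruby lambda quoted above, and `:or` supplies defaults in map destructuring:

```clojure
;; Nested sequential destructuring, analogous to Ruby's
;; ->((a, (b, *more))) example.
(def nested
  (let [[a [b & more]] [1 [2 3]]]
    [a b more]))
;; `more` is bound to a seq here, not a vector.

;; Default values via :or in map destructuring.
;; `endpoint`, `host` and `port` are hypothetical names.
(defn endpoint [{:keys [host port] :or {host "localhost" port 8080}}]
  [host port])

(endpoint {:host "example.org"}) ; => ["example.org" 8080]
```

Positional defaults in the style of Ruby's `a=1` are usually written with multiple arities instead.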

Thanks

linpengcheng · October 23, 2019, 12:01pm

I think the language core should be kept simple and intuitive. Special syntactic needs should, unless truly necessary, be met by libraries rather than by making the grammar harder to read.

A distribution along the lines of Racket, R, or Python, bundling a large standard library, commonly used open source libraries, a package manager, OpenJDK, Maven, a simple GUI console, etc., would also make getting started easier and everyday use more convenient.

joinr · October 24, 2019, 4:54am

Low hanging fruit

A recent exploration into performance tuning illuminated some things that seem approachable.
The Clojure compiler could be doing a lot more to help tweak code, with something similar to Common Lisp’s declarations for optimization, hinting, etc.

hinted destructuring

One area that leads to “death by a thousand cuts” is map destructuring and other destructuring. These are such idiomatic choices and encouraged by the language, but they bring in overhead due to defaulting to the polymorphic get/nth accesses that are emitted. You end up having to unroll your previously idiomatic, nicely destructured code into something bulkier and arguably harder to maintain. Recent performance tweaking in response to Nikita’s post on Rust seemingly trouncing Clojure shows that we can do better with a macro that lets the user inject type hints into destructuring binds, so that the compiler (in this case the macro) can emit faster direct method implementations (e.g. supporting direct record/type field access, .nth for indexed types, and .get for maps). This is a naive implementation that requires more user intervention, but one that yielded significant results in practice.

Given the prolific nature of destructuring inside small functions that end up on hot paths (say inside a reduce or map or whatever), even something as simple as allowing hints on whether an arg supports Indexed operations is likely a simple but useful win. In my naive macro, I also have the option of emitting warnings for users when slow calls for get/nth are emitted (along with suggestions for hinting).
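
To make the trade-off concrete, here is a minimal sketch of the manual unrolling described above. It is not the macro itself, and the `Point` record and all function names are hypothetical. It only contrasts idiomatic destructuring with hand-hinted access:

```clojure
;; Hypothetical record used only for this illustration.
(defrecord Point [x y])

;; Idiomatic: map destructuring expands to generic `get` lookups.
(defn sum-coords-destructured [{:keys [x y]}]
  (+ x y))

;; Unrolled by hand: the type hint lets the compiler emit
;; direct field access on the record.
(defn sum-coords-hinted [^Point p]
  (+ (.-x p) (.-y p)))

;; Sequential destructuring expands to generic `nth` calls...
(defn first-two-destructured [[a b]]
  [a b])

;; ...while hinting the argument as Indexed allows calling .nth directly.
(defn first-two-hinted [^clojure.lang.Indexed v]
  [(.nth v 0) (.nth v 1)])
```

Both variants of each pair return the same values; any speed difference would need to be measured on the workload in question.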

Structural Hashing of Records

This ended up being a drag on performance, since there was a lookup table built on “point” records (x,y coords). The original implementation spent a ton of time just hashing points and comparing. Switching to a custom point type with its own hashcode worked out. No idea if this can be improved in general, but it was a non-obvious weak spot that emerged in profiling. I could maybe see some smarter defrecord logic that detects if the static fields are numeric (e.g. a cartesian point or other representation) and emits a default “fast” structural hash based on the numeric fields (assuming the non-static entry map is empty).

Field Access for Records and valAt

The defrecord implementation emits a case form that dispatches on the key provided to determine quickly whether a static field is being looked up, returning the val in constant time if so. This is slower than just invoking direct field access on the record, but faster than paying the price of hashing the key (since keywords can be checked via identity fast). Plumbing this further…for “small” records, where “small” implies a set of static fields <=8, it actually appears faster to use condp identical? in lieu of the case implementation, since the constants appear to favor a small linear sequence of identical? calls rather than the identity-based hash lookup that case seems to emit.
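
The two dispatch shapes being compared can be sketched as plain functions. This is an illustration only, not the code defrecord actually generates. The function names and the two-field layout are assumptions:

```clojure
;; Dispatch shaped like the `case` form described above.
;; x and y stand in for a record's static fields.
(defn lookup-case [x y k not-found]
  (case k
    :x x
    :y y
    not-found))

;; Alternative for a small number of fields: a linear run of
;; identical? checks. Keywords are interned, so comparing them
;; by identity is sufficient.
(defn lookup-identical [x y k not-found]
  (condp identical? k
    :x x
    :y y
    not-found))
```

Both functions return the same results for any key; which one is faster for a given field count is something to confirm with a benchmark.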

Less Lower Hanging Fruit

Interpreter for use with Graal for Unhobbled Native Image Stuff

Eval is currently the bane of AOT native compilation. There is no infrastructure in place to opt in to a low-performance interpreter that can be used with Graal while retaining access to all of Clojure’s features.

There was a master’s thesis on truffle clojure published a while back that provides a non-trivial implementation of Clojure in the truffle language framework (e.g. java-based AST definitions). No repo was ever published, but there is code inline in the thesis. This seems like a compelling option for bootstrapping something compatible with Graal’s native-image. I think if we had a truffle implementation, you’d get an interpreter (plus the JIT optimizing compiler) all in one. No idea on how much pain/incompatibility this would introduce, but it seems like a way to broaden the scope of applicable native-image apps.

Babashka kind of does this, but it intentionally provides a very limited set of interpretable Clojure.

Smaller, more portable implementation core

Thanks to the work done on ClojureScript’s largely protocol-based clojure.core, tools.reader, and tools.analyzer, we have pretty much everything needed to bootstrap a Clojure implementation in a new host fairly quickly. Assuming you can get the pre-reqs for reading and evaluating these libraries into place in your proto-clojure, you should be able to bootstrap Clojure relatively easily (for various definitions of “easy”). I’ve been experimenting off-and-on with doing that with Common Lisp over the years as time permits and learning allows (also as the aforementioned resources came into being), and it’s looking like a viable way to port Clojure to new hosts, some of which, like CL, address problems like native-image out of the box (and allow for more advanced compilation features, like SBCL’s type inference engine and fine-grained performance options).

One thing that’s obvious during the porting process (I think this was learned during the CLJS port), is that there’s likely a very simple lower-lisp that we could reduce Clojure into. cljs.core provides most of clojure (assuming you have functions, protocols, types, etc.). It’d be interesting as a research project to distill Clojure into an implementation based on one or more of these simpler lisps and provide a minimal substrate for the host to implement for bootstrapping. tools.reader and the analyzer could similarly be distilled, to provide a fast way to get environments setup. I could envision similar ports to hosts with really nice ecosystems like Julia and any of the Schemes.

So, perhaps spending some brain power making clojure simpler to implement would be a “good thing” for extending reach (just as CLJS did) to new hosts where it makes sense.

staypufd · October 24, 2019, 10:38pm

I was alluding to things that you could do on the TI Explorer Lisp Machine and in Xerox InterLisp. I have looked at Rebel-readline and REBL. They are both quite nice. I just want to see it taken further. I’ll for sure take a look at the videos you linked to. Thanks for sharing.

TristeFigure · October 25, 2019, 8:07am

“To be honest, I don’t really see the point of left side variadic destructuring.”
A (mini) parser function for instance. Say it takes as arguments a variadic number of tokens that the function consumes one by one before recursing. It either recurses from the left or from the right. In the latter case you need to get the last element in the args vector before getting to the rest.
“looking at your code, it seems like the “flat” map is actually one argument”.
And you’re right. Commas are mandatory in Ruby. In hashes, they occur between key-value pairs. One way to write a pair is :key => value and another way, introduced later in the development of the language, is key: value (the colon comes at the end of the key symbol and replaces the associative arrow). So indeed, what look like flat maps are actually a stealth notation for hashmaps. Nevertheless it resolves the dilemma among programmers in practice, whereas in Clojure by contrast you still have to wonder how to pass those arguments, flat or in a hash ? This is because Clojure can’t make the distinction between a sequence of values and a flat map on its own. Now what would happen if we were to introduce this notation in Clojure ?

We’d introduce complexity in the notation. I can confirm the presence of these two different ways to write keywords (they are called symbols in Ruby) is disorienting at first, especially when you come from a Clojure background. Typo-like disorienting.

However you end up realizing this is what allows Ruby to “solve” this flat VS explicit map dilemma between programmers. That would require us to introduce a ‘key: value’ notation for MapEntries/Maps.

And maybe we would be able to splice maps into maps.

And I was just presenting Ruby’s argument & destructuring systems because I think they are on par with Clojure’s, which I know well since I have rewritten it a couple of times, notably to make map destructuring support the & operator.
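
For readers less familiar with the Clojure side of the flat VS explicit map dilemma mentioned above, here is a small sketch of the two calling conventions. The function names `connect-flat` and `connect-map` and their options are invented for the example:

```clojure
;; "Flat" style: trailing keyword arguments are collected by &
;; and destructured as a map.
(defn connect-flat [host & {:keys [port timeout] :or {port 80 timeout 1000}}]
  [host port timeout])

;; Explicit style: the options arrive as a single map argument.
(defn connect-map [host {:keys [port timeout] :or {port 80 timeout 1000}}]
  [host port timeout])

(connect-flat "example.org" :port 8080)   ; => ["example.org" 8080 1000]
(connect-map  "example.org" {:port 8080}) ; => ["example.org" 8080 1000]

;; Bridging the two: an existing map has to be flattened into
;; alternating keys and values before calling the flat version.
(def opts {:port 8080})
(apply connect-flat "example.org" (mapcat identity opts))
```

The caller has to know which convention a function expects, which is the choice Ruby's `key: value` notation hides.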

linpengcheng · October 25, 2019, 12:35pm

@TristeFigure : you need the pop and peek functions.

http://clojuredocs.org/clojure.core/pop

(let [v     [0 1 2 3 4 5] 
      [h t] [(pop v) (peek v)]] 
  [v h t])
;=>
;[[0 1 2 3 4 5] 
; [0 1 2 3 4] 
; 5]

didibus · October 25, 2019, 6:02pm

Well, you got me thinking, and I think it would be pretty cool if destructuring was just expanded to support regex operations similar to spec, but always assuming the any? predicate. That could solve all your use cases. Not sure of the exact syntax it would take though.

ikitommi · March 23, 2020, 6:15am

Related to the client-side AOT cache: I’m getting 6x faster load times now by following the new guide on how to precompile the classes, which is awesome:

➜  ~ mkdir startup
➜  ~ cd startup
➜  startup echo '{:paths ["classes"], :deps {org.clojure/core.async {:mvn/version "0.4.500"}, manifold {:mvn/version "0.1.8"}}}' > deps.edn
➜  startup mkdir classes
➜  startup clj
Clojure 1.10.1
user=> (time (require '[manifold.stream :as s]))
"Elapsed time: 4945.101294 msecs"
nil
user=>
➜  startup clj
Clojure 1.10.1
user=> (time (require '[manifold.stream :as s]))
"Elapsed time: 4970.483792 msecs"
nil
user=>
➜  startup clj -e "(binding [*compile-files* true] (require 'manifold.stream :reload-all))"
➜  startup clj
Clojure 1.10.1
user=> (time (require '[manifold.stream :as s]))
"Elapsed time: 819.835103 msecs"
nil
user=>
➜  startup clj
Clojure 1.10.1
user=> (time (require '[manifold.stream :as s]))
"Elapsed time: 823.341661 msecs"
nil
user=>

system · September 21, 2020, 6:15pm

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.