Next.jdbc -- early notes

seancorfield · January 10, 2019, 4:21am

Following on from my note in “What HTTP server library to use” when I mentioned clojure.java.jdbc and next.jdbc to @ikitommi, here’s some very preliminary data from next.jdbc – the next generation JDBC library I’m working on.

Benchmarks for clojure.java.jdbc (provided originally by @ikitommi) – this is comparing the fastest equivalent operation from clojure.java.jdbc with the raw Java equivalent:

Repeated reducible query with raw result set...
Evaluation count : 206718 in 6 samples of 34453 calls.
             Execution time mean : 3.223544 µs
...
Raw Java...
Evaluation count : 511584 in 6 samples of 85264 calls.
             Execution time mean : 1.190587 µs

So it’s 2.7x slower than the Java code.

Here’s the very first, raw next.jdbc version out of the box with no tweaks at all:

Evaluation count : 267834 in 6 samples of 44639 calls.
             Execution time mean : 2.272091 µs

So that’s 1.9x slower than raw Java.
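
For reference, the slowdown factors quoted above are ratios of the mean execution times. A quick REPL check, using only the means reported in this post (the helper name `slowdown` is made up for illustration):

```clojure
;; Mean execution times in microseconds, copied from the benchmark output above.
(def raw-java-us  1.190587)
(def java-jdbc-us 3.223544) ; clojure.java.jdbc, reducible query with raw result set
(def next-jdbc-us 2.272091) ; first next.jdbc version

(defn slowdown
  "Hypothetical helper: how many times slower `t` is than `baseline`."
  [t baseline]
  (/ t baseline))

(slowdown java-jdbc-us raw-java-us) ; roughly 2.7
(slowdown next-jdbc-us raw-java-us) ; roughly 1.9
```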

And these are both in the pathological case of a very small in-memory database with just four records (and selecting by a non-index column to find the first matching row) – so the overhead of the library is deliberately magnified.

Also, a basic select of just the first matching result, returning a full hash map, took about 6.4 µs in clojure.java.jdbc and takes about 4.2 µs in next.jdbc (and you get qualified column names automatically in the latter).

It’s all still very much “hammock work” right now with minimal actual code, but the core functionality is being proved at this point.

seancorfield · January 10, 2019, 7:08am

Now down to 1.8 µs and 3.2 µs (from 2.2 µs and 4.2 µs) – by using protocols more heavily at the connection/datasource level (which surprised me a bit – but getting rid of records seems to be worthwhile here).

ikitommi · January 10, 2019, 8:35am

Is the source code available somewhere? Will this be released under clojure organization in GitHub or could it be under clj-commons?

There is also Funcool's clojure.jdbc, with simplicity & perf as goals, maybe merge some ideas & impl from it too? https://funcool.github.io/clojure.jdbc/latest/#faq

l3nz · January 10, 2019, 11:34am

Well done! Looking forward to using it. JDBC is one of those places where small gains will really compound across all clients using it.

seancorfield · January 10, 2019, 5:30pm

Not yet. It’s in serious hammock/exploration mode for now because the API and implementation are changing day-to-day.

I don’t know yet. Which is also part of why the source code isn’t public yet. I may fold it into clojure.java.jdbc or I may release it outside of Contrib.

That project started life by copying code from clojure.java.jdbc and removing the copyright and license files – until I called the guy out on the main Clojure mailing list. It’s also had no updates in two and a half years.

seancorfield · January 11, 2019, 3:38am

A bit more protocol stuff got those timings down to 1.6 µs and 2.6 µs.

I’m working through some ideas and idioms for how best to handle options at different points in the workflow. clojure.java.jdbc allows you to use a hash map for the db-spec and you can have global default options in there, which are merged into the options passed into every function. That’s flexible – and useful to have global defaults in clojure.java.jdbc where there are so many knobs and dials to tweak – but it also adds to the overhead in clojure.java.jdbc.
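
A minimal sketch of the merge described above, using plain maps as stand-ins. The helper name `effective-options` and the db-spec values are made up for illustration; this is not clojure.java.jdbc's actual code, only the shape of the idea that defaults stored in the db-spec get merged with per-call options on every call:

```clojure
(require '[clojure.string :as str])

;; Illustration only: a plain map standing in for a db-spec.
(def db-spec
  {:dbtype      "h2:mem"    ; connection details (made-up values)
   :dbname      "example"
   :identifiers identity})  ; a "global default" option carried in the db-spec

(defn effective-options
  "Hypothetical helper: per-call options win over defaults in the db-spec.
  Running a merge like this on every call is part of the overhead."
  [db-spec call-opts]
  (merge db-spec call-opts))

;; The global default applies unless a call overrides it.
(:identifiers (effective-options db-spec {}))
(:identifiers (effective-options db-spec {:identifiers str/lower-case}))
```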

Having taken away the entities and qualifier options, and with a reducible query as the basis instead of :row-fn and :result-set-fn, pretty much the only options left would be actual JDBC options that are needed, mostly in the creation of PreparedStatement…

…What options do folks commonly use in clojure.java.jdbc today?

joost · January 11, 2019, 9:17am

Hi Sean

In my most recent projects I used entities and identifiers to do some custom conversion (snake_case_sql to :clojure-style-keywords).

Joost.

kirillsalykin · January 11, 2019, 2:15pm

I also used entities and identifiers.

seancorfield · January 11, 2019, 8:09pm

Good to know. One of the big slowdowns in clojure.java.jdbc is the conversion of the ResultSet to a vector of hash maps with keyword keys – that runs through the identifiers naming function.
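
To make that cost concrete, here is a rough sketch of per-row map building with a naming function. The names `kebab-keyword` and `row->map` are hypothetical, and the sketch folds the keyword step into the naming function for brevity; it is not the library's implementation:

```clojure
(require '[clojure.string :as str])

(defn kebab-keyword
  "Hypothetical naming function: turn a SQL column label such as
  \"FIRST_NAME\" into a Clojure-style keyword such as :first-name."
  [label]
  (-> label str/lower-case (str/replace "_" "-") keyword))

(defn row->map
  "Build one row map from column labels and values, calling the naming
  function once per column. Repeating this for every row is the kind of
  work being described."
  [naming-fn labels values]
  (zipmap (map naming-fn labels) values))

(row->map kebab-keyword ["ID" "FIRST_NAME"] [1 "Ada"])
;; => {:id 1, :first-name "Ada"}
```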

entities isn’t such a big deal since it isn’t used for queries, only inserts and a couple of other functions that generate SQL before running it.

Part of what helps next.jdbc be so fast is removing the ResultSet conversion (see raw result set handling in reducible-query in clojure.java.jdbc) or at least pruning it as much as possible when rows are inflated to hash maps.

The other thing that helps next.jdbc be so fast is that it uses protocols directly extended to certain java.sql and javax.sql interfaces – but that means it’s pretty much impossible to carry default options around, although they can be passed into specific functions.
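
As an illustration of that trade-off, a protocol can be extended straight onto the JDBC interfaces. The protocol and function names below are hypothetical, not next.jdbc's API, and the stubs exist only so the sketch runs without a database:

```clojure
(defprotocol ConnectionSource
  "Hypothetical protocol for illustration only."
  (connection-of [this] "Return a java.sql.Connection for this source."))

(extend-protocol ConnectionSource
  javax.sql.DataSource
  (connection-of [this] (.getConnection this))
  java.sql.Connection
  (connection-of [this] this))

;; Stubs so the sketch can run without a real database.
(def stub-conn (reify java.sql.Connection))
(def stub-ds   (reify javax.sql.DataSource
                 (getConnection [_] stub-conn)))

(identical? stub-conn (connection-of stub-ds))   ; dispatches on DataSource
(identical? stub-conn (connection-of stub-conn)) ; dispatches on Connection
```

Dispatch is on the Java type itself, so there is no wrapping map or record in which default options could travel along with the connection.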

My “ideal” would be that you could provide global naming strategies and override them on a per-call basis but by the time you get to a Connection, you’ve lost that information. I’m still thinking about how best to handle options for this sort of thing.

The next.jdbc library will not be API-compatible with clojure.java.jdbc (for all sorts of reasons) but there are still some features that need to be implemented over the new API to provide functionality that folks rely heavily on.

kirillsalykin · January 14, 2019, 8:55am

Just thinking out loud:
maybe it would be good to drop those kinds of things (identifiers and entities) from next.jdbc?
In the Unix way, it would do one job: running SQL.

And have some other tool to handle mapping.

Phill · January 14, 2019, 10:17am

Instead of full hash maps, have you tried structmaps? A SQL result set is a very good fit for a structmap:

Jillions of instances with exactly the same fields, which are best discovered at run-time.
No methods, thus no need for dynamic dispatch.
The structure itself can be garbage collected, unlike a record class.

seancorfield · January 14, 2019, 6:48pm

StructMaps have been deprecated for a very long time – the recommendation is to use records instead.

reducible-query with raw result sets avoids using hash maps and that’s how it gets a lot of its performance. next.jdbc builds on that approach and can reconstitute a row as a regular hash map on-demand and is much faster than clojure.java.jdbc’s code for building hash maps.
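
The general idea can be sketched without a database. Below, a vector of value vectors plays the part of the ResultSet, each reduction step receives a lightweight read-only view rather than a hash map, and a map is only built when asked for. All names (`reducible-rows`, `row-map`) are hypothetical; this is an in-memory stand-in, not how either library is implemented:

```clojure
(defn reducible-rows
  "In-memory stand-in for a reducible query. `cols` is a vector of column
  keywords and `rows` a sequence of value vectors (playing the ResultSet).
  Each step hands `f` a read-only view instead of a hash map."
  [cols rows]
  (let [index (zipmap cols (range))]
    (reify clojure.lang.IReduceInit
      (reduce [_ f init]
        (loop [acc init, remaining (seq rows)]
          (cond
            (reduced? acc)   @acc   ; honor early termination
            (nil? remaining) acc
            :else
            (let [row  (first remaining)
                  view (reify clojure.lang.ILookup
                         (valAt [_ k]
                           (when-let [i (index k)] (nth row i)))
                         (valAt [_ k not-found]
                           (if-let [i (index k)] (nth row i) not-found)))]
              (recur (f acc view) (next remaining)))))))))

(defn row-map
  "Build a regular hash map from a row view, only when a caller wants one."
  [cols view]
  (zipmap cols (map #(get view %) cols)))

(def people (reducible-rows [:id :name] [[1 "Ada"] [2 "Grace"]]))

;; Find the first matching row and read one column; no row map is built.
(reduce (fn [_ row] (when (= 2 (:id row)) (reduced (:name row)))) nil people)
;; => "Grace"

;; Materialize rows on demand.
(reduce (fn [acc row] (conj acc (row-map [:id :name] row))) [] people)
;; => [{:id 1, :name "Ada"} {:id 2, :name "Grace"}]
```

With a real ResultSet the view would only be valid during its own reduction step, which is why any map has to be built inside the reducing function.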

FWIW, clojure.java.jdbc used to use structmaps, a long time ago, and was changed to use plain maps instead because structmaps were problematic (that change was years ago and I no longer remember the specifics of the change).

Phill · January 15, 2019, 12:19am

I wrote in seriousness. When records were introduced in Clojure 1.2, a sentence was added, “Most uses of StructMaps would now be better served by records.” Use of structmaps for JDBC records is not one of those uses.

In historical perspective, Rich Hickey’s announcement of Clojure 1.2 on Aug. 19, 2010, was a major event. New records and types brazenly encroached on structmaps; reify overlapped proxy. At the same time, agents got serious, and get-in and group-by were added. In that moment, structmaps looked a bit dusty. Conditioned by years of Java OSS that repudiated itself with every successive release, people may have figured that structmaps were history.

Issue JDBC-15 (https://dev.clojure.org/jira/browse/JDBC-15) marked the replacement of structmaps with regular maps in clojure.java.jdbc. I think it was purely, and with good intentions, on the expectation that structmaps were unreliable as a language feature.

But they have not gone away, and the management has demonstrated unwavering commitment to backward compatibility.

All in all: if structmaps help, use them!

seancorfield · January 15, 2019, 1:11am

Certainly, when I made the change that implemented JDBC-15 (over eight years ago!), it was on the assumption that StructMaps would actually go away at some point (and that seemed to be the mood of the Clojure/core folks at the time) – although, as you say, history has shown that almost nothing deprecated has actually ever been removed to this point.

A quick benchmark of create-struct / apply struct vs into {} / map vector does indicate StructMaps are faster but, as I indicated, a lot of the speedup in reducible-query (and also in next.jdbc) comes from not even attempting to create hash maps for rows but delegating to the ResultSet directly.
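
For readers unfamiliar with the two constructions being compared, a small sketch follows. The var and function names are made up, and no timings are shown because measuring them properly needs a benchmarking harness:

```clojure
(def ks [:a :b :c])

;; StructMap: define the key basis once, then fill in values per row.
(def row-struct (apply create-struct ks))

(defn struct-row [vs]
  (apply struct row-struct vs))

;; Plain map: pair keys with values and pour them into an empty map.
(defn plain-row [vs]
  (into {} (map vector ks vs)))

;; Both produce maps with the same contents.
(= (struct-row [1 2 3]) (plain-row [1 2 3]))
```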

When I get back to working on next.jdbc, I’ll benchmark StructMaps for the case where a row can be materialized (not the main path now) and I’ll talk to Alex and Rich and see how they feel today about reintroducing StructMaps into Contrib code.

seancorfield · January 15, 2019, 1:33am

It does look like there might be some peculiarities with StructMaps in the modern age of qualified keywords:

user=> (struct (create-struct :q/a :q/b :q/c) 1 2 3)
#:q{:q/a nil, :q/b nil, :q/c nil, :c 3, :b 2, :a 1}

(although keys and vals return sane results for this and it seems that spec’s s/keys validation is also happy with it)

seancorfield · January 15, 2019, 6:05pm

I asked Alex about StructMaps – he said he’s never used them because they were effectively deprecated before he started using Clojure so he’s always used records. He also said he’s never heard Rich talk about StructMaps and that they were likely never intended to work with qualified keys (like records). He did say that they have no intention of removing them tho’…
