Next.jdbc -- early access

seancorfield · April 1, 2019, 6:27am

I’m finally at a place where I feel comfortable showing an early working draft of next.jdbc to the world: https://github.com/seancorfield/next-jdbc/blob/master/src/next/jdbc.clj – although it is not API-compatible with clojure.java.jdbc, it is almost at feature parity at this point.

Feedback here, or via issues is all fine. The code is substantially shorter and simpler than clojure.java.jdbc and the performance is substantially better.

swlkr · April 1, 2019, 6:00pm

I’m new to datafy (I guess everyone is), I’m not quite sure what you mean by https://github.com/seancorfield/next-jdbc/blob/master/src/next/jdbc.clj#L15-L17

Could you explain a little more about the navigation feature? Or is there an example?

ikitommi · April 1, 2019, 6:19pm

Congratulation, looks so much better now! Few quick wins for perf, would be happy to do a PR of those. About the api, could the with-transaction be just a function? Or a new functional with-transaction-fn? Usually adding a functional version into the projects.

(j/with-transaction 
  db
  (fn [db] ...))

seancorfield · April 1, 2019, 6:27pm

This blog post (from the time of last year’s Conj) gives more background on lazy navigation through result sets: http://corfield.org/blog/2018/12/03/datafy-nav/

The easiest way to see it in action is to use Cognitect’s REBL (linked from that blog post) with next.jdbc and point it at your favorite database and run execute! on a select that pulls in rows that have foreign key columns that match either <table>id or <table>_id and see it automatically navigate into those tables when you drill down into those columns.

Hope that helps?

seancorfield · April 1, 2019, 6:29pm

(p/-transact db (fn [db] ...) opts)

That could be used directly since -transact is a public protocol. with-transaction is just sugar over that.

ikitommi · April 1, 2019, 6:45pm

Nice, could the p/-transact be exposed as transact in the main ns too?

seancorfield · April 1, 2019, 6:51pm

didibus · April 1, 2019, 11:17pm

Looks real nice.

I just started on a new project using java.jdbc , any timeline as to when we can expect a release for next.jdbc? I’m hoping if its soon enough I can make the switch before it’s too late

seancorfield · April 2, 2019, 12:28am

If you’re using deps.edn, you can depend on it right now as a Git SHA

Some functions will likely move around a bit and the signatures may change a little (regarding the required-ness of opts and how :row-fn works). Keep an eye on the open issues.

It’s also very light on documentation – which won’t really come until after I finalize the API.

I’m still on the fence about Contrib vs non-Contrib, i.e., whether I want to position this as an evolution of clojure.java.jdbc or a full-on replacement for it.

With those caveats, if you’re willing to be an alpha tester for it as-is, your feedback would be gratefully received and appreciated!

seancorfield · April 2, 2019, 3:45am

I ran some benchmarks comparing (simple) operations in next.jdbc and clojure.java.jdbc and added the timings to the test file: https://github.com/seancorfield/next-jdbc/blob/master/test/next/jdbc_test.clj#L53-L105

seancorfield · April 2, 2019, 7:54am

transact has been added as a top-level API function.

didibus · April 2, 2019, 4:11pm

Unfortunately, we have custom build tools, and they can’t pull from git/shas

ikitommi · April 2, 2019, 8:26pm

on transact. Did two small perf PRs and an issue.

TristeFigure · April 3, 2019, 9:20am

Cool. What’s the rationale of next.jdbc in comparison with clojure.java.jdbc ? What led you to develop it ?

seancorfield · April 4, 2019, 12:12am

There are a number of drivers for a new version of clojure.java.jdbc:

Performance: there’s a surprising amount of overhead in how ResultSet objects are converted to sequences of hash maps – which can be really noticeable for large result sets – so I want a better way to handle that.
A more modern API, based on using qualified keywords and transducers etc: :qualifier and reducible-query in recent clojure.java.jdbc versions were steps toward that but there’s a lot of “legacy” API in the library and I want to present a more focused, more streamlined API so folks naturally use the IReduceInit / transducer approach from day one and benefit from qualified keywords. I’m still contemplating whether there are reasonable ways to integrate with clojure.spec (for example, if you have specs of your data model, could next.jdbc leverage that somehow?).
Simplicity: clojure.java.jdbc uses a variety of ways to execute SQL which can lead to inconsistencies and surprises – query, execute!, db-do-commands are all different ways to execute different types of SQL statement so you have to remember which is which and you often have to watch out for restrictions in the underlying JDBC API.

Those are my three primary drivers. In addition, the db-spec-as-hash-map approach in clojure.java.jdbc has caused a lot of frustration and confusion in the past, especially with the wide range of conflicting options that are supported. next.jdbc is heavily protocol-based so it’s easier to mix’n’match how you use it with direct Java JDBC code (and the protocol-based approach contributes to the improved performance overall). There’s a much clearer path of db-spec -> DataSource -> Connection now, which should steer people toward more connection reuse and better performing apps.

I also wanted datafy/nav support baked right in (it was added to clojure.java.jdbc back in December as an undocumented, experimental API in a separate namespace). I wanted it to be “free” in terms of performance (it isn’t quite – my next round of changes should address that).

The API so far is still very much a work-in-progress. I’m still very conflicted about the “syntactic sugar” SQL functions (insert!, query, update!, and delete!). They go beyond what I really want to include in the API, but I know that their equivalents in clojure.java.jdbc are heavily used (based on the number of questions and JIRA issues I get).

My latest round of changes exposed the mapped-function-over-rows API more prominently, but I’m still not happy with the “feel” of that aspect of the API yet (it creates a tension with the datafication behavior).

So, while I’m comfortable to put it out there and get feedback – and I’ve had lots of great feedback so far – expect to see more changes, possible some dramatic ones, in the next month or so before I actually settle on where the library will live and what the published artifacts will look like. (@didibus this is a change of position compared to what I said above two days ago – based on extensive discussions with someone at Cognitect since then – sorry!).

GKlijs · April 4, 2019, 6:19am

For the people not on slack. I have a small microservice I was already benchmarking, with just 6 functions using the database. It was very easy to change from clojure.java.jdbc. Performance was about the same, cpu a bit lower. Most importantly there were thousands of transactions during testing which all went well.
Great work.

TristeFigure · April 4, 2019, 11:43am

Thanks for the detailed explanation Sean. I use clojure.java.jdbc mainly through the korma lib so I have not an extended experience with it, but I have to reckon I am disoriented by jdbc’s API when I have to directly use it so I’m glad to hear you’re working on a simplified API query, execute!, etc…

Also is there any plan for supporting result sets streaming ?

connection = PG::Connection.open(...)

# Note: This will NOT execute query as soon as this line is evaluated by Ruby!
connection.send_query("SELECT * FROM gigantic_query")
connection.set_single_row_mode
connection.get_result.stream_each do |row|
  # Do something with the row!
end

Source: https://medium.com/@johnmcclean/jdbc-processing-options-with-cyclops-react-49d62b02f775

seancorfield · April 4, 2019, 4:35pm

Unfortunately, getting database drivers to stream results is highly database-specific. I know people who have succeeded in doing this for both MySQL and PostgreSQL – but they needed different configurations. So, the short answer is “yes, you can do this” – but not in any way that is portable and therefore not in any way that either clojure.java.jdbc or next.jdbc can directly support out of the box.

You typically need to set the connection to not auto-commit (there’s a keyword argument for that, in both libs), and you need to set fetch-size (again, there’s a keyword argument for that), and then some other incantations. And you need to use the reducible-query! in clojure.java.jdbc or reducible! in next.jdbc.

seancorfield · April 5, 2019, 7:39pm

I’ve written up more of my thoughts about the motivation and design of next.jdbc (some of it based on my comments above): https://github.com/seancorfield/next-jdbc/blob/master/README.md

What’s clear to me at this point is that I need to rethink the row realization aspect of the library and I need to focus more on the three primary intended use cases (single row fetch, multi-row fetch, process result set).

Thank you, everyone, for all the feedback so far!

ikitommi · April 6, 2019, 12:20pm

Will the default be to return maps with namespaced keywords in the future? Is there support for other return types? I’m bit worried with the “qualify all the things” in the low level libraries, at least before Records can be optimized to support them. There was some discussion that the next version of Ring could also lean on namespaced keys(??).

About perf: to get close to the native java performance, we should do two things:

Precompile the queries: given an connection and optionally parameters, the query is just executed. All options, factories etc. are done in the (pre)compile stage.
Fast immutable query data: Persistent maps are much slower than Classes/Types/Records.

For an average low traffic enterprise app, fully dynamic response maps with qualified keys read from ResultSet metadata are just great (and a good companion to clojure.spec), but for high traffic (and high profile!) apps, we should be able use idiomatic clojure with ~Java-speed.

Did a small spike of porting some of the perf stuff from our libs into a minimalistic query-only jdbc wrapper: it allows statically populating return records, generating response records from (compiled) query, precompiling everything, including map responses etc. In the simple criterium micro-benchmarks, it seems to be 10x faster on query-side (including a roundtrup into the h2 memory database!). Not sure if I’m using next.jdbc correctly yet, so the results might be totally bogus. Please correct if I’m using it wrong.

Anyway, I think that the clean separation of compilation and execution time would allow a single library to serve both use cases: fast tools for those (high-profile apps/startups) who need the perf and a “simple” api layer that just runs the compilation and the execution in a sequence, effectively giving the current api (and perf) of current jdbc.next.

What do you think? How to proceed? Big internal refactor PR to jdbc.next? Something else? Would be great to have just one solid and maintained solution for the community, and outside of the contribs.

Quick results, a modified criterium suite:

(defrecord Fruit [id name appearance cost grade])

;; java-fast mapping
(defn rs->fruit [^ResultSet rs]
  (->Fruit
    (.getObject rs 1)
    (.getObject rs 2)
    (.getObject rs 3)
    (.getObject rs 4)
    (.getObject rs 5)))

;; 680ns (vanilla java)
(bench!
  "java"
  (java-query "SELECT * FROM fruit" con))

;; 690ns (hand-crafted function to populate the record)
(let [query (p/compile "SELECT * FROM fruit" {:row rs->fruit})]
  (bench!
    "p: manual, record"
    (query con)))

;; 710ns (infer code of rs->fruit from a given record class)
(let [query (p/compile "SELECT * FROM fruit" {:row (p/rs->record Fruit)})]
  (bench!
    "p: derived, record"
    (query con)))

;; 710ns (create a new record for the unique resultset using compile-time inspection 
;; + a positional constructor for it, memoized + macro / eval => know what you are doing!)
(let [query (p/compile "SELECT * FROM fruit" {:con con :row-gen p/record-row-gen})]
  (bench!
    "p: compiled map"
    (query con)))

;; 1800ns (precompiled fully dynamic map result, just like next.jdbc / java.jdbc)
(let [query (p/compile "SELECT * FROM fruit" {:con con})]
  (bench!
    "p: dynamic map"
    (query con)))

;; 3000ns (fully dynamic map result, just like next.jdbc / java.jdbc)
(let [query (p/compile "SELECT * FROM fruit")]
  (bench!
    "p: dynamic map"
    (query con)))

;; 6700ns
(bench!
  "next.jdbc: reducible!"
  (into [] (map (partial into {})) (j/reducible! con ["select * from fruit"])))

;; 7000ns
(bench!
  "next.jdbc: execute!"
  (j/execute!
    con
    ["select * from fruit"]
    (partial into {})
    {}))

;; 8000ns
(bench!
  "java.jdbc: query"
  (j1/query {:connection con} ["SELECT * FROM fruit"]))