I need to change my database driver from MySQL 5.7.26 to the latest MariaDB driver 10.5.8. And now queries with CASE statements that should return a Long instead return a BigDecimal. Big trouble since (= 0 0M) is false.
The following function demonstrates the problem. It takes a db connection, creates a table with one integer field, inserts one row with the value 1 and then queries the table using a CASE statement that returns the row. Using the MySQL connection, the value returned is 1 and using the MariaDB connection, the value returned is 1M.
(defn test-it
[conn]
;; Drop table
(jdbc/execute! conn ["DROP TABLE IF EXISTS test_table;"])
;; Create table with one integer field `int_field`
(jdbc/execute! conn ["CREATE TABLE `test_table` (`int_field` int(0) unsigned NOT NULL);"])
;; Insert one row with `int_field` = 1
(jdbc/execute! conn ["INSERT INTO `test_table` VALUES(1);"])
;; Query where `int_field` is returned from a CASE statement
(-> (jdbc/query conn ["SELECT (CASE WHEN 1=1 THEN `int_field` ELSE 0 END) AS `value` FROM `test_table`"])
(first)
:value))
(test-it mysql-conn) => 1
(test-it mariadb-conn) => 1M
This may not be/is probably not a Clojure-specific problem but I don’t know very much about databases or any Java so I cannot recreate the problem outside of Clojure using jdbc. So I’m starting here with the hope that someone smarter than me has a clue
Very interesting. However, don’t MariaDB servers welcome MySQL clients? You might have fewer technical difficulties upgrading to a later jar in the MySQL series.
This isn’t a bug. The MySQL JDBC driver can also return BigDecimal for various expressions where some people might expect a Long – and certainly the exact types of expressions may vary across databases and across JDBC drivers. Despite what a lot of people seem to think, SQL and JDBC are not very portable across databases and drivers. If you look at the tests for clojure.java.jdbc or next.jdbc you will see all sorts of conditional logic to make the tests pass with different drivers and different databases. After maintaining those two libraries for many years, I’m not terribly surprised when something that works identically on half a dozen databases/drivers happens to work differently on a new combination.
We’ve also seen annoying differences between different versions of the same driver against the same database. It’s just one of those things you have to live with.
@Phill. Yes, I may need to stick with MySQL if I can’t solve this.
@seancorfield. Thanks! I was actually not able to use the jdbc driver for MariaDB. Neither :dbtype "mariadb" nor jdbc:mariadb://, even though I included [org.mariadb.jdbc/mariadb-java-client "2.7.2"] in my project.clj. So I used the mysql driver instead.
Perhaps if I could get the mariadb specific driver to work, I would get the correct return type. Do you know how to “activate” the mariadb driver so I can use it when I create the connection?
seancorfield/next.jdbc is tested against the MariaDB driver: org.mariadb.jdbc/mariadb-java-client {:mvn/version "2.5.4"} but your later version should work and :dbtype "mariadb" should be all you need (once you’ve added the dependency).
org.clojure/java.jdbc was never tested against the MariaDB driver, but I think it should “just work” the same way: add the dependency to your project, use :dbtype "mariadb".
Thanks @John_Shaffer. I have code all over the place that assumes that values from CASE statements are Long so I’m afraid not. So I either need to add explicit casts to integers in the CASE statements or simply stick with MySQL. Unless I can get the MariaDB driver to work and it magically resolves the problem.
Ah, for clojure.java.jdbc, you probably also need :classname "org.mariadb.jdbc.Driver" in the db-spec hash map because otherwise c.j.j doesn’t know what JDBC driver class to load.
next.jdbc has MariaDB support built-in (it supports a lot more databases/drivers out of the box than c.j.j).
Would you go as far as to say that you cannot trust the types of return values from databases and need to coerce types before letting them into your code? Or is it enough to spec and check in production code?
I wouldn’t say you can’t trust the types – for a given database and JDBC driver version, they are consistent. But it’s worth checking what types actually come back from a given SQL operation (in the REPL) – and it’s also worth being a bit more careful (or at least forgiving) in terms of comparisons you perform on data coming back from the database (e.g., as John noted above, using == for numeric equality instead of = unless you know exactly what types you are dealing with).
Computed values – as in your CASE expression – are particularly “vulnerable” since the database may use a “larger” type than you expect to avoid potential truncation/overflow.
This is also why it can be important to ensure that your test suites use the same version of the JDBC driver and the same version of your database (even when it is convenient and tempting to run tests against, say, an in-memory DB like H2 or a local DB like Derby or SQLite).
Another thing to watch out for is dates and timezones: in our QA and production environments, we have the server timezone set to UTC, the JVM timezone set to UTC, and the database itself set to UTC – and our code operates entirely in UTC as well – to avoid any timezone-related shifts in values! When you’re running tests locally, you need to consider that as well, since your local machine is not likely to be in UTC so you have to be careful about Instant vs LocalDateTime vs OffsetDateTime vs ZonedDateTime!
Thanks @seancorfield, this is very helpful. The problem is that the Long is used as a map key, so lookup failed when I got a BigDecimal instead. But it’s good to know that this type can change, I need to spec my db return values to identify if that happens again if I try to change database or driver.
I have actually been thinking of rewrite all my tests to use an in-memory db instead of having to maintain a test database. Again that highlights the importance of specing returns.
Yes, I have already set my MySQL timezone to UTC in production and development. The app I have developed has clients (research projects) in different countries, each with their own timezone setting. So to make sure that I don’t have trouble with these settings (daylight savings etc), I randomize the timezone before running each test that depends on timezone settings.