Datascript datalog queries that look similar (to a novice datalog user) return different results


I recently ran into a bug in my Clojurescript project which uses a Datascript database.

I am summing up a bunch of transactions associated with an account. In my query I pass in an EID of an account and return the :plaid-transaction/amount fields of each of that account’s transactions.

It seems that depending on the order the transactions are put into the database, the first query below returns a different number of results. I performed a shuffle when inserting records into the datascript database and it impacts the results of the first query below.

The second query, which uses the pull syntax inside the :find clause, returns the correct result each time.

I ran both queries against the same database back-to-back and would sometimes get different results. I understand that the first query might not return the same number of records if some of the records were missing a :plaid-transaction/amount field, but neither the results of the second query nor my input data contained any nil values.

Not Consistent

(ds/q '[:find ?amount
        :in $ ?account-eid
        :where
        [?account-eid :plaid-account/plaid-transactions ?eid]
        [?eid :plaid-transaction/amount ?amount]]
      @conn account-eid) ; inputs: the current db value and the account's EID


(ds/q '[:find (pull ?eid [:plaid-transaction/amount])
        :in $ ?account-eid
        :where
        [?account-eid :plaid-account/plaid-transactions ?eid]]
      @conn account-eid) ; same inputs as the first query

What is logically different between these two queries?

Welcome to clojureverse!

I’m not familiar enough with datascript to judge your second query, but it seems to me that the two could be equivalent (to be clear, I don’t know), so your transaction logic might be of interest. Can you share how and what you are transacting, and the shuffling part?

Thanks for the reply.

For the shuffling, I’m doing that server-side in Ruby with the built-in Array#shuffle method. It just randomizes the order of an array.

As for transacting the data in Datascript here is my method for doing that:

(defn load-account-detail-results [conn plaid-account-id results]
  (let [transactions (:plaid-account/plaid-transactions results)

        ; Find existing transaction IDs
        txn-eids (ds/q '[:find ?transaction-eid
                         :in $ ?plaid-account-id
                         :where
                         [?a-eid :plaid-account/id ?plaid-account-id]
                         [?a-eid :plaid-account/plaid-transactions ?transaction-eid]]
                       @conn plaid-account-id)
        ; Generate the retractEntity queries
        delete-transactions (mapv (fn [[eid]]
                                    [:db/retractEntity eid])
                                  txn-eids)]
    ; Retract the old transactions and add the new ones in a single transaction
    (ds/transact! conn (into []
                             (concat delete-transactions
                                     [{:plaid-account/id                 plaid-account-id
                                       :plaid-account/plaid-transactions transactions}])))))

First I delete the existing transactions on the account entity and then add the new ones.

I’m thinking that I am not understanding a fundamental aspect of how datalog queries work. I will try to create a reproducible test case.

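Okay, I found the issue. When using the query with the pull find clause, Datascript returns one row per transaction entity, but when using the non-pull version, duplicate values are filtered out. I’m guessing this is just how the query language works: the result of a plain :find is a set of tuples.

This means that if two transactions have the same amount, that amount is returned only once by the first query, so the sum comes out wrong.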
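Here is a minimal sketch of that behaviour, in case it helps anyone else. The schema, the entity IDs (1, 2, 3 and 100) and the amounts are made up for illustration and are not my real data; I am also assuming the transactions attribute is a cardinality-many ref. It runs both queries from above against a tiny database where two transactions share an amount, then shows two ways to keep the duplicates: adding ?eid to the :find clause, or using :with ?eid together with the sum aggregate.

```clojure
(require '[datascript.core :as ds])

;; Hypothetical schema: an account points at many transaction entities.
(def schema
  {:plaid-account/plaid-transactions {:db/valueType   :db.type/ref
                                      :db/cardinality :db.cardinality/many}})

;; Three transactions; entities 1 and 2 deliberately share the amount 10.
(def db
  (ds/db-with (ds/empty-db schema)
              [{:db/id 1 :plaid-transaction/amount 10}
               {:db/id 2 :plaid-transaction/amount 10}
               {:db/id 3 :plaid-transaction/amount 25}
               {:db/id 100 :plaid-account/plaid-transactions [1 2 3]}]))

;; Plain :find returns a set of tuples, so the two 10s collapse into one.
(def amounts
  (ds/q '[:find ?amount
          :in $ ?account-eid
          :where
          [?account-eid :plaid-account/plaid-transactions ?eid]
          [?eid :plaid-transaction/amount ?amount]]
        db 100))

;; The pull version yields one row per distinct ?eid, so nothing is lost.
(def pulled
  (ds/q '[:find (pull ?eid [:plaid-transaction/amount])
          :in $ ?account-eid
          :where
          [?account-eid :plaid-account/plaid-transactions ?eid]]
        db 100))

;; Option 1: include ?eid in the tuples so rows with equal amounts stay distinct.
(def rows
  (ds/q '[:find ?eid ?amount
          :in $ ?account-eid
          :where
          [?account-eid :plaid-account/plaid-transactions ?eid]
          [?eid :plaid-transaction/amount ?amount]]
        db 100))

(def total-from-rows (reduce + (map second rows)))

;; Option 2: let the query do the sum; :with ?eid keeps one row per transaction.
(def total
  (ds/q '[:find (sum ?amount) .
          :with ?eid
          :in $ ?account-eid
          :where
          [?account-eid :plaid-account/plaid-transactions ?eid]
          [?eid :plaid-transaction/amount ?amount]]
        db 100))

(println (count amounts) (count pulled) total-from-rows total)
```

With this data the first query gives back two tuples while the pull query gives back three rows, and both totals include the duplicate amount.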

I don’t know why shuffling the input gave me different results in my javascript project, but when I made a test project in Clojure with Datascript I got the same results when I shuffled my inputs.