Datascript datalog queries return different results but are (to a novice datalog user) similar

caleb · November 20, 2020, 7:14pm

Hello,

I recently ran into a bug in my ClojureScript project, which uses a DataScript database.

I am summing up a bunch of transactions associated with an account. In my query I pass in an EID of an account and return the :plaid-transaction/amount fields of each of that account’s transactions.

The first query below seems to return a different number of results depending on the order in which the transactions are put into the database. I shuffle the records before inserting them into the DataScript database, and the shuffle changes what the first query returns.

The second query, which uses pull syntax in the :find clause, returns the correct result each time.

I ran both queries back-to-back against the same database and would sometimes get different results. I understand that the first query could return fewer records if some of them were missing the :plaid-transaction/amount field, but neither the results of the second query nor my input data contained any nil values.

Not Consistent

(ds/q '[:find ?amount
        :in $ ?account-eid
        :where
        [?account-eid :plaid-account/plaid-transactions ?eid]
        [?eid :plaid-transaction/amount ?amount]]
      @conn
      account-eid)

Consistent

(ds/q '[:find (pull ?eid [:plaid-transaction/amount])
        :in $ ?account-eid
        :where
        [?account-eid :plaid-account/plaid-transactions ?eid]]
      @conn
      account-eid)

What is logically different between these two queries?

madbonkey · November 20, 2020, 7:52pm

Welcome to clojureverse!

I'm not familiar enough with DataScript to judge your second query, but it seems to me that the two could be equivalent (to be clear, I don't know), so your transaction logic might be of interest. Can you share how and what you are transacting, and the shuffling part?

caleb · November 20, 2020, 10:00pm

Thanks for the reply.

For the shuffling, I’m doing that server-side in Ruby with the built-in Array#shuffle method. It just randomizes the order of an array.

As for transacting the data into DataScript, here is my function for doing that:

(defn load-account-detail-results [conn plaid-account-id results]
  (let [transactions (:plaid-account/plaid-transactions results)

        ; Find existing transaction IDs
        txn-eids (ds/q '[:find ?transaction-eid
                         :in $ ?plaid-account-id
                         :where
                         [?a-eid :plaid-account/id ?plaid-account-id]
                         [?a-eid :plaid-account/plaid-transactions ?transaction-eid]]
                       @conn plaid-account-id)
        ; Build one :db/retractEntity operation per existing transaction
        delete-transactions (mapv (fn [[eid]]
                                    [:db/retractEntity eid])
                                  txn-eids)]

    (ds/transact! conn (into []
                             (concat
                               delete-transactions
                               [{:plaid-account/id                 plaid-account-id
                                 :plaid-account/plaid-transactions transactions}])))))

First I delete the existing transactions on the account entity and then add the new ones.

I suspect I'm misunderstanding a fundamental aspect of how datalog queries work. I will try to create a reproducible test case.

caleb · November 21, 2020, 9:23pm

Okay, I found the issue. The query with the pull find clause returns one result per matching transaction entity, so every transaction shows up. The non-pull version finds only ?amount, and because query results are a set, transactions that share the same amount collapse into a single result. I'm guessing this is just how the query language works.
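A minimal sketch of that behaviour, in case it helps someone else. The schema, the "acct-1" id and the sample amounts are made up for illustration, and it assumes DataScript is on the classpath. Adding ?eid to :find, or using a :with clause when aggregating, keeps equal amounts from being merged.

```clojure
(require '[datascript.core :as ds])

;; Hypothetical schema: an account holds a to-many ref to its transactions.
(def schema
  {:plaid-account/id                 {:db/unique :db.unique/identity}
   :plaid-account/plaid-transactions {:db/valueType   :db.type/ref
                                      :db/cardinality :db.cardinality/many}})

;; Three transactions, two of which share the same amount.
(def db
  (ds/db-with (ds/empty-db schema)
              [{:db/id -2 :plaid-transaction/amount 10}
               {:db/id -3 :plaid-transaction/amount 10}
               {:db/id -4 :plaid-transaction/amount 25}
               {:db/id -1
                :plaid-account/id "acct-1"
                :plaid-account/plaid-transactions [-2 -3 -4]}]))

(def account-eid (ds/entid db [:plaid-account/id "acct-1"]))

;; Only ?amount in :find: the two 10s collapse, leaving [10] and [25].
(def amounts-only
  (ds/q '[:find ?amount
          :in $ ?account-eid
          :where
          [?account-eid :plaid-account/plaid-transactions ?eid]
          [?eid :plaid-transaction/amount ?amount]]
        db account-eid))

;; ?eid in :find makes every tuple distinct, so all three rows survive.
(def amounts-with-eid
  (ds/q '[:find ?eid ?amount
          :in $ ?account-eid
          :where
          [?account-eid :plaid-account/plaid-transactions ?eid]
          [?eid :plaid-transaction/amount ?amount]]
        db account-eid))

;; Summing without :with adds up the de-duplicated amounts (10 + 25).
(def total-without-with
  (ds/q '[:find (sum ?amount) .
          :in $ ?account-eid
          :where
          [?account-eid :plaid-account/plaid-transactions ?eid]
          [?eid :plaid-transaction/amount ?amount]]
        db account-eid))

;; :with ?eid keeps one row per transaction before aggregating (10 + 10 + 25).
(def total-with
  (ds/q '[:find (sum ?amount) .
          :with ?eid
          :in $ ?account-eid
          :where
          [?account-eid :plaid-account/plaid-transactions ?eid]
          [?eid :plaid-transaction/amount ?amount]]
        db account-eid))
```

For a sum over an account's transactions, the :with version is probably the one to reach for, since it does the aggregation inside the query without losing duplicates.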

I still don't know why shuffling the input gave me different results in my ClojureScript project. When I made a test project in Clojure with DataScript, the results stayed the same no matter how I shuffled the inputs.
