Hey Max - I’ll copy/paste my response from reddit here, since I think it has some good ideas in it.
Thanks a lot! Super interesting.
You’re welcome. Thank you.
We want to rework our system that calculates the business metrics of our SaaS.
Now that sounds super interesting. (:
We struggle to fit our Clojure and Datomic data sources into the Google BigQuery ecosystem, since you need to convert everything into a relational database schema.
Right! It’s interesting that there’s no mention of schema anywhere in the article; however, both TMD and DuckDB are strongly typed: columns are homogeneous and tables are (for all intents and purposes) rectangular.
We’re getting a lot of leverage in this area from two things: (1) DuckDB’s CSV import detects data types automatically and creates the table schema with no additional user input (and it does this surprisingly well, imo). (2) Both TMD and DuckDB know the types of all their columns, so in general, tools moving data between them can discover and late-bind schema information, again with no additional user input.
That hides a lot of the drama associated with ‘the relational database schema’ you mention, but it’s still there under the hood. We regard this as a good thing: this columnar orientation is both why TMD can use so little RAM, and why the DuckDB query engine can be so fast.
would you recommend giving DuckDB + TMD a try for this situation?
So, the answer here is definitely maybe, and it depends, mostly on the exact data shapes and quantities and such. Our typical strategy is to do the actual data processing in Clojure where possible, since that’s by far the most flexible approach, and then to graduate to involving other tools as necessary.
Happy to discuss it further, if you’d like. Fill out our contact form here and we can get an email thread going, or hop on a call: