Following some conversations we had at the Conj, we wrote a short article about using Clojure to convert CSV files into Parquet files.
Interest is certainly building around Parquet as a file format, and we think it’s good tech. This is potentially exciting if you work with tabular/columnar data and want it to load fast.
Did you wrap the Java thing, or did you implement it from scratch?
Is the source code published?
Great questions!
The article is about a new single-jar, deps-only lib that simplifies the work it describes:
The heavy lifting is done by our library for tabular data processing, tech.ml.dataset (TMD).
hth - keep the good questions coming!
Thanks!
It looks like a lot of work went into it.
You’re welcome.
And indeed, this is just a small moment in a much larger undertaking of trying to understand “functional data science”.
Thanks for looking into it.