Following some conversations we had at the Conj, we wrote a short article about using Clojure to convert CSV files into Parquet files.
Interest is certainly building around Parquet as a file format, and we think it’s good tech. It is potentially exciting if you work with tabular/columnar data and want it to load fast.
Did you wrap the Java Parquet implementation, or did you implement it from scratch?
Is the source code published?
The article is about a new single-jar, deps-only lib that simplifies the work it describes:
The heavy lifting is done by our library for tabular data processing
hth - keep the good questions coming!
It looks like a lot of work went into it.
And indeed, this is just a small moment in a much larger undertaking of trying to understand “functional data science”.
Thanks for looking into it.