I think the necessary tools are evolving. There’s been fantastic work, like https://github.com/MastodonC/kixi.stats and others, to develop our own Clojure libraries. As data science becomes more mainstream, there will be more efforts to develop these tools for the JVM, and we, as Clojurists, can leverage that.
Graal is a really intriguing direction but is still a ways off. But again, the JVM lets us take advantage of any advances there too.
I really enjoyed your book “Living Clojure” as it is a succinct introduction to Clojure (I read it twice). Are there any plans for a second edition incorporating newer language features?
You can load a 500,000,000 row, 4 column csv file (35GB on disk) entirely into about 10 GB of memory. If it’s in Tablesaw’s .saw format, you can load it in 22 seconds. You can query that table in 1-2 ms: fast enough to use as a cache for a Web app.
BTW, those numbers were achieved on a laptop.
I haven’t tried it, but it certainly sounds promising.
The other thing might be to consider using Datomic instead of SQL. You might be better able to explore the datasets with a datalog query.
There are no plans for an updated edition, but there is another beginner Clojure book with updated features just about to come out. Russ Olsen is very close to shipping “Getting Clojure” (see the Twitter announcement). He’s a fantastic writer and the author of my favorite Ruby book, Eloquent Ruby, so keep a lookout for it.
Genetic programming and genetic algorithms are very cool. Lately they are being combined with other technologies such as deep learning with great success. For example, this project uses them to evolve deep learning networks: https://github.com/joeddav/devol.
There are tons of other great introductory blog posts on it too. I would just pick one that you find interesting, try to implement it yourself, and let your inspiration take it from there.
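If it helps to have a starting point, here is a minimal sketch of a genetic algorithm in Python (the language the devol project is written in). It is not taken from any of the linked projects. It assumes a toy "OneMax" problem, where fitness is simply the number of 1 bits in a genome, and the function name `evolve` and all of its parameters are hypothetical choices made for illustration.

```python
import random


def evolve(genome_length=20, population_size=30, generations=60,
           mutation_rate=0.02, seed=0):
    """Toy genetic algorithm for OneMax (assumed example problem).

    Returns the best genome found and the best fitness after each step.
    """
    rng = random.Random(seed)  # local RNG so runs are reproducible
    fitness = sum              # OneMax: fitness is the count of 1 bits

    # Start from a random population of bit lists.
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    best = max(population, key=fitness)
    history = [fitness(best)]

    def select(pop):
        # Tournament selection: pick two at random, keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        # Elitism: carry the current best genome over unchanged.
        next_population = [best[:]]
        while len(next_population) < population_size:
            mother, father = select(population), select(population)
            # Single-point crossover.
            cut = rng.randint(1, genome_length - 1)
            child = mother[:cut] + father[cut:]
            # Flip each bit with a small probability.
            child = [1 - gene if rng.random() < mutation_rate else gene
                     for gene in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
        history.append(fitness(best))

    return best, history


if __name__ == "__main__":
    best, history = evolve()
    print("best genome:", best)
    print("best fitness per generation:", history)
```

Because the best genome is always copied into the next generation, the best fitness never decreases from one generation to the next. Swapping the `fitness` function for something more interesting is the natural next experiment.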
I would like to know what your preferred approaches and libraries are for plotting when doing data science or machine learning work in Clojure. Not so much for publishing but rather for plots during model building and evaluation.
I haven’t really used much plotting in what I’m doing right now, but if I were going to dive into it, I would most likely look at a couple of libraries that I’ve heard people speak highly of:
Thanks everyone! This has been wonderful. Thanks @martinklepsch and @plexus for arranging it. It was great talking to everyone and I’ll see you around the Clojureverse.