Since the beginning of 2021, we’ve kept up a monthly thread where people can share their hopes for the emerging data science ecosystem.
Through periodic updates, we can help each other catch up and think about the bigger picture and how our efforts tie together. It’s also a good way for each of us to remind ourselves of what we have done and what we would like to do in the near future.
It would be great if you would all consider the following questions and briefly share your views on them. Please skip anything that you find irrelevant. Keep in mind that these are only prompts to get you thinking.
Are you working on anything related to the Clojure ecosystem for data science / scientific computing / data tooling / data engineering? Let us know about it.
Have you been doing anything interesting in the last month?
Is there any new realization or change in your hopes and beliefs about the ecosystem’s future?
What are you hoping to create/learn/explore in the coming month? … and in the coming 3 months?
What developments are you hoping to see in the ecosystem and community in the coming month? … and in the coming 3 months?
Also: if you are interested in seeing what you or others have written in the past few months, here are some links to the previous threads:
I just started with Malli, but I find it very promising as a way to further “protect” the public functions in scicloj.ml from wrong data.
To me, one of the biggest “beginner hurdles” in scicloj.ml is the “cryptic errors” you get when using its API incorrectly (i.e., passing the wrong things to its functions).
I have been working on tools in the machine reading and discovery space, starting with Forth in the 1980s, then Java, and some Python these days. I am now migrating to Clojure and quite possibly Datalog. Machine reading and NLP seem ripe for Clojure/Datalog exploration.
This is part of a larger project I’m in the middle of, so there is a lot of “extra” stuff going on with curling GeoTIFFs and wrangling GeoJSON, so sorry about the huge code blocks. You can scroll past that to see the results towards the end. I think that, given what it’s doing, the result is actually pretty minimal in terms of lines of code.
This is just a proof of concept. There are lots of small issues… weird corner cases with huge shapefiles, the code isn’t designed to work in the Southern Hemisphere… stuff like that.