[link] Clojure and data: wishes, problems, and ideas

Hi there! Here is a blog post about some of the things regarding Clojure and data science that @daslu, @konradkuehne, and I discussed in Helsinki after ClojuTRE, kindly summarised by Daniel. We would love to hear your thoughts about this! https://scicloj.github.io/posts/2019-10-18-data-wishes/


Good read! I’m posting a comment here.

Of course, this also means lots of creativity, and many opportunities for
beautiful things to happen. However, being able to converge to common
practices might be important.

I would like to phrase this more strongly. Being able to converge on common practices is exceptionally important, and instrumental if we expect higher-level tools to be built upon those practices.

That being said, I’m really glad you approach this problem with openness in mind, rather than merely dictating what should be. That balance is hard. When we standardize innovation, there is going to be pain. I don’t remember his name, but one of the founders of the Internet is bitter to this day that what we have access to from Firefox is just a poor implementation of his grand vision of hypertext. What I’m saying is that there’s got to be a balance. We will lose some flexibility in the process of standardization, yet we cannot lose track of what we hope to achieve. We cannot lose touch with Clojure’s hammock-driven quality.

In my opinion.

And my impression so far is that SciCloj might actually be beating the Lisp curse, without compromising the integrity of Clojure.

Keep it up!

Teodor


Thanks @teodorlu.

Do you have some thoughts about what aspects of what we do may require standardization the most?

Honestly, @daslu, I was thinking generally; we need to appreciate common solutions, value those, and build on those.

I also realize that my experience and needs may be somewhat particular to me; in a sense, this is enumerating “what keeps me from adopting Clojure in my team?”. Specifics are good, but it’s one use case. For me, it might be “structural engineers working together on a bridge design need to develop a shared data-driven workflow”, where I’d love to see Clojure placed dead center. But as of now, Python just provides way more value in the short term, making it a hard sell. I’m not in a project like that right now, though, and it might be a while until the next one.

Some specific examples off the top of my head (with the caveat that I haven’t done proper research to verify that this is in fact as hard as I may suggest):

  • A go-to answer for how to work with data in memory. In Python, NumPy and pandas fill this space. For instance, I’d love to see an opinionated guide to higher-level use of tech.datatype. And do I have linear algebra tooling for working with that? Is there support for LU factorization (in Python, there’s scipy.linalg.lu)? Can I run an iterative equation solver?
  • A go-to solution for visualization. I’ve been looking into Oz a bit, and I’m liking what I’m seeing. Could perhaps use a few more hands, though, for instance to look at whether all those dependencies are needed in all cases.
  • A go-to editor setup for new users. Lots of interesting work here. We’ve seen Klaus’ demonstration of Clojupyter and @PEZ’s work on Calva; both aim to provide a good new-user experience.
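
For reference, here is a minimal sketch of what the Python side of the first bullet looks like, assuming NumPy and SciPy are installed. The small matrix and right-hand side are made up purely for illustration, and conjugate gradients is just one possible iterative solver; this says nothing about what any Clojure library currently offers.

```python
import numpy as np
from scipy.linalg import lu
from scipy.sparse.linalg import cg

# Small symmetric positive-definite system, chosen only for illustration.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

# LU factorization with partial pivoting, so that A == P @ L @ U.
P, L, U = lu(A)
lu_ok = np.allclose(P @ L @ U, A)

# Iterative solve with conjugate gradients; info == 0 signals convergence.
x, info = cg(A, b)
residual = np.linalg.norm(A @ x - b)

print("LU reconstructs A:", lu_ok)
print("CG converged:", info == 0, "residual norm:", residual)
```

The wish in that bullet is for an equally obvious path from an in-memory dataset to calls like these.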

And more importantly, I’d love to see work on how all of this fits together. Can we provide some recommendations? A getting started guide that provides good defaults? Scoping is such a challenge as well. Stating problems is one thing, but getting to bite-sized, solvable chunks is another.

I’m not sure whether this was really what you were asking about. But it’s as close to a problem formulation as I’ve got, for now!

Best regards,
Teodor


Thanks @teodorlu, I think this kind of problem formulation is so important to give us a sense of direction.

At Scicloj, we will be revisiting our goals and priorities in the next few weeks, looking to create a work plan.

If you (or anyone) would like to be involved in that, please let us know.


One approach that the community is taking is making it easier (hopefully even trivial) to just use Python, R, etc. from Clojure directly. libpython-clj, panthera, and some emergent R bindings are aiming for that. I personally find this less compelling due to the need for another foreign dependency, but I think it’s a really smart move (e.g. build bridges, lower the barrier to entry). On top of that, the tech stack provides a basis for shuffling data representations to/from/between these foreign libs pretty easily. That could be a near-term (maybe even long-term) fix to allow Clojure to be in the driver’s seat.


I think I’d like to be involved in that. Where do I sign up?

@teodorlu great!
Adding you to the scicloj-org Zulip stream …
