Online meeting: clojure data science

So, what does the actual rendering? Are you saying all of it is written in Clj with no C/Java/?? rendering engine???

@jsa-aerial: thi.ng is super impressive; it’s pretty much built from the ground up with few dependencies. Visualisation and data science are very much intertwined, and it would make a great native visualisation package. My main issue is that it’s developed using tangle/org mode and I have no idea how to contribute to the project.

No longer true. Author switched to a normal repo last year on the no org branch.

@jsa-aerial
I’ve been using it as a supplement to quil with some deferred rendering stuff, although there’s both a minimal webgl backend and a java opengl backend (and some projected rendering to 2d svg). They’re more of the camera, lights, triangles and shaders minimalism, with working examples using the thi.ng libraries to compute the inputs in a higher level fashion. Lots of control though.

Not a big fan of the DEVS formalism for my work. Evaluated simpy years back but dumped it in favor of a custom solution (and some other external considerations). I remember seeing some ill-advertised libs in scala and java (mostly from academia). I was pretty impressed with JaamSim (although it’s oop and heavy on mutation) talking to folks at WinterSim years back; they were aiming at competing with proprietary software while remaining open source. “Data Science” tends to generally ignore discrete event sim and optimization (bread and butter Ops Research fields) in my experience. They may remain niche/external, or poorly defined terms may end up encompassing both into “Data Science” at some point.

I intend to wrap my incanter fork around kixi.stats in the near future (no idea if upstream would dig that). Great library (also portable cljc).

So how is it going to develop? Are we creating a subsection of the forums so that we can discuss each part?

I think the current master branch has moved away from org-mode(?). I have considered starting to use org-mode myself lately, but this is the reason why we opted against it. It is kind of the best literate programming environment though. Do you still have plans to evolve hydrox?

Hey everybody!

Really nice to see all the shared interests here. I am maintaining Anglican and I am curious to hear people’s needs and interests. I am also in general working with Clojure in backend and frontend settings besides my research and think it is a nice fit. The recent Anglican version for example runs also in ClojureScript.

Best,
Christian

thi.ng creates SVG or webgl itself.

As a user and sponsor of kixi.stats I’d be into that. I worry a bit about carrying on with incanter as it is just so big.

As a company we definitely do cohort component, DES and MCMC so we’d like to see more stuff there. Not everything is a neural network. :slight_smile: (I kid! I kid! pls don’t hit me actual data scientists!)

@joinr Do you think it is worth factoring out your DES stuff from spork into its own library?

@Neo, re

There are several options to consider – let us discuss this under that separate topic:

@whilo I think I’d like to understand when I should be using Anglican and when I should be using other Bayesian and probabilistic tools. With all of these I think there are trade-offs around speed or fitting in with particular frameworks or expressiveness that we need to think about. Knowing where each of our options fits would be great.

hydrox has evolved (twice), though I’m using more of a repl based bulk namespace approach as opposed to a file-watch approach (I found it inconsistent and needed resetting once in a while). The current landscape for docs isn’t too bad with cljdoc.

Are there any crossovers between anglican and bayadera? It’d be really helpful to have some sort of a comparison matrix between not just the pp libraries but also the ml ones as well. I understand ‘conj’ and ‘assoc’ really well. It’d be great to have some of the more difficult concepts simplified, explained and compared.

I really like the API of kixi.stats, for what it’s worth.

I also love Karsten’s thi.ng libraries and have long championed them to the clojure community, but would caution us to look carefully at the feature set of VEGA. geom is a great foundation that certainly could serve this purpose, but we should be clear on how much work is involved to bring it to feature parity with VEGA.

Ah, the implementation of kixi.stats (api and guts) is all @henrygarner so he should get the kudos.

I agree that VEGA based things would be a very good place to concentrate our efforts, especially given the grammar of graphics and grammar of interaction that they’ve thought about, which appear very successful.

If we want to spend a lot of time having fun we could make that grammar also work with thi.ng (which I’m a fan of as well). I don’t think we should be betting on thi.ng directly at the moment.

We continue the discussion with some better granularity at Zulip, under the data-science stream.

For background on Zulip, see this discussion.

It is recommended to know a bit about Zulip’s concepts of streams and topics.

See you there!

Do you think it is worth factoring out your DES stuff from spork into its own library?

For various values of “worth” :slight_smile: spork is an intentionally accreted monolith, the bits of which are in various states of use (and active development). It’s definitely designed for modularity though, and I’ve looked at breaking out bits into separate libraries (but lack the external motivation) either manually or automatically. In practice, the DES stuff uses/exploits bits of the ecosystem, like an entity component system, behavior trees, and some minor stats libraries. In the mid-long run, I’d shift towards using something to extract the dependencies during publication. I also need to port some rudimentary examples (currently half baked). I’m more concerned with production than adoption at the moment though (although happy to answer questions).

Personally, I’m more interested in refactoring and applying fixes to Incanter since we continue to leverage it in internal processes. I got bogged down in refactoring the plotting implementation (all multimethods and macros, heavily tied to JFreeChart too, which involved munging through the JFreeChart docs). I’m looking hard at adding a vega backend with a compatible porcelain API from incanter.charts (rendering to browser/html, or javafx webview to eschew spinning up a server). Incanter is big, but it’s also modular (via lein-modules), so it’s manageable IMO.

I’m an Emacs Org Mode user; I usually do some simple statistics in Org Mode “Literate Programming”. I have a suggestion: we might create a GitHub README/Wiki page for Clojure Data Science, add links to the libraries, and add a detailed description of each library. Maybe add some useful Clojure Data Science knowledge there, so people can find information easily. Putting all the info in one place would help newbies find what they want. WDYT? Looking around for information is difficult for me.

Thanks @stardiviner! Emacs Org Mode is beautiful.

That is a great idea, we are working on a website – will share in a few days to get some feedback.

Hi!

We made this questionnaire to prepare for the Clojure data science online meeting.

Your response here will be extremely helpful!

Please note that no question is mandatory – e.g., you do not have to say even your name.

Thanks!

Well, my compliments to the chef :slight_smile:
