Tooling for data-science community projects - short-term plans

Hi. This post is an opinionated update following the recent monthly visual-tools meeting. It tries to clarify some plans for the coming few weeks regarding our preparation of tooling for certain community projects.

Many thanks to @Christopher_Small, @kiramclean, @Daniel_Gerson, @mkvlr, @Lukas_Domagala, @PEZ, @Paul_Iannazzo, @ritchie_cai, and others, for recent help and thoughtful discussions.

Thoughts & ideas would help. :pray:

context

Recently, @kiramclean shared an update about a few current data-science projects, including the clojure-data-cookbook and the ds4clj course.

We will create a lot of content – code & notes – in the course sessions, in participants’ projects, etc.

The main current reason for delaying the ds4clj course, as well as other study sessions and workshops, is the need to settle open questions about the tooling & setup for creating & sharing such content.

This post describes a current approach we have in mind to prepare for the course. Possibly, it would be helpful for the cookbook project as well.

needs

Our choice of tooling should satisfy a few needs:

  • data visualization: given a value, it should be possible to get it visualized

  • literate programming: given a whole namespace, it should be possible to render it as a static HTML page of notes

  • kind-inference: our tools should be able to infer the appropriate visual representation of a given value (by sensible defaults and user choices)

  • namespace-as-a-notebook: code should work as a usual Clojure namespace – people should be able to use it in their dev environments (CIDER, Calva, etc.)

  • copy-paste friendly: gradually, we’d like to write adapters to allow running & rendering the same code in different tools (Portal, Clerk, etc.)

  • easy: setup & usage should be seamless, at least in Calva, which is our recommended environment for beginners

  • stable: notes written today should work in the future

  • lean: we should converge on a stable solution within very few weeks, so as not to delay the ds4clj course any further
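
To make the kind-inference and namespace-as-a-notebook needs a bit more concrete, here is a minimal sketch in plain Clojure. It is not Clay's actual API: the namespace `kind-sketch`, the functions `default-kind`, `infer-kind` and `as-kind`, the `:kind` metadata key, and the `:kind/...` keywords are all hypothetical names chosen for illustration. The idea is only that a default is guessed from the shape of the value, and that a user choice attached as metadata overrides it.

```clojure
(ns kind-sketch
  "Hypothetical sketch of kind inference. Not Clay's actual API.")

(defn default-kind
  "Guess a kind from the shape of the value alone (the 'sensible defaults')."
  [value]
  (cond
    ;; A vector starting with a keyword looks like Hiccup, e.g. [:div "hi"].
    (and (vector? value) (keyword? (first value)))
    :kind/hiccup

    ;; A non-empty sequence of maps looks like tabular data.
    (and (sequential? value) (seq value) (every? map? value))
    :kind/table

    ;; Anything else is simply pretty-printed.
    :else
    :kind/pprint))

(defn infer-kind
  "A kind chosen by the user (stored in metadata) wins over the default."
  [value]
  (or (-> value meta :kind)
      (default-kind value)))

(defn as-kind
  "Attach a user choice to a value. Works only for values that support
  metadata (collections, symbols, etc.), not for numbers or strings."
  [value kind]
  (vary-meta value assoc :kind kind))

(comment
  (infer-kind [:div "hello"])                        ; => :kind/hiccup
  (infer-kind [{:x 1} {:x 2}])                       ; => :kind/table
  (infer-kind 42)                                    ; => :kind/pprint
  (infer-kind (as-kind [:div "hello"] :kind/pprint)) ; => :kind/pprint
  )
```

Since this is an ordinary namespace with no special syntax, it loads as usual in CIDER or Calva, and a tool-specific adapter would only need to decide how to display each kind.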

current plan

  • Clay is a work-in-progress attempt to satisfy those needs.

    • Its approach is to wrap various tools (Portal, Clerk, Scittle) with a layer of interaction and kind-inference.
    • It is still partial and unstable.
  • Following discussions with @Christopher_Small, there is a hope to achieve what we need by wrapping a subset of Oz with a subset of Clay.

  • That Clay-Oz adapter is preferable over the current Clay-Scittle adapter:

    • It will hopefully allow enjoying more advanced Oz features (e.g., live-reload, parallel computation) in the future.
    • It will allow the integration of some kind-inference concepts of Oz.
    • It will allow removing some code duplication (as the current Clay-Scittle layer actually does things that could also be done in Oz).
    • It is an opportunity to collaborate with @Christopher_Small!!
  • My current task is implementing a Clay-Oz adapter and demonstrating it with a hello-world namespace.

  • When that proof-of-concept is ready, we will discuss and maybe improve its kind inference.

  • In terms of editor environments, we will provide the needed setup (e.g., relevant keybindings) at least in Emacs & Calva.

  • When we have a stable version, we will start using it in teaching, etc.

Calva support

Our current main hopes for Calva integration are the following:

  • Implement keybindings for common tasks (that typically call corresponding Clojure functions with the appropriate context). E.g., eval-and-visualize a given form as we currently do, or statically-render a whole namespace.

  • Use the VSCode webview to view visualizations (single values as well as whole HTML pages).

  • Make all that seamless to install & use.

  • Make things work sensibly with remote machines.

    • Take care of port forwarding (as is possible in VSCode).
    • Possibly solve some performance problems for slow connections.
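
As a rough illustration of the kind of Clojure function such a keybinding might call, here is a minimal sketch of statically rendering source code as an HTML page. It is an assumption-laden toy, not how Clay or Oz do it: the namespace `render-sketch` and the functions `read-forms`, `escape-html`, `render-form` and `render-source` are hypothetical names, values are shown only through `pr-str`, and comments in the source are lost because the code is read as forms.

```clojure
(ns render-sketch
  "Hypothetical sketch of static rendering. Not Clay's or Oz's actual API."
  (:require [clojure.string :as str]))

(defn read-forms
  "Read all top-level forms from a string of Clojure source."
  [source]
  (let [reader (java.io.PushbackReader. (java.io.StringReader. source))
        eof    (Object.)]
    (loop [forms []]
      (let [form (read {:eof eof} reader)]
        (if (identical? form eof)
          forms
          (recur (conj forms form)))))))

(defn escape-html
  "Escape the characters that would otherwise be parsed as markup."
  [s]
  (-> s
      (str/replace "&" "&amp;")
      (str/replace "<" "&lt;")
      (str/replace ">" "&gt;")))

(defn render-form
  "Evaluate one form and render the code followed by its printed value."
  [form]
  (let [value (eval form)]
    (str "<pre class=\"code\">" (escape-html (pr-str form)) "</pre>\n"
         "<pre class=\"value\">" (escape-html (pr-str value)) "</pre>\n")))

(defn render-source
  "Render a whole string of source code as a static HTML page."
  [source]
  (str "<html><body>\n"
       (str/join (map render-form (read-forms source)))
       "</body></html>\n"))

(comment
  ;; An editor command could pass the contents of the current file:
  (spit "notes.html" (render-source (slurp "src/my_notes.clj"))))
```

The file path `src/my_notes.clj` is a placeholder. A real implementation would also dispatch on the inferred kind of each value instead of printing everything as text.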

adapters to other tools

In addition to the current hope for partial Clay-Oz integration, we are hoping to write adapters for additional features of Oz (e.g., the live-reload build) and other tools. That might happen only later – after our needs clarify further and when we have the capacity to work on that.

Currently, Clay does not work with the newest versions of Clerk. It does partially work with Portal. Other relevant tools, such as Goldly, the upcoming Calva Notebooks that @Lukas_Domagala recently shared, etc., are not supported yet.

Recently, Clay got some great help from @Daniel_Gerson and @mkvlr, but there is still work to do before it can work with the new Clerk.

future

Our short-term plans are very concrete at the moment, but we are trying to keep the long term in mind.

A few fantastic tools are still evolving. We should try to build a layer that can connect with them in the future, without assuming too much about them in the present.
