Memory efficient vega-lite reagent components

I’ve been experimenting with vega-lite in ClojureScript to create some simple animated graphs, using the reagent components provided by Oz from @Christopher_Small and the underlying Hanami library from @jsa-aerial.

See the visualisations I created here. The visualisation in the first tab performs OK.

The graphs are animated by updating the atom containing the data for the graph which automatically updates the graph in the DOM.

However, with more complex animated graphs the memory used by my JavaScript quickly climbs to hundreds of MB: the heap size increases with every update of the graph. See for example the animation in these tabs:

I wonder:

  • Is there a problem with garbage collection in the Oz implementation of the vega-lite reagent components that we could address? Or is the issue the way I have coded these visualisations?
  • I can’t find a function in the Hanami library for including individual vega-lite reagent components in hiccup, but does the underlying Hanami library allow for more memory-efficient updating of the graphs?

I created a page embedding a separate build to better isolate what seems to me to be a memory leak:

https://jointprob.github.io/jointprob-shadow-cljs/memory-efficiency.html

Every time the graph is updated the size of the memory heap goes up by several MB.

Here’s my code:

Super interesting problem.

I have no idea how to solve it, but I took a look at the performance profiles and it does seem like a memory leak on refresh (the number of points graphed seems pretty much irrelevant).

The call stack will be clearer if you turn off the Google Closure optimisations so the function names are not scrambled… but it probably won’t help solve the issue.
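As a rough sketch of how to keep readable names in a profile, assuming the project is built with shadow-cljs (the `:app` build id below is a placeholder, not taken from the actual project), the Closure compiler options `:pseudo-names` and `:pretty-print` can be set on the build:

```clojure
;; shadow-cljs.edn (fragment only; :app is a placeholder build id)
{:builds
 {:app {:target :browser
        ;; keep optimised output readable so heap and CPU profiles
        ;; show recognisable function names
        :compiler-options {:pseudo-names true
                           :pretty-print true}}}}
```

Running the release build with `npx shadow-cljs release app --debug` should have a similar effect without editing the config.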


I think this is related to “Memory leaks with continue updated input data” (Issue #270 in vega/vega-embed on GitHub) and could be linked to Oz using vegaEmbed repeatedly to update the chart. Apparently Vega is retaining some internal state and expects view.finalize to be called…

Another route is to just use the View API and treat the chart as a custom component with retained state that you manage. I have done a bit of lower-level work trying to get vega performant for live-updating charts, typically as part of concurrent visualizations or dashboards (or in conjunction with libs like cesium). I did not go the same route Oz did (although I used vega-tools for spec parsing, and based my original reagent component on Oz’s early example).

Instead, I use the View API and retain my own internal database of the existing views (vega View objects), and then use vega changesets to push updates to the View’s data (all wrapped with cljs functions to ease the plumbing). This ends up being substantially more efficient in practice (both in rendering and apparently memory control), although there are still some walls that Vega runs into due to its internal model (e.g. the DAG that defines transforms and views has to be run on every changeset, with user-defined transforms being particularly expensive if you are not careful).

Oz could maybe try some of the approaches mentioned in the vega thread (specifically calling finalize on the old view before invoking vegaEmbed again). The current problem with Oz may be that it retains no reference to the view that was created, so on component-will-update you have nothing to invoke finalize on (this could be remedied). There is apparently a view-callback keyword option that takes a function and exposes the View object to it.
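To make the View-plus-changeset idea concrete, here is a minimal sketch. It is not my actual code or anything from Oz: the namespace, the `views` registry, the function names and the `"points"` dataset name are all made up for illustration, and it assumes vega, vega-lite and vega-embed are loaded as the browser globals `vega` and `vegaEmbed`. The chart is embedded once, and later updates go through `vega.changeset` and `view.change` instead of another `vegaEmbed` call.

```clojure
(ns example.live-chart)

;; Hypothetical registry of live Vega View objects, keyed by a caller-chosen id.
(defonce views (atom {}))

;; The spec uses a *named* dataset so it can be addressed through the View API.
(def spec
  {:data     {:name "points"}
   :mark     "point"
   :encoding {:x {:field "x" :type "quantitative"}
              :y {:field "y" :type "quantitative"}}})

(defn mount-chart!
  "Embed the spec once into DOM node `el` and remember the resulting View."
  [id el]
  (-> (js/vegaEmbed el (clj->js spec) #js {:mode "vega-lite" :actions false})
      (.then (fn [result]
               (swap! views assoc id (.-view result))))))

(defn replace-data!
  "Swap the contents of the \"points\" dataset without re-embedding the chart."
  [id rows]
  (when-let [view (get @views id)]
    (let [cs (-> (.changeset js/vega)
                 (.remove (.-truthy js/vega)) ; predicate matching every existing row
                 (.insert (clj->js rows)))]
      (-> view
          (.change "points" cs)
          (.run)))))

(defn unmount-chart!
  "Finalize the View so Vega can release its listeners and timers, then forget it."
  [id]
  (when-let [view (get @views id)]
    (.finalize view)
    (swap! views dissoc id)))
```

Usage would be something like `(mount-chart! :main el)` once, then `(replace-data! :main [{:x 1 :y 0.3} {:x 2 :y 0.7}])` on every tick, and `(unmount-chart! :main)` when the chart goes away.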

So maybe something like:

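;; Holds whatever the last :view-callback invocation received.
(def current-view (r/atom nil))

;; Finalize the previously stored view (if any) before remembering the new one.
;; Assumes the callback receives the vegaEmbed result, so the View is under .-view;
;; if Oz passes the View itself, call (.finalize @current-view) instead.
(defn finalize! [res]
  (when @current-view
    (.finalize (.-view @current-view)))
  (reset! current-view res))

(defn page []
  [:div
   [buttons]
   [oz/vega-lite
    (g/point-chart
     (g/data (range) (take (:no-of-points @app-state) points))
     (g/titles  (str (:no-of-points @app-state)
                     " points")
                "point number"
                "rand number"))
    {:view-callback finalize!}]])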

I have been looking at Hanami to see how they handle using vega-lite but can’t get it running. Here is my attempt at getting hanami working with shadow-cljs:

Would be cool if @jsa-aerial had some feedback of where I might be going wrong.

I am getting this far, to displaying the tabs:

But when I try to run this in the correct namespace:

(->
   (hc/xform ht/bar-chart
             :TITLE "Top headline phrases"
             :X :x-value :Y :y-value :YTYPE "nominal"
             :DATA
             [{:x-value 8961 :y-value "will make you"}
              {:x-value 4099 :y-value "this is why"}
              {:x-value 3199 :y-value "can we guess"}
              {:x-value 2398 :y-value "only X in"}
              {:x-value 1610 :y-value "the reason is"}
              {:x-value 1560 :y-value "are freaking out"}
              {:x-value 1425 :y-value "X stunning photos"}
              {:x-value 1388 :y-value "tears of joy"}
              {:x-value 1337 :y-value "is what happens"}
              {:x-value 1287 :y-value "make you cry"}])
   hmi/sv!)

I get an error in the console:

This current-view idea didn’t seem to help unfortunately.

@jsa-aerial can correct me if I am misreading, but it looks like the update model in hanami is following a similar style that Oz used: namely leveraging vegaEmbed and reagent component update methods:

The wrapper function visualize ends up calling vegaEmbed as Oz does, and without any finalize invocation or hook. visualize is then used inside the react class definitions for the vgl and vg components.

I expect that if the issues I mentioned before are causing the memory leak (namely Vega’s expectation of having finalize invoked to allow gc), then you will see the same problem in hanami since the reliance on vega is similar.

I noticed folks in the vega issues thread still getting memory leaks, and that it was unresolved to date. I am betting vegaEmbed is the prime suspect for the moment.


OMG, thanks so much for raising this issue @jamiepratt. And doubly so to @zcaudate and @joinr for helping to track down the issue. I think you’re right @joinr; it makes a ton of sense that it’s just not GCing without finalize being called.

I have a feeling that always calling this after creating the view would be problematic for certain applications that need to register their own callbacks or signal handlers. But I think we could probably use React’s componentWillUnmount to call finalize, which should do the right thing.
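For illustration only, a lifecycle-based wrapper might look roughly like the sketch below. This is not Oz’s implementation: the namespace, `vega-lite-chart` and `finalize-view!` are hypothetical names, and it assumes Reagent plus a global `vegaEmbed`. It finalizes on unmount via `:component-will-unmount`, and also finalizes the previous view whenever the spec changes and the chart is re-embedded; dropping that second call would leave only the unmount hook. Whether this actually stops the heap growth reported in this thread is untested.

```clojure
(ns example.vega-component
  (:require [reagent.core :as r]))

(defn finalize-view!
  "Finalize the View held in `view-atom` (if any) and clear the atom."
  [view-atom]
  (when-let [v @view-atom]
    (.finalize v)
    (reset! view-atom nil)))

(defn vega-lite-chart
  "Hypothetical component: embeds `spec` and cleans up the Vega View it creates."
  [spec]
  (let [node   (atom nil)   ; DOM node, captured through a ref callback
        view   (atom nil)   ; plain atom, so storing the View never triggers a re-render
        embed! (fn [spec]
                 (when-let [el @node]
                   (-> (js/vegaEmbed el (clj->js spec) #js {:mode "vega-lite"})
                       (.then (fn [result]
                                ;; release the old View before keeping the new one
                                (finalize-view! view)
                                (reset! view (.-view result)))))))]
    (r/create-class
     {:display-name           "vega-lite-chart"
      :component-did-mount    (fn [_this] (embed! spec))
      :component-did-update   (fn [this _old-argv]
                                ;; argv is [component-fn spec]; re-embed with the new spec
                                (let [[_ new-spec] (r/argv this)]
                                  (embed! new-spec)))
      :component-will-unmount (fn [_this] (finalize-view! view))
      :reagent-render         (fn [_spec]
                                [:div {:ref #(reset! node %)}])})))
```

One known gap in this sketch: if the component unmounts before the `vegaEmbed` promise resolves, the late-arriving View is stored but never finalized, so a real implementation would need a guard for that case.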

Feel free to open an issue for this on GH.

Thanks again!


Thanks for looking into this @Christopher_Small!