Increasing compilation performance

mtruyens · February 23, 2020, 1:27pm

As the size of my codebase keeps expanding (currently around 120K LOC, split between CLJ/CLJC/CLJS files), the compilation time keeps expanding as well. Particularly in CLJS — but also when refreshing all namespaces in a CLJ REPL — I frequently have to wait up to 20 secs. We are obviously spoiled as compared to other languages, but still I find this breaks my development rhythm.

Do any of you have interesting suggestions to truly optimize the compilation speed? My development machine is an 8-core iMac Pro, and I use OpenJDK 8, Cursive, Leiningen & ShadowCLJS. For various reasons, I must use OSX as my main environment.

I wonder whether any of the following would be good ideas:

Compile & REPL on a more powerful (e.g., ThreadRipper) Linux box. Does anyone have benchmarking data for CLJ/CLJS projects on how additional cores could make a difference? I am somewhat hesitant, because of the overhead of using an additional machine. Also, if I understand correctly, IntelliJ doesn’t like having its working files stored on a networked folder.
Experimenting with another JDK and/or with the JVM settings, whether or not on OSX. Does anyone have experience with Azul, or with compiling their own JDK as in http://august.nagro.us/optimized-openjdk.html?

By the way, is there a reason why CLJ-compilation seems single-threaded? Kind of ironic for a language focused on multi-threading, and given that CLJS-compilation is partially multi-threaded…

Many thanks!

Maarten

thheller · February 23, 2020, 8:56pm

I’d be interested in the CLJS recompile times you see in watch?

release will likely take a long time due to :advanced optimizations and there isn’t much you can do to speed that up. It is single-threaded, so only core speed counts.

CLJS compilation does use multiple threads, but it is unlikely that you will see high gains by going to more cores. 8 is plenty, as it all depends on your namespace setup. So unless you have a whole bunch of namespaces that don’t depend on each other, your 8-core is fine.

As for CLJ, I very much doubt that you need to refresh all namespaces? It should usually be fine to redeclare a single var or just load-file to refresh a single ns. Reloading more is typically only required if you declare protocols/deftype and have instances of those in your state. For that it can help to move the protocols to secondary .protocols namespaces that you don’t change that often. CLJ loading is single-threaded, so nothing can really be done on that front.
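
As a rough illustration of refreshing a single ns with load-file instead of reloading everything, the sketch below writes a throwaway file and loads it twice. The temp file, the `demo.pricing` namespace and the `fee` function are hypothetical stand-ins, not taken from the project discussed in this thread.

```clojure
;; Hypothetical scratch file standing in for one source file of a project.
(def src-file (java.io.File/createTempFile "pricing" ".clj"))

;; First version of the namespace.
(spit src-file "(ns demo.pricing)\n(defn fee [amount] (* amount 2))")
(load-file (.getPath src-file))

;; Look the var up once; callers that go through the var see later redefinitions.
(def fee-var (resolve 'demo.pricing/fee))
(def before (fee-var 100))

;; Edit the file, then reload only this one namespace.
(spit src-file "(ns demo.pricing)\n(defn fee [amount] (* amount 3))")
(load-file (.getPath src-file))
(def after (fee-var 100))
```

Here `before` is computed with the first definition and `after` with the second, without touching any other namespace. Plain functions reload cleanly this way; records and protocols are the cases that need more care.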

mtruyens · February 23, 2020, 10:24pm

Hi Thomas,

Thanks for the reply on a Sunday.

I am mostly interested in the non-optimized build process, because of the edit-compile-run cycle.

I indeed have quite a lot of namespaces depending on each other, so that would argue against more cores. However, with about 191 CLJS files and 100 CLJC files to compile, I would imagine that there should be some level of parallelism possible. Hence my question whether anybody had experience with very beefy machines.

(In any case, I am eagerly awaiting your new release of ShadowCLJS, in which the double-compile macro bug is resolved, which will speed up some things for me!).

As for CLJ, I often have to refresh namespaces because of defrecord’s inside them. Since my code does a lot of (instance? …) checks, which are then also stored in Specs, simple refreshes of a few namespaces tend to leave behind stale references to the old classes, causing the (instance? …) checks to fail. I sometimes wonder how other Clojurists deal with this issue of stale classes, so any tips here would be appreciated.
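
A minimal sketch of how such a stale class arises, assuming a hypothetical `Clause` record (not one from the actual codebase): re-evaluating a defrecord produces a new class with the same name, and instances created earlier still belong to the old one.

```clojure
;; Hypothetical record standing in for one of the project's defrecords.
(defrecord Clause [text])

(def old-clause (->Clause "Governing law"))
(def fresh-check (instance? Clause old-clause))

;; Re-evaluating the defrecord (which is what a namespace reload does) creates
;; a NEW class with the same name; old-clause still belongs to the old class.
(defrecord Clause [text])

(def stale-check (instance? Clause old-clause))
(def rebuilt-check (instance? Clause (->Clause "Governing law")))
```

`fresh-check` and `rebuilt-check` are true, while `stale-check` is false even though both classes print with the same name. Any instance or class reference captured before the reload (in state, in a spec predicate) keeps pointing at the old class.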

didibus · February 24, 2020, 1:45am

Wow, 120k LOC! I remember a talk a few years back when someone was claiming 34k LOC was the biggest Clojure code base ever.

Unfortunately, I have no advice for you really, as I don’t have a single code base that is remotely that large.

linpengcheng · February 24, 2020, 3:35am

(diagnostics for Boeing’s 737 MAX) Built using more than 34,000 lines of Clojure code, it is one of the largest Clojure code bases to date.

Then my project, Lin Pengcheng Financial Analyser, once claimed to be the largest pure Clojure (jvm/script/clr) project, and it is still the largest personal pure Clojure project, with close to 100k LOC of Clojure (jvm/script/clr) code. It is also one of the few projects written for all three platforms officially supported by the Clojure language.

thheller · February 24, 2020, 9:40am

2.8.84 is out now.

I’d still be interested in your watch recompile times. Large projects like yours are rare and unfortunately none of them are open source. It is hard to collect data on how things might be improved for larger projects.

One thing I sometimes do is move defprotocol, deftype, defrecord into their own .protocols namespace. Even clojure.core does this. If you have the deftype, defrecord in such a namespace you can still use extend-type elsewhere to implement/change actual functionality. Unless you need to change the fields you don’t have to reload that ns. This isn’t perfect either but saves some reloads.
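
A minimal sketch of that layout, with hypothetical names (`demo.protocols`, `demo.impl`, `Render`, `Clause`). Both namespaces are shown in one snippet so it can be pasted into a REPL; in a project they would be two files.

```clojure
;; Rarely reloaded: only the protocol and the record's fields live here.
(ns demo.protocols)

(defprotocol Render
  (render [this]))

(defrecord Clause [text])

;; Frequently edited: behaviour is attached from outside with extend-type.
(ns demo.impl
  (:require [demo.protocols :as p])
  (:import [demo.protocols Clause]))

(extend-type Clause
  p/Render
  (render [c] (str "v1: " (:text c))))

(def clause (p/->Clause "Governing law"))
(def first-render (p/render clause))

;; Re-evaluating only the extend-type changes behaviour for the existing
;; instance; the record class itself is never redefined.
(extend-type Clause
  p/Render
  (render [c] (str "v2: " (:text c))))

(def second-render (p/render clause))
```

The same `clause` instance renders with the new implementation after the second extend-type, because the class it belongs to was not recreated. The trade-off is that extend-type dispatch is somewhat slower than an inline implementation in the defrecord.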

ShiTianshu · February 24, 2020, 4:05pm

Off topic, but I wonder how much RAM it needs to run all this stuff.

mjmeintjes · February 24, 2020, 6:17pm

I also do the separate protocol namespaces pattern, and also wrap my protocol definitions in defonce. It seems to work as we haven’t had any mismatched class bugs since I started doing that.

mtruyens · February 24, 2020, 10:11pm

Wow, this new version of Shadow-CLJS (which avoids recompilation when macros are used) really makes a huge difference for me. Suddenly most changes stay below 1.3 seconds, and even changes to utility namespaces that are used by many, many other namespaces stay below 5 seconds.

THANKS!

Some sample output from my watch (if this is not what you are looking for, please let me know!):

[:app] Configuring build.

[:app] Compiling …

[:app] Build completed. (539 files, 538 compiled, 0 warnings, 32.72s)

[:app] Compiling …

[:app] Build completed. (539 files, 7 compiled, 0 warnings, 1.36s)

[:app] Compiling …

[:app] Build completed. (539 files, 4 compiled, 0 warnings, 1.95s)

[:app] Build completed. (539 files, 68 compiled, 0 warnings, 4.57s)

[:app] Compiling …

[:app] Build completed. (539 files, 68 compiled, 0 warnings, 4.79s)

[:app] Compiling …

[:app] Build completed. (539 files, 10 compiled, 0 warnings, 1.18s)
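
For anyone wanting to summarize watch output like this, here is a small sketch that parses the "Build completed" lines quoted above and separates the initial full build from the incremental recompiles. The function and var names are made up for illustration, and the regex only assumes the line format shown in this post.

```clojure
;; Build lines copied from the watch output above.
(def watch-log
  ["[:app] Build completed. (539 files, 538 compiled, 0 warnings, 32.72s)"
   "[:app] Build completed. (539 files, 7 compiled, 0 warnings, 1.36s)"
   "[:app] Build completed. (539 files, 4 compiled, 0 warnings, 1.95s)"
   "[:app] Build completed. (539 files, 68 compiled, 0 warnings, 4.57s)"
   "[:app] Build completed. (539 files, 68 compiled, 0 warnings, 4.79s)"
   "[:app] Build completed. (539 files, 10 compiled, 0 warnings, 1.18s)"])

(defn parse-build-line
  "Returns {:compiled n :seconds s} for a 'Build completed' line, else nil."
  [line]
  (when-let [[_ compiled seconds]
             (re-find #"(\d+) compiled, \d+ warnings, ([\d.]+)s\)" line)]
    {:compiled (Long/parseLong compiled)
     :seconds  (Double/parseDouble seconds)}))

(def builds (keep parse-build-line watch-log))

;; The first build compiles everything; the rest are incremental recompiles.
(def incremental (rest builds))

(def slowest-incremental (apply max (map :seconds incremental)))
(def mean-incremental
  (/ (reduce + (map :seconds incremental)) (count incremental)))
```

On these six lines the slowest incremental recompile is the 68-file one at 4.79s, and the mean over the five incremental builds comes out at roughly 2.77s.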

mtruyens · February 24, 2020, 10:14pm

Hi, I also use this technique of isolating defrecords & defprotocols (and related specs) in separate namespaces. Indeed, it works nicely in avoiding class bugs.

However, since I frequently have to make (even if just minor) changes to the Specs, I do have to reload the namespace, and then also several other namespaces. Hence the slowness question.

mtruyens · February 24, 2020, 10:17pm

About 3 to 4 GB of RAM is necessary for the advanced compilation of the CLJS files. Also, on most machines (including my cloud server), the advanced compilation takes about 150 seconds, which isn’t so bad. The result is a single JS file of about 6 MB in size (I have not yet put much effort into optimizing the size, as users tend to reload it only infrequently).

RAM usage during production use is between 500 MB and 1500 MB. It is a system for storing contractual clauses and producing PDF/DOCX files, so not very RAM-intensive on the CLJ side.

jumar · February 25, 2020, 9:57am

Is this an open source project? I can only see the README file.

linpengcheng · February 26, 2020, 2:45pm

It isn’t an open-source project.

Although GitHub is an excellent platform for IT professionals to showcase their talents and share results, I feel that the platform (or the IT community) lacks the ability to actively advertise the achievements of contributors to end users. As a result, the honor they receive is far lower than their contribution.

This project provides such a platform through a free, full-featured publicity version, a thanks list, rule naming (default submitter name, submitter-chosen naming, rules I collected and named after other helpers), thanks-list information, watermarks, references, and other strategies, which proactively publicize and thank the people who helped me and this project. In this way, it can achieve a virtuous circle that is beneficial to contributors, users, and developers.

The goal is to build a large-scale and influential worldwide financial analysis platform and standard library of financial rules, to fill a gap that exists worldwide.

Closed source can achieve this more effectively.

jwr · February 27, 2020, 9:57am

I can definitely relate, having very similar problems with a sizeable clj+cljs application (I am at roughly 50kLOC). Development on a laptop machine is something I’ve pretty much abandoned, because things are simply too slow. For a while I tried using cloud servers (similar train of thought to yours), and found that there are HUGE differences between the actual CPU performance that you get from cloud providers. I wrote up the results in a blog post: https://jan.rychter.com/enblog/cloud-server-cpu-performance-comparison-2019-12-12 in case anyone is interested. I found that using Docker and syncthing it’s possible to get a remote-development environment where you offload your compilation but still keep your editing on your desktop, but I use it only if I’m stuck without a powerful desktop. It’s always better to have a desktop workstation.

My net takeaways were:

Single-core performance is what matters most for interactive development.
Intel Xeon processors are dog slow. They are mostly good for cloud providers, because they let them sell overbooked capacity with a minimal hardware investment. Don’t believe the marketing hype, benchmark for yourself.
Intel desktop processors are the best option. If you can live with Linux (I do sometimes, it’s annoying but usable), get a top of the line Intel desktop chip and you’ll have a speedy workstation.
In the cloud, get a physical machine with an Intel desktop processor (like Hetzner’s EX62-NVMe) or if you want to rent by the hour, go with Vultr or Digital Ocean for the best price/performance.
In the Mac world, avoid the iMac Pro and get an iMac instead, if you can live with the occasional fan noise. The iMac is roughly half the price and will actually be faster than the iMac Pro for the workloads we’re talking about, in spite of what you might hear from podcasters, etc.

After doing the benchmarking, I just bought an iMac with the i9-9900K and do most work on that.

Needless to say, I am immensely thankful to the ClojureScript team whenever there are improvements in compilation speed.

mtruyens · February 27, 2020, 3:44pm

Very, very useful information. Many thanks!

I can agree with your assessment of the iMac Pro versus iMac. About two years ago, the regular iMac was stuck at some older processor, so I had to bite the bullet to buy the very expensive iMac Pro. It seriously hurt financially (we’re a small startup), especially when you consider that the machine is at the same time too fast (most cores are sitting idle most of the time, and the graphics card definitely is) and not fast enough (for single-threaded Clojure work). You’re constantly thinking that you paid too much because developers were not the target audience for this machine — it’s clearly built for video editing.

That being said, I really, really like the machine. I have never heard its fan, it is extremely stable, and while it may not be the fastest, it is still quite fast for what it’s doing. I’d probably buy the regular iMac nowadays, but still I’m a satisfied customer.

l3nz · February 28, 2020, 9:19am

Not sure this can be counted as a success

linpengcheng · February 28, 2020, 9:59am

The Boeing 737 MAX problem is a design error, not a fault. So one can’t say the diagnostics system is not successful.

system · August 28, 2020, 9:59pm

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.