Clojure should consider MoarVM as a backend

I don’t have opinions about the issues discussed above, but I’d like to understand one point that I’m confused about. I used Linux for many, many years, but for the last fifteen years my primary unix has been MacOS, and I’m not familiar with the recent Linux world.

My question is, what is it that makes JVM-based applications incompatible with Linux packaging systems? Is it just that the packaging systems want to compile from source? (That was mentioned briefly above.) If so, that seems like a policy choice of the package systems, and one that could be changed. For machine-specific binaries, compiling from source makes sense, but one of the points of the JVM was to avoid the need for machine-specific binaries. It means there’s a step that simply can be skipped.

If there is a worry that a jar file might have malicious code baked in, that doesn’t seem to me like a packaging system issue. It means that one should never download other people’s jar files at all, packaging system or not.

If that’s the concern, why can’t there be Linux package repositories with all of the Java, Scala, Clojure, etc. source needed to rebuild JVM-based packages (using something like OpenJDK)?

Some of the points above can be made about Javascript as well.

I think this is specifically about Gentoo Linux, which is a “source-based” distribution and focuses heavily on compiling everything from source as much as possible.

I haven’t. I’m merely pointing to MoarVM as a “juicy” compile target for any language. I’m personally learning Raku. I know my niche which isn’t language development.

Raku isn’t better than other specialized programming languages with regard to functional programming or any specific programming methodology. But it covers the baseline for various programming paradigms. It is consistent and has tools for everything. The tools aren’t as optimized, but they do the job. I’d say it is a balanced approach to language design: by achieving the baseline in various aspects of a language, it achieves balance.

GNU Guix System focuses more on building everything from source. GNU Guix developers even compile the JVM from source. Most Linux distributions and BSD operating systems try to build as much as practically possible from source, but they don’t go as far as building the JVM from source.

I was curious to see what MoarVM provides, and it does look interesting. However, there is next to no documentation on the instruction set it uses (or I just wasn’t able to find it). Do you by chance know where to find such documentation? Perl was my first general-purpose programming language, and I still have good memories of my time learning programming with Perl.

How is packaging Python apps any easier exactly? I feel Python has all the same issues as JVM, Node, Ruby, etc.

I’d also like to know about Raku, how does it get around this issue of Linux packaging?

From my current understanding, only C/C++ is compatible with Linux source distributions, because C and C++ do not have their own dependency management. That means for years they relied on the OS and the user installing all the dependencies that need to be linked at runtime, or they simply bundled everything statically linked. Because of that, the C and C++ ecosystem just doesn’t have a lot of libraries, and the ones it does have are massive, so as to limit the number of dependencies, and the OS packagers have done all the hard work of making all possible dependencies available already. It’s not that it’s easier, but that a lot more effort has always gone into them, so you find it easier to add one more, since chances are all the dependencies will already be there for C or C++.

Another thing is I feel some of it is ideology. I know that sounds weird, but hear me out. You want to compile everything from source, but you allow the C compiler itself not to be compiled from source. Obviously something has to bootstrap things. You can’t compile everything from source, because what would do the compiling? You need to start with an already bootstrapped compiler somewhere. For some reason Gentoo says the C compiler is fair game, maybe even the C++ compiler? Aren’t they the same? Anyways, why don’t you allow the Clojure compiler the same benefit? The JVM is effectively the Java and Clojure compiler, so why can’t you allow it to be bootstrapped not from source, the same as you’d allow for the C and C++ compiler? It seems simply unfair, and it comes out of a strange principle which I just don’t grasp personally.

Anyways, that’s my 2 cents.

The bootstrap starts with very simple, short compilers hand-written in assembly. So that assembly is your directly runnable source code.

Feels like it was more true pre-deps.edn, no? Now it’s really easy to point to GitHub for your dependencies. If Clojure itself were bootstrappable (sounds like that’s difficult), couldn’t you just stick to source code dependencies? Granted, only the simplest projects don’t leverage any Java libraries on Maven, so it’d really limit what Clojure programs could be packaged. But you could make full desktop apps with OpenJDK, JavaFX, Clojure, and cljfx, for instance.

You are supposed to target NQP instead of MoarVM. NQP is a small language that you can use to write compilers. NQP can be compiled to MoarVM and JVM.

The JVM, nodejs, and deno don’t have any standard for flat, “non-recursive” directories for system modules. node_modules is recursively structured. deno has no notion of system modules. The JVM doesn’t, either. Python, Ruby, Perl, Raku, and Haskell come with a flat, non-recursive system module directory. That’s why they integrate seamlessly with Linux package systems.

Build tools for those languages merely install modules into user module directories or the current project directory for “development” purposes. For permanent system deployments, they provide tools for installing modules into system module directories.

I prefer languages that are bootstrapped from other languages because this eases porting to various platforms and minimizes trust in binaries. C is actually bootstrapped from assembly. Ideally, languages bootstrapped from assembly should be used to bootstrap higher level languages. This way, assembly language directly and indirectly bootstraps everything.

Binary compiler is not a deal breaker. The real deal breaker was lack of integration with linux package systems.

When an uberjar or a bundle of some kind is the right deployment model, I have little problem with JVM or javascript ecosystem, but bundles haven’t been the right deployment model for a long time personally. For writing code for web pages, deno and nodejs are perfectly fine. But, I don’t need to write code that runs on web pages.

You might wanna check out luajit. It’s good for both systems as well as web pages via openresty. If you just want a lisp for raku, check out mal (the impls directory of kanaka/mal on GitHub).

What aspects of clojure are must-haves for you that warrant a port? Do you need immutable data structures? Do you need the repl? Do you want to copy and paste clojure code and have it running on MoarVM? I dunno, it seems a bit facetious talking about OS file formats when you’re asking to layer a completely new runtime onto a tech optimised for perl. I’d be more excited if you suggested clojure on something like haxe.

It’s worth noting that immutable data structures do have costs associated with them, as does adding a secondary runtime. It’s possible, but my question is still very much why?

Languages don’t compile to lua bytecode because LuaJIT bytecode changes frequently. Languages compile to lua. It would be better if there were an intermediate language like NQP (Not Quite Perl) for lua.

mal is a learning tool. It is not trying to be a serious language.

Nothing warrants a port. I’m just pointing to MoarVM as a juicy compilation target. I don’t tell people what they must do.

Personally, I would like to use clojure as a scripting language on MoarVM, but there is no rush to use another language on MoarVM because I have no problem with Raku. Raku is a serious scripting language designed to last a century. I believe clojure was also designed by Rich Hickey to last long.

MoarVM is cool because it has module repositories for users, administrators, distribution packages, and raku core. For development, I install modules into user repository. Linux packages install modules into vendor repository. Various use cases are already accounted for on MoarVM.

I don’t know much about haxe. I will look into it.

I meant for Gentoo. Are you saying Gentoo uses a C compiler written in assembly to compile an old version of GCC, which is then used to compile the latest version of GCC?

I highly doubt it. They probably use the GCC build instructions, which use an older version of GCC to compile. That is a little like Clojure needing an older version of Clojure to build new versions.

I’m not sure I follow exactly. Can you walk me through like how you’d package a Python application for Gentoo?

The standard for Java (at least pre-11) is that you can put all your modules in one folder and set your CLASSPATH environment variable to that folder.

So the folder could be:


And inside it you’d have:


So Gentoo could simply compile all libs and put them in that folder and then set CLASSPATH.

But the problem, and I feel Python has the same issue, is with regards to versioning. Different apps might need different versions, and they might not work if the wrong combination of versions is used. Which means you’ll probably need per-app version definitions, and instead of setting the global CLASSPATH you’ll need one for each app you launch.

Can you show me where this is done in Gentoo? I’d be really surprised. Pretty sure it’s bootstrapped using GCC, not using an assembly C compiler.

This is where you lose me. If I want to release a Python application on Linux, how will I specify the exact versions of libraries I need for my application to work correctly at runtime? How will that not conflict with another application that depends on a different version than mine?

Other than that, you should be able to package Java and Clojure apps just fine. Like I said, you can easily just dump all the built packages in one big folder and add it to the CLASSPATH. Then run any Java or Clojure app and it will find the packages from that folder, but getting the versions right will be the challenge.

Otherwise, to handle versions, you can put each version of each lib in its own versioned subfolder, but then when you launch a given app you need to specify, on a per-app basis, which versions it should use.
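
To make the per-app idea concrete, here is a minimal sketch in Python. It assumes a made-up layout of `<root>/<lib>/<version>/<lib>.jar`; the root path, the library name `somelib`, and the version numbers are all hypothetical, and this is not how Gentoo actually lays things out.

```python
import os
from pathlib import Path


def build_classpath(lib_root, pinned):
    """Return a CLASSPATH string for one app from a {library: version} map."""
    entries = []
    for name, version in sorted(pinned.items()):
        # Hypothetical layout: <lib_root>/<name>/<version>/<name>.jar
        entries.append(str(Path(lib_root) / name / version / f"{name}.jar"))
    # os.pathsep is ":" on Unix-like systems and ";" on Windows.
    return os.pathsep.join(entries)


# Two apps pin different versions of the same (hypothetical) library.
LIB_ROOT = "/usr/share/java-libs"  # assumed location, for illustration only
app_a = build_classpath(LIB_ROOT, {"clojure": "1.10.3", "somelib": "1.2.0"})
app_b = build_classpath(LIB_ROOT, {"clojure": "1.10.3", "somelib": "2.0.0"})
```

A per-app launcher script could then pass the resulting string to `java -cp`, so the two apps never see each other’s version of `somelib`.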

Okay… I’m wrong about NQP, as it’s the Perl version of RPython. It’s kinda cool, though Perl is too much like bash for my liking. But I guess that’s probably the reason why it’s going to last.

Though targeting a metavm might sound good in theory, I haven’t seen any decent cross language metavm implementations bar the language that it’s written for (raku, pypy). Transpiling is the simpler and more practical option.

Plus it seems that raku is a bit of a bitch to compile from source

I’m not sure why you’re dismissive of luajit/lua. Luajit’s small enough to be compiled from source in about a minute, and the VM is great and almost C-like in speed because the language is so simple. It’s used in a ton of games, and also in neovim and openresty (2nd most popular web framework at the moment).

And there’s luarocks for packaging… which may or may not be what you need. It also installs from scratch in under 10s.

luajit also has fantastic performance for numeric code - approaching/beating GCC and Julia in some cases.

+1 here for Lua. We have considerable success working with Lua.

@cnuernber are you using torch or just straight ffi?

Yes. Mike Pall deserves all his flowers. I use a modified redis that is about 30-1000% faster for EVAL (depending on the lua code).

This was at NVIDIA. Our UI system was really a specialized graphics engine (real-time realistic materials such as carbon fiber or car paint) with lua driving a state-chart system. I haven’t yet built dtype-next FFI bindings to luajit as we have for julia. I do, however, keep track of the Julia performance metrics, and as I mentioned, LuaJIT often scores extremely well.

Yeah, Julia’s got a fantastic ecosystem - and really switched on users and devs. This talk really caught my attention regarding Julia. It’s just a super practical application of probabilistic programming to something that I wouldn’t have even thought about being related to that field.

The only downside for me (as it may be an upside for others) is that the language is pretty big and it’s typed. So it can be a bit of a struggle against the compiler. I also find that the startup time can be really unpredictable (not sure what the jit is doing). I’d actually prefer if there was a way to write Julia asts directly and send it to the interpreter.

Speaking of jits, emacs just released native compilation with a bunch of features including async compilation and hotloading. It also now has its own React-style framework, ebpa/tui.el on GitHub (an experimental text-based UI framework for Emacs modeled after React, requires emacs 26.1), so it may just end up being the best way to distribute code, especially dev tooling.

Gentoo Python packages use a set of shared functions that install packages into the site-packages folder.

Gentoo allows multiple versions of the same package to be installed. But each Python application can use only one version of each Python library. That cannot be worked around because it’s Python’s limitation. The convention in the Python community seems to be to attach the major version number to package names, such as dev-python/PyQt4 and dev-python/PyQt5.
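
The one-version-per-application limit is easy to see from Python itself. Here is a rough sketch, assuming a throwaway module named `demo_lib` (hypothetical) written into two temporary directories that stand in for two installed versions. Whichever directory comes first on `sys.path` wins, and any later import of the same name is served from the `sys.modules` cache.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_lib(root, version):
    """Write a tiny demo_lib.py (hypothetical module) that records its version."""
    directory = Path(root) / version
    directory.mkdir()
    (directory / "demo_lib.py").write_text(f"VERSION = {version!r}\n")
    return str(directory)


with tempfile.TemporaryDirectory() as root:
    v1 = make_lib(root, "1.0")
    v2 = make_lib(root, "2.0")
    sys.path[:0] = [v1, v2]  # both "versions" are on the search path
    importlib.invalidate_caches()
    try:
        first = importlib.import_module("demo_lib")
        second = importlib.import_module("demo_lib")  # cached, not re-resolved
        loaded_version = first.VERSION
        same_object = first is second
    finally:
        # Undo the search-path change so the sketch leaves no trace.
        sys.path.remove(v1)
        sys.path.remove(v2)
```

Only the first directory is ever consulted for `demo_lib`, which is why the version ends up encoded in the package name instead.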

Python has the notion of a site-packages folder.
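
For anyone who wants to see where those folders are on their own machine, this small sketch asks the interpreter directly. It uses only the standard `sysconfig` and `site` modules. The variable names are mine, and the actual paths depend on the distribution and on whether a virtual environment is active.

```python
import site
import sysconfig

# Directory where pure-Python packages for this interpreter are installed.
# Distribution packages (e.g. from a Linux package manager) typically land here.
system_dir = sysconfig.get_paths()["purelib"]

# Per-user directory, used for development installs such as `pip install --user`.
user_dir = site.getusersitepackages()

print("system site-packages:", system_dir)
print("user site-packages:  ", user_dir)
```

Both are single flat directories: one entry per package name, with no nesting of a package’s dependencies inside it.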

On MoarVM, one application can depend on multiple versions of a MoarVM module.

It seems GCC is self-compiling. But, it still uses a lot of assembler code in machine descriptions.
While self-compilation is not a deal breaker for me, I want to minimize the number of self-compiling compilers on my system. Fortunately, Raku bootstraps from NQP which bootstraps from C and other languages.

JVM wasn’t designed to put everything in one entry in CLASSPATH. Gentoo doesn’t do that. I don’t want to use JVM in ways that weren’t intended by JVM designers.

And, if I dump everything in one entry of CLASSPATH, I would lose ability to install multiple versions of one JVM library.

There are complex attempts to make up for JVM’s poor module system, but they haven’t been adopted widely due to complexity.

Raku is bootstrapped from NQP which any language targeting MoarVM should compile to. Transpiling is simpler only when the target language is not a moving target.

Transpiling clojurescript to javascript was a disaster because it was a moving target. Any intermediate language designed as a compilation target tries not to be a moving target.

Wrong. In practice, it’s easy to compile raku from source because it is bootstrapped by a lower-level language, NQP, which was designed as a target language for compilers. I read the Gentoo source package for dev-lang/rakudo, and it is very simple.

Any language compiler that is bootstrapped from other languages usually isn’t hard to build.

There are already packages for moarvm, nqp, and raku on gentoo linux and debian.

It was generally easy and simple for me to create gentoo packages for raku modules.

I don’t dismiss lua. I just prefer MoarVM because it has a more advanced module system which I want.

It took 2 minutes to build moarvm, 1 minute to build nqp, and 4 minutes to build rakudo.
In total, it took 7 minutes to build moarvm, nqp, and rakudo. That’s still not much.

Lua was created and marketed as an embedded language rather than a system scripting language. Raku was designed for system programming. Thus, its module system is more suitable for system programming.

If lua tries hard not to be a moving target, it would be a great compilation target for other languages. But, I don’t know much about lua. I don’t know whether it tries not to be a moving target.