Clojure should consider MoarVM as a backend

zcaudate · March 14, 2022, 4:47am

It seems you really like perl - respect.

I think that the transpiler for cljs is its best part. The issue I have with cljs is with the Google Closure dependency as well as the language mismatch (i.e. JS has both undefined and null, and it’s intrinsically async). It’s not the transpiler. But if we follow that logic, NQP and MoarVM are also a moving target because you’re going to be writing to target Perl. You’d be better off targeting LLVM IR which is what emacs is doing.

The point I was making was about metaVMs, with RPython as the comparison. Even though the claim is that you write a compiler and it can target everything, the reality is that it adds a tremendous layer of complexity. RPython was written to build a runtime faster than CPython. The end result is PyPy and it’s great. There is also a Clojure implementation written in RPython called Pixie. But RPython is too much like Python, just as NQP is too much like Perl, and the idiosyncrasies of the parent language can leak into the metaVM implementation (e.g. Python dicts throw an error when accessing an unknown key instead of returning nil). So I’m just suspicious of this approach. I’m not familiar at all with Perl but your initial ask to write Clojure in NQP will have the same issues as Pixie.

catdog · March 14, 2022, 6:43am

MoarVM bytecode is a moving target, but Raku itself released a stable version called 6.c. It reached stability. NQP is going to be more stable than Raku because Raku can change syntax outwardly without changing semantics. If you want to target MoarVM, NQP is the most stable target.

JavaScript was created in a rush and had to add ES6 modules on top later. The introduction of npm modules and then ES6 modules broke ClojureScript and the Google Closure compiler. Unlike JavaScript, which was rushed into web browsers by big tech, Raku had two decades to get everything right with its own virtual machine. NQP probably is not going to introduce a lot of breaking changes from now on.

Python has its own virtual machine, but Python virtual machine is not separate from Python. MoarVM was designed for Raku, but it is separate from Raku so that other languages can target it. MoarVM is actually designed from the ground up to host multiple languages. Lua and Python weren’t designed to host other languages. That’s why I prefer MoarVM.

I liked pixie, but it’s dead. Hy language displaced Pixie. When Hy language reaches stability, I might try it. Hy language resembles clojure and retains Python semantics except let construct.

A clojure-like lisp language on MoarVM may not have big problems if it doesn’t try to deviate from Raku semantics. PureScript is a Haskell-like language that retains javascript semantics and has a lot less problems with javascript environments than GHCJS does.

zcaudate · March 14, 2022, 8:36am

Honestly… I doubt it’s going to happen unless you do it yourself. Which I strongly recommend attempting because it’s not as hard as you might think. The hardest parts in my opinion are the reader and the immutable data structures, but there are plenty of reference implementations - Pixie being one. Since you’re already familiar with Perl, it should be easier for you to write it than anyone else on this site.

zcaudate · March 14, 2022, 9:16am

I feel like in the evolution of languages so to speak, lisp is the primordial ooze. It’s just the AST.

Languages can spring out of that ooze and become fish, or birds, or mammals.
They will look back at lisp and say - wow… that’s a lot of parens, macros and useless muck they have back there.
Lispers look at those fish, birds, mammals and say these languages will never evolve ever again. Sometimes it’s the truth, sometimes it’s just out of sheer laziness.

Languages are overrated in my opinion. Runtimes on the other hand are very much underrated. JS is what it is today because of the browser, not the other way round. It’s a similar story with Clojure and the JVM.

So if you are betting on MoarVM, then do it. Don’t let a current language stop you from inventing a new one.

catdog · March 14, 2022, 9:48am

Lisp macros are miles better than TemplateHaskell and C macros.

My niche is application development. I’m okay with writing applications in Raku. I’m just saying that MoarVM is a juicy compilation target for language developers. Maybe, someone will read this thread and decide to write a lisp language on MoarVM in the future. That’s about it.

didibus · March 14, 2022, 6:02pm

Weren’t you having issues compiling the JVM even though it is bootstrapped from C?

I’m pretty sure it was intended to work as such. I mean, it is a documented feature. That’s how the CLASSPATH works. The CLASSPATH is the path to the root package folders of all dependent packages.

Maybe Gentoo did it wrong, and that’s why it seems hard to package Java libs?

How does Gentoo manage this for Python and for C ?

I’m assuming it has its own scheme, maybe using symlinks or it uses bundles?

I’m not really seeing what is complex, and I don’t see what is different for the JVM compared to other languages honestly. Maybe people are trying to make things more complex than they need to be?

A Java application looks for its dependent packages by searching, from left to right, each path defined in the CLASSPATH environment variable, or in the command line launcher’s -cp argument when one is passed (-cp overrides CLASSPATH rather than adding to it).

If I were packaging it for Gentoo this is what I would do:

First I’d set the CLASSPATH environment variable to:

lib/*:/lib/java/current/*

That means that when a Java application runs, it will look first in the relative ./lib folder for all Jars, and if it can’t find the packages in those Jars, it will then look at the Jars inside /lib/java/current folder.

Now when you install a lib from Gentoo package manager, what it would do is put the Jar for it in:

/lib/java/<libname>/<version>/libname.jar

Finally I would use symlinks to set the current global versions:

/lib/java/current would contain symlinks to all Jars that I want currently installed globally, for example:

/lib/java/current/supermath.jar -> /lib/java/supermath/2.3.0/supermath.jar

So now I’m using supermath version 2.3.0 globally.

Finally, if you install a Java application (so not a lib), what I would do is put them say in:

/app/java/<appname>/<version>/appname.jar

Each app would have an accompanying ./lib folder with symlinks to the pinned versions it needs to override:

/app/java/<appname>/<version>/appname.jar
/app/java/<appname>/<version>/lib/supermath.jar -> /lib/java/supermath/1.4.2/supermath.jar

And finally a similar current version to specify the current version of the app I want:

/app/java/current/<appname> -> /app/java/<appname>/<version>/

And I’d add /app/java/current to my PATH environment variable.

The magic is in the CLASSPATH:

lib/*:/lib/java/current/*

It will first look in the relative lib folders for Jars, and only if it doesn’t find the package in them, will it then look in the global /lib/java/current folder.
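To make that lookup order concrete, here is a minimal sketch in Clojure. It is only a model of the behaviour described above, not anything the JVM exposes: the classpath vector, the class names and the resolve-class function are all made up for illustration, and wildcard expansion is assumed to have already happened.

```clojure
;; Illustrative model only. Each classpath entry is a map with a :jar path
;; and the set of :classes that jar provides. All names are hypothetical.
(def classpath
  [{:jar "lib/supermath.jar"               :classes #{"supermath.Matrix"}}
   {:jar "/lib/java/current/supermath.jar" :classes #{"supermath.Matrix" "supermath.Stats"}}
   {:jar "/lib/java/current/fastjson.jar"  :classes #{"fastjson.Parser"}}])

(defn resolve-class
  "Returns the :jar of the first entry, scanning left to right, that
  provides class-name, or nil when no entry provides it."
  [entries class-name]
  (some (fn [{:keys [jar classes]}]
          (when (contains? classes class-name) jar))
        entries))

(resolve-class classpath "supermath.Matrix") ;; => "lib/supermath.jar"
(resolve-class classpath "supermath.Stats")  ;; => "/lib/java/current/supermath.jar"
(resolve-class classpath "missing.Thing")    ;; => nil
```

One thing the sketch makes visible: the lookup is per class, so a class that is missing from the pinned Jar in ./lib would quietly be served from the global Jar instead.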

catdog · March 15, 2022, 11:56am

Raku language just comes with user repository, site repository, vendor repository, and core repository.

With those fixed repositories, I don’t have to think about CLASSPATH. Raku repository layout is portable across many operating systems because it is baked in the language itself.

Perl was specifically optimized for UNIX system application programming. JVM wasn’t.

That’s why I’m learning Raku even though I prefer clojure for a programming language. My niche is going to be high-level system programming, command line programs, terminal user interface, and web backends.

didibus · March 15, 2022, 4:39pm

You mean that the directories Raku looks for dependencies are hard-coded and defaulted? Is that correct?

From my quick Google, it also seems like there is a RAKULIB that seems pretty similar to a CLASSPATH as well.

It also looks like it has its own repository of modules (not Linux distro based), similar to Clojars, and it uses zef as a dependency manager to install modules from it, as opposed to using the Linux package manager by default.

Anyways, I didn’t really understand what prevents Gentoo from doing what it does for C and C++ libs and others to also do for Java and Clojure. And I’m not sure how Raku is easier to integrate with Gentoo, but I’ll trust you there.

catdog · March 16, 2022, 5:58am

You probably haven’t packaged python and perl packages… I have.

I tried to package JVM modules, and making linux packages for JVM modules requires creation of a new module system that sits between JVM and linux distribution. Creating such a new module system is a big task for one individual. Even with such a module system, linux package maintainers have to understand intricacies of ant, maven, sbt, gradle, and other JVM build tools. I tried to understand gradle so that it doesn’t try to download anything from the internet during build phase, and I failed. I don’t have time to understand ant, maven, sbt, gradle, leiningen, etc, …

Packaging a python/raku/perl module for a linux distribution just requires understanding of an existing infrastructure laid out by language designers.

Raku provides install-dist.raku which I can execute to install any raku module onto an arbitrary prefix. zef is used for development purposes. install-dist.raku is used to install modules onto operating systems.

For a language to be suitable for system programming, it has to provide infrastructure for installing modules onto specific predetermined directories, and its build tools should not try to download anything from the internet.

Perl, Python, and Raku fulfill the requirements. JVM languages don’t even try.

didibus · March 17, 2022, 2:18am

That’s true, but at my work we have our own dependency manager for Java and Clojure. We don’t use Ant, Gradle, Maven, Lein, we have our own package manager. So I have experience with that. It’s tedious having to reconfigure every dependency, but it’s generally straightforward.

Like I said, it’s mostly get the source, call javac with the source folders and the -cp pointing to the jars it depends on. Then zip the compiled output in a jar file.

For Clojure, it tends to just be zip the source files in a jar and that’s all you have to do.

Then what we actually do at my work is we put all the resulting jars in one folder where the name of the jars is <libname>.<version>.jar

When we bundle an app, we do the same, the difference is it comes with a shell script as the app launcher. All the script does is call java -cp with the list of jars the app depends on.

We have our own config file to tell the package manager what something depends on. It’s just a file that lists the libname and the version.

We then store all the jars on a server. So what happens say we want to deploy an app to a machine, you specify the name of the jar and its version, and inside the jar there’ll be that config file that tells what all the other jars and their versions it has to pull down on the machine.

And that’s it.

In theory you don’t even need the jar, you could replace them by folders, but we find them convenient over folders.

When you develop, it’s the same thing, just provide the CLASSPATH you need for your tests or your REPL. You can also create a local folder and symlink to the jars you depend on, and then make your CLASSPATH that folder.

The only thing that’s challenging is figuring out the dependencies of your dependencies. But our package manager does that, you can run a command that will return the dependency closure, which is just the list of all libname and their versions transitively resolved, if there are conflicts it defaults to the higher version.
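As a rough sketch of what such a dependency closure could look like (this is not our actual package manager, just an illustration): the registry map, the library names and the function names below are hypothetical, versions are assumed to be dot-separated numbers, and conflicts resolve to the higher version as described above.

```clojure
(require '[clojure.string :as str])

(defn parse-version
  "\"2.3.0\" -> [2 3 0]. Assumes purely numeric, dot-separated versions."
  [v]
  (mapv #(Long/parseLong %) (str/split v #"\.")))

(defn compare-versions
  "Compares two version strings numerically, padding the shorter with zeros."
  [a b]
  (let [pa  (parse-version a)
        pb  (parse-version b)
        n   (max (count pa) (count pb))
        pad (fn [v] (into v (repeat (- n (count v)) 0)))]
    (compare (pad pa) (pad pb))))

(defn dependency-closure
  "Walks the registry breadth-first from roots and returns a vector of
  {:libname .. :version ..} maps, keeping the higher version on conflict.
  Simplification: dependencies pulled in by a version that later loses a
  conflict are not removed again."
  [registry roots]
  (loop [queue  (vec roots)
         chosen {}]
    (if-let [{:keys [libname version]} (first queue)]
      (let [current (get chosen libname)]
        (if (and current (not (pos? (compare-versions version current))))
          ;; An equal or higher version is already chosen: skip this one.
          (recur (subvec queue 1) chosen)
          ;; Otherwise choose it and enqueue its direct dependencies.
          (recur (into (subvec queue 1) (get registry [libname version] []))
                 (assoc chosen libname version))))
      (->> chosen
           (sort-by key)
           (mapv (fn [[l v]] {:libname l :version v}))))))

;; Hypothetical registry: [libname version] -> direct dependencies.
(def registry
  {["app" "1.0.0"]     [{:libname "supermath" :version "2.3.0"}
                        {:libname "logging"   :version "1.1.0"}]
   ["logging" "1.1.0"] [{:libname "supermath" :version "1.4.2"}]})

(dependency-closure registry [{:libname "app" :version "1.0.0"}])
;; => [{:libname "app", :version "1.0.0"}
;;     {:libname "logging", :version "1.1.0"}
;;     {:libname "supermath", :version "2.3.0"}]
```

The result is a plain sequence of libname and version pairs, which is all you need to build a CLASSPATH string.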

You can easily get the CLASSPATH from that, it’s just:

(->> deps
  (map #(str "/lib/java/" (:libname %) "." (:version %) ".jar"))
  (interpose ":")
  (reduce str))

I’m assuming that something similar would be achievable with ebuild and the Gentoo packager.

I am also curious how say Python handles the transitive dependency issue? And how it handles versions? In Python I’m only familiar with pip, and that’s pretty similar to Maven in my experience, it definitely connects to the internet and downloads dependencies from the Python repositories, how do the ebuilds work for this?

catdog · March 17, 2022, 6:33am

Something like your company’s customized build tool for JVM should have been baked in JVM ecosystem. That’s why JVM isn’t optimized for system application programming.

If JVM was trying to optimize for system applications, it would come with its own build tools and module standards for operating systems.

Don’t expect linux distribution maintainers to write such a thing. It’s not going to happen.

With perl and raku, you don’t fiddle with CLASSPATH. They come with “fixed” packaging assumptions for operating systems. POSIX operating systems just use what they have.

As far as I know, the convention is to expect only one version of each python module to be present on any system. That’s why there are PyQt4 and PyQt5 instead of just PyQt.

Python has multiple packaging standards for operating systems. Those tools are independent from pip.

My conclusion is that not every programming platform is suitable for every job. JVM isn’t suitable for system application programming. The debate is over. Don’t try to use JVM for system applications. It’s clumsy for system programming.

In theory, it’s totally possible to create such a thing, but such a thing should be baked in JVM itself. It should be written by JVM maintainers. You are free to work on it.

I just want to write system applications instead of devising a whole new packaging standard for a language before even thinking about writing applications.

You seem to want to use JVM for as many things as possible because clojure is not very useable outside JVM, but sometimes, you have to admit that JVM isn’t optimized for many niches. It has been extensively optimized for enterprise backends. Beyond enterprise backends, there are better alternatives as I have described.

I have no fealty to JVM or Oracle. I don’t need to pour hundreds of my own hours into making JVM seamlessly integrate with operating systems for system applications.

didibus · March 17, 2022, 5:27pm

We might need to agree to disagree on that one.

I’m not really sure what you mean by system programming though. If you mean writing scripts for managing an OS, I’d agree it’s not ideal as a scripting runtime. That’s why I use Babashka instead for my scripts with Clojure.

If you mean to write desktop applications, I’d disagree, JVM is an okay choice for that, look at IntelliJ, JWorkbench, Eclipse, Minecraft, Azureus, FreeMind, Spark messenger, etc. All work on Linux. I’m not sure I’d choose it personally over something web based like Electron, but it’s definitely viable.

If you mean embedded, well there are some specialized versions of JVM for that, good examples are Symbian OS and Android OS both are Java bytecode based, and all their apps are Java based and running in a Java VM.

If you mean to write command line applications for Linux, I’d also disagree, I would personally compile them to native with GraalVM, I don’t really want my command line utils to depend on other things at runtime, I prefer them as self-contained static binaries. Plus doing so gives you super fast startup, low memory usage, and you’re getting performance only outdone by C/C++/Rust.

Now if you mean it as can be easily packaged for Gentoo, well it seems I’d agree with you, while I’m not sure I understood what made it difficult to do so, clearly since you struggled, it isn’t as easy as it could be.

I think that would be for Gentoo maintainers to spend the time on. My conclusion from all this convo isn’t that JVM is harder to package, just that Gentoo maintainers don’t care for it. Naturally they seem to care about C, C++, and the traditional Linux scripting languages like Perl and Python.

The Java and Clojure maintainers have done the smart thing, they’ve created a package manager that works on all OSes and on all Linux distributions, and works identically across all of them. That way by maintaining a single package manager, you can develop and install Java and Clojure libraries and applications on all OSes and all distros, instead of having to duplicate their effort for each OS and each distro.

And similarly, I think the JVM maintainers made the right choice, that user applications will be packaged as self-contained bundles. This clearly is a better user experience, as seen by the popularity of iOS, Android, Windows and macOS as a user operating system. And the growth of similar systems for Linux desktop such as Flatpak, Snaps and AppImage.

This seems like a constraint you’ve put on yourself. You chose to limit yourself to using Gentoo’s own package manager for everything. All power to you if you want, but your statement is only true with this arbitrary constraint.

You can easily use Maven and tools.deps to start writing Java or Clojure libs, scripts and apps, and it’ll work the same no matter your OS or distro. So if you ever switched to Ubuntu, RHEL, Debian, openSUSE, you won’t have to re-learn anything or change your workflow, or your package config.

And if you want to distribute your lib to other OSes and distros? Instead of having to package it for Gentoo, then Ubuntu, then Debian, then openSUSE, then RHEL, then macOS, then Windows, etc., you just package it for Maven (if a lib) and you’re done, or you bundle it (if for an app) and you’re done.

catdog · March 18, 2022, 8:14am

You don’t package a module for various operating systems. If you make it easy to make an OS package, it would take less than 10 minutes for an OS package maintainer to make a package out of it. In case of haskell packages, there is even a program that automates creation of gentoo packages out of haskell modules.

MoarVM has zef which is even better than all JVM build systems and all python build systems.
MoarVM module system is the smartest thing there is.

MoarVM can handle multiple versions of one module and one module written by different authors without recursive module directories like node_modules. MoarVM can handle installing modules via zef for development purposes without integration with an operating system. If MoarVM modules need to integrate with operating systems, install-dist.raku is used.

Personally, I use things like pip and zef and haskell stack only for development purposes. Bundles don’t really integrate well with specific operating systems.

I argue that MoarVM module system is what all python build systems and all JVM build systems should have been in the first place. MoarVM module system beats perl module system which is still better than python build system and JVM build systems.

I personally don’t want JVM build systems as substitutes for proper OS packages because they try to homogenize operating systems. I want to respect and explore different ways that operating systems deal with packages. Nix and Guix are still better and more powerful than JVM build systems in many ways if you want cross-platform packages.

Good luck with any JVM build system adapting to each operating system’s build configuration. If a build system doesn’t adapt to an operating system’s build configurations, programs are going to have glitches. I have experienced annoying glitches due to lack of integration with operating systems.

Development build tools like ant and maven and gradle and leiningen aren’t going to integrate seamlessly with operating systems for deployments of any non-trivial scale. There will be glitches with things like Gtk and Qt and Pipewire because JVM build tools don’t account for presence or absence of those things.

JVM build systems aren’t going to be as advanced as gentoo’s portage build system or GNU Guix. Don’t try to make a tool do everything by trying to homogenize deployment environments. I want to explore cutting-edge package systems like GNU Guix without being dragged down by JVM build systems.

I choose operating systems and programming languages partly because I like their packaging systems. You haven’t tried to install programs that depend on libraries written in other languages. JVM build tools aren’t designed to handle installation of dependencies written in various other languages or integrate with packages installed by an operating system.

Language build tools and operating system packaging tools have their places.

If you don’t care about integration with operating system environment, then you are probably just trying to bulldoze and homogenize operating systems. Big tech companies like google and apple may really want to bulldoze linux distributions so that there is really no choice between linux distributions. That’s a way to introduce various vendor lock-ins.

didibus · March 18, 2022, 5:49pm

I see this is your opinion, but I’ve been trying to understand why you have this opinion, what about MoarVM modules is better and simpler than Java’s module system, and I have not been able to understand that from our conversation.

From my quick Google on zef, it seems relatively standard and doing nothing special that I could find. As for install-dist.raku, I wasn’t able to find any info about it. And I also couldn’t really find any details of MoarVM’s module system, how it loads things, where it expects them, how it handles versions and version conflicts, etc.

Similarly, I haven’t been able to understand what about the Java module system you think is bad? I understand you found it difficult to package Java libs and apps as Gentoo packages, but why? What about it was difficult? And in what ways are the same challenges not faced by other languages?

There’s pros and cons of allowing multiple versions to coexist at runtime in your application.

One pro is that you seemingly feel like you can just let each dependency depend on the exact type and versions it cares about and you don’t have to deal with conflicts as such.

The other pro I know, is if you want to load things at runtime, that were not specified at start time, like let the user add more modules at runtime, it can also handle version conflict at runtime. This use case is normally why in JVM world people would use the OSGi module system instead, since it supports loading multiple versions at runtime.

One con is that it can make it slower or more difficult to handle security issues, because people won’t be as quick to upgrade to the latest versions. So chances are if you force a newer dependency for your dependency it might not work. Normally you’d complain to the dependency to upgrade before a security risk is found, simply because you might want to use a newer version. But if you allow multiple versions at runtime, you wouldn’t complain, since they’d not need to upgrade for you to use the newer version. But if the old dependency they use introduces a risk, you have less time to react. A good example is the Log4Shell issue that happened recently. It was already a mess, but if all apps used multiple versions of log4j it would have been even more of a mess.

Another con is that it grows the size of bundles and total packages that need to be installed on your machine, since it’ll tend towards needing multiple versions of multiple libs.

The last con I know is that it hurts interop and serialization. Basically if the module that depends on the old version wants to interact with your code or some other module that depends on the new version, the type of object is actually different, even though they have the same name, because they are from a different version, which means you need to make the runtime most likely treat them as such, so an interface that is @3.5 version can’t be derived by a type that is @2.6 for example. So you need a form of strong encapsulation, you can’t pass a type of an old version to another module or the main app, because they might not be able to use it if they depend on a newer or older version of it instead. So basically you can’t have leaky abstractions or they’ll start leaking possibly incompatible versions. It can get tricky to identify what won’t work at runtime.

To put it into perspective, say you depend on core.async, Channels are going to be something you now rely on to move data around. Say you use a newer version of core.async, and for some reason the Channels in that version are not compatible with Channels in the old one. Now if you use a lib that depends on an older core.async, and the function in that lib takes a Channel@0.4 and returns a Channel@0.4, but you have a Channel@1.2, you might get a runtime exception when calling it.
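A toy sketch of that failure mode, in Clojure. This is not how core.async or any real module system represents versions: a channel here is just a map tagged with a version string, and make-channel and old-lib-put are hypothetical names. In a real runtime the mismatch would show up as a class or protocol incompatibility rather than a tag check.

```clojure
;; Toy model: a "channel" is a map tagged with the library version that
;; created it. The tag stands in for a real type incompatibility.
(defn make-channel [version]
  {:type :channel :version version :items []})

(defn old-lib-put
  "Stands in for a function from a lib built against Channel@0.4.
  Accepts only channels tagged 0.4 and throws otherwise."
  [ch item]
  (if (= "0.4" (:version ch))
    (update ch :items conj item)
    (throw (ex-info "Incompatible Channel version"
                    {:expected "0.4" :actual (:version ch)}))))

;; Works: the caller and the lib agree on the version.
(old-lib-put (make-channel "0.4") :msg)
;; => {:type :channel, :version "0.4", :items [:msg]}

;; Fails at runtime: the caller hands over a Channel@1.2.
(try
  (old-lib-put (make-channel "1.2") :msg)
  (catch clojure.lang.ExceptionInfo e
    (ex-data e)))
;; => {:expected "0.4", :actual "1.2"}
```

The point is only that nothing flags the problem until the call actually happens.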

I don’t keep replying because I disagree or am stubborn, but because you make a claim that attracts my attention, and I really want to understand what is so great about MoarVM modules system, what does it do differently, and how is it better.

That’s why I keep asking, okay, in what ways is it better? How does it do things differently?

They’re not designed to install libs of different languages for use by other languages. But you can in fact use Maven to package dependencies that you depend on from other languages. Some jars will contain compiled C binaries sometimes when needed for example. But yes, it’s not going to install those following the convention of whatever distro you’re using, it will be its own thing again, only for the purpose of exposing it on the LD_LIBRARY_PATH of Java so your application has access to it.

In other cases, the JVM will even run other languages, like how Jython and JRuby can run Python and Ruby code without needing to install Python or Ruby.

Finally, for these types of use cases, you’re right, something like Nix, Guix or containers like Docker are much better, if you need a set of multi-language and complicated resources and setup to be installed for some app to work for example. But then you’re bypassing Gentoo again.

catdog · March 19, 2022, 5:31am

Because you have to experience and be shown. You can’t understand something by just reading some words. You are trying to see colors as a blind person.

It’s particularly easy to integrate raku module system with linux packaging systems. I integrated raku module system with gentoo package system very quickly with very little knowledge of raku language.

The integration was a lot simpler than integrating Go, Rust, C, C++, python, and even perl with linux packaging system. I wrote gentoo linux packages for Go programs, Python programs, Rust programs, Haskell programs, C/C++ programs, janet programs, and programs written in other languages. Raku offers a simple but powerful interface for integration with packaging systems. Raku language itself is respectable, too. In my direct experiences, raku and janet offer the simplest interface for integration with packaging systems. But, janet’s module system is a bit too primitive.

On the other hand, java packaging system for gentoo linux is complex and fragile. It makes my head hurt to even try to understand java packaging system on gentoo linux.

As I wrote, you have to experience and be shown before being able to grasp what I’m writing here.

The convention in raku ecosystem is to just specify minimum version requirement. Specifying lower version limit and upper version limit is a convention in haskell ecosystem. Specifying exact versions is a requirement in Go and Rust.

Most of the time, there will only be one version of each module except for a few things.

Write raku modules. Try to create linux packages for raku modules. You will understand.

I have used gentoo linux and other linux distributions. I plan to try GNU Guix System. I try as much as possible to respect each operating system’s packaging system. When I use arch linux, I respect pacman. When I use GNU Guix System, I respect GNU Guix.

I respect packaging systems so much I create packages for operating systems.

pieterbreed · March 19, 2022, 8:24am

Maybe some examples are in order.

catdog · March 20, 2022, 2:25am

I’m not sure that I want to give examples because I don’t know whether you are genuinely interested or are prompting me to give answers so that you can defend JVM. I can already imagine you saying “But, JVM does this or that… Why is JVM any worse?”

I can say from my experience that JVM and JavaScript platforms don’t try to integrate with operating system packaging standards.

Perl was designed from the ground up to integrate with POSIX operating system packages. Raku integrates with them even better.

Unless you actually try to create linux packages for programs written in various languages, you aren’t going to truly grasp what I’m writing here.

I’m done trying to convince people here. I’ve made my decision to migrate to MoarVM ecosystem. I’m not going to miss JVM. I don’t have JVM on my system. MoarVM has everything I need for system applications.

Richard_Heller · March 21, 2022, 10:56pm

what, exactly, did you expect the people here to do? they’re not the maintainers of clojure. and yet, you’re trying to convince them to port it to a VM whose creators have explicitly stated that it’s not intended to be a target VM for languages other than raku. and your sole argument is that it’ll make it easier to deploy to one, specific OS? so, what were you trying to accomplish here?

catdog · March 22, 2022, 4:39am

I haven’t tried to actually convince people to do anything. I was just offering people a fun mind puzzle. Do you not have any capacity to imagine Clojure on other platforms without actually doing anything? Why can’t you just have fun thinking about different possibilities?

If I wanted people to do anything, it’s to look into Raku and MoarVM. I’m learning Raku, and it’s a cool language.

I don’t think that’s what the creators of MoarVM said. If it wasn’t intended to be a target for other languages, there would be no need for a clear separation between MoarVM and Raku. Any language targeting MoarVM would have to be written in NQP and observe Raku’s core semantics. It’s a Raku-centric VM that welcomes languages that embrace Raku’s core semantics. JVM is similar. Any language targeting JVM has to embrace Java’s core semantics. All virtual machines that I know have one main language.

No, my argument is that raku integrates very well with all POSIX or UNIX-like operating systems.

pieterbreed · March 22, 2022, 11:03am

Not correct. Examples:

JRuby
Jython
clojure
kotlin
groovy

I’ve seen at least two projects for Perl implementations on the JVM. They’re not maintained, but I think it proves the point that the JVM is extremely tolerant of semantics different from Java’s.

Dude, what?! I’ve encountered Gentoo’s reluctance to support JVM projects and am interested in understanding why this is the case. Examples would support the point you are trying to make and create shared understanding.

As for “defending JVM”: the JVM is a capable and realistic technical platform for software projects. If I as a developer wanted to target Gentoo and I had to choose between porting Clojure to another VM or making a Gentoo package for a jar, I’d choose the latter. Learning about the issues from somebody more knowledgeable (maybe like yourself) is of benefit to me and the wider community.