Should Linux distributions ship Clojure byte-compiled (AOT) or not?

That wasn’t my question. All the things you listed are real issues and guix seems like a solid solution.

That’s exactly the point. All Clojure tools know this, not just Leiningen. So, guix is not required for this part, since that is already a solved problem. Which is why my question was specifically about libraries, not tools.

Because this is simply undesirable in my view. Relying on system installs of specific libraries leads to “works on my machine” type of problems. In Clojure, the only expectation of your OS is that Java and your tool of choice (e.g. clj-tools) are present. Beyond that, the tooling takes care of the rest in a very reliable, OS-independent way.

Absolutely useful to solve the clj-tools+Java and whatever other tool part via guix, no question there.

Taking libraries from their origin maven repository, AOT (or not) compiling them, putting them in some other registry is the questionable part. I honestly don’t see how you’d even do that. In the end you’d just be replicating what the tooling can already do, in a worse way, leading to an almost guaranteed out-of-date and/or incomplete mirror. CLJSJS is a good history lesson on where that leads.

This isn’t how Guix works, and I agree that would be very questionable if it did. Guix is a from-source distro, and primarily distributes package definitions. It builds everything from source, and never downloads from any Maven repository, because those contain the output of external build processes.

Restating the question:

It allows the complete environment needed to run a piece of software to be managed and replicated in a uniform way, with one tool. Value is subjective: if you want those features, it’s a lot of value, and if you don’t, it’s probably net negative. Both positions are valid. I encourage everyone to use the tools they prefer.

Exactly. Clojure tools solve the problem for Clojure. Python tools solve the problem for Python. NodeJS tools solve the problem for NodeJS. Ruby tools solve it for Ruby. Rust tools solve it for Rust. The platforms they target have no solution, so languages must bring their own.

Guix does have a general solution, so it’s idiomatic in Guix to have Guix manage everything, and an antipattern to use language-specific tooling to install software. Just as it’s idiomatic on other platforms to do the opposite.

This isn’t about requirements, but preferences. If you opt in to Guix, you accept the cost of packaging everything as worth it to get the benefit of uniform management of the complete environment with one tool, and the downstream features that enables.

This workflow is atypical on Guix, perhaps happening in the early stages of bringing up a new project. Whereas the typical Clojure developer will see dependencies downloaded and think, “well, that’s sorted then,” the typical Guix user will see that and think, “oh, I need to package those.”

Guix packages are built in isolated environments with no network access, so no tool downloads anything. Anything a Guix package needs must also be in a Guix package. It’s Guix packages all the way down.
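
“Packages all the way down” means the packaging work for a new project is the transitive closure of its dependencies, minus whatever Guix already has. A small Python sketch of that calculation; the dependency graph, the package names, and the function name are invented for illustration and don’t reflect any real Guix API:

```python
def unpackaged_dependencies(root, depends_on, packaged):
    """Return the transitive dependencies of `root` that are not yet packaged.

    `depends_on` maps a name to its direct dependencies; `packaged` is the
    set of names that already have a package definition.
    """
    missing = set()
    seen = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        for dep in depends_on.get(name, ()):
            if dep not in packaged:
                missing.add(dep)
            # Keep walking: a packaged dependency can still pull in unpackaged ones.
            stack.append(dep)
    return sorted(missing)


# Hypothetical project: app -> lib-a -> lib-c, and app -> lib-b (already packaged).
deps = {"app": ["lib-a", "lib-b"], "lib-a": ["lib-c"]}
print(unpackaged_dependencies("app", deps, packaged={"lib-b"}))  # ['lib-a', 'lib-c']
```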

Guix doesn’t really have the same notion of system installs in the way other OSs/distros do, so this type of problem largely doesn’t exist. Guix packages declare what they need to build and run (which can include specific versions), and they get exactly what they ask for, no more, no less. The needs of other packages are invisible to it, because they’re outside its scope.

“Works on my machine” problems are greatly diminished when you have one tool which manages the complete environment, as Guix does. Language-specific tooling doesn’t solve this; it requires a higher-level approach. That could be Ansible, or asdf, or containerizing, or using Guix. All are valid approaches, and which you choose depends on your preference.

Out-of-date versions are absolutely a problem. Guix should do better there. Zero argument on this point.

In any case, probably nobody here has either the desire or ability to change the broad strokes of how these ecosystems work, and that’s fine; I just wanted to give some context around why Guix “does it weird,” in the interest of increasing understanding and finding common ground.

Bringing this back around to the main topic:

  • I’m in favor of Guix shipping package definitions for Clojure libraries.
  • I agree Guix shouldn’t AOT Clojure libraries.
    • The Clojure compiler’s nondeterministic output seems like a compelling argument for why this should be disabled in the context of Guix package definitions.
    • I’m still curious about the REPL usecase for mixing AOT and non-AOT code.
  • I agree Guix should probably AOT things that are directly executable, like leiningen.
  • I’m also curious how Nix handles this (if it does at all). Guix is derived from Nix, so helpful context is often found in their approach.

Given that Clojure itself can dynamically download and add new JVM dependencies while running the REPL (even beyond what any given project declares in its deps.edn file) – and that mechanism relies on the same local ~/.m2 cache setup that mvn uses – and that “Clojure libraries” is potentially everything on both Maven Central and Clojars – all versions, all libraries – I still don’t see how it is practical for Guix to even attempt to provide that?

Ok, that sounds even worse. So, instead of working with a “standard package format”, you are going to hunt down X source repositories, figure out their build setups, create a custom build to match guix expectations, then hunt down their Y dependencies, figure out their build steps and repeat? Or worse, anyone using any dependency has to do that themselves?

:person_shrugging:

Good luck. You are welcome to do all of that.

We’re far afield here, so I’ll say: it’s not a goal for Guix to package every Clojure library, just as it’s not a goal to package every Python or Rust or C library. If you need something packaged, you package it.

The usecase of using tools.deps to develop software locally which isn’t packaged for Guix is distinct from the requirements of software which is packaged. In the former case, you can do whatever you want. In the latter, you must package your dependencies, the same as for any other Linux distro.

You’re not really correct, but I think I’m done trying to explain. They’re different tools with different goals, which require different approaches to meet them, and both are valid. If you don’t like how you think Guix works, you don’t have to use it.

I think the net consensus here is that it’s fine for Guix to package the Clojure CLI tooling (as long as it is kept up to date with the releases listed on the “Clojure - Tools Releases” page) but it really makes no sense to package Clojure libraries – so the question of “AOT or not” can be limited to what’s in the Clojure CLI tooling itself.

At work, we vendor in a subset of the CLI installation (we omit the man page) so the AOT question applies to the contents of just two JAR files:

build/clojure/ # this just happens to be where we keep the CLI in our repo
├── bin
│   ├── clj
│   └── clojure
├── deps.edn
├── example-deps.edn
├── libexec
│   ├── clojure-tools-1.11.1.1435.jar
│   └── exec.jar
└── tools.edn

Of those two JAR files, exec.jar is a tiny library in source-only form and clojure-tools-*.jar is an “application” (what we in the Clojure community call an “uberjar”). The latter is a mix of AOT’d code and source code – but I think the source code that is included is “just” the source of the AOT’d code (because it’s fairly common in the Clojure world to still ship source in JAR files even when the compiled .class files are included).
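
One way to check which of these forms a given JAR takes is to look at whether it holds Clojure source entries, .class entries, or both. A minimal sketch in Python using only the standard zipfile module; the function names and the labels it returns are made up for illustration and aren’t part of any Clojure tooling:

```python
import io
import zipfile

SOURCE_SUFFIXES = (".clj", ".cljc")  # Clojure source files usable on the JVM


def classify_jar(jar):
    """Return 'source-only', 'classes-only', 'mixed', or 'empty' for a JAR.

    `jar` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(jar) as zf:
        names = zf.namelist()
    has_source = any(n.endswith(SOURCE_SUFFIXES) for n in names)
    has_classes = any(n.endswith(".class") for n in names)
    if has_source and has_classes:
        return "mixed"
    if has_source:
        return "source-only"
    if has_classes:
        return "classes-only"
    return "empty"


def make_jar(entry_names):
    """Build an in-memory JAR (a plain zip) with empty entries, for demonstration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in entry_names:
            zf.writestr(name, b"")
    buf.seek(0)
    return buf
```

Note that “classes-only” says nothing about where the .class files came from: they could be AOT-compiled Clojure or ordinary compiled Java.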

The clojure-tools-*.jar file includes a specific version of Clojure itself, in compiled form, which matches the version stem of the tools JAR (so, Clojure 1.11.1 in this case), and that will match the version declared in the deps.edn file there (although, strictly speaking, it doesn’t have to).

I think that exec.jar must stay in source-only form since it is used with whatever version of Clojure the user declares they want to use – so you need to avoid any possible AOT version conflicts between that user-specified Clojure version and what you might be tempted to AOT the exec source with.

To package the Clojure CLI from source, you would need the source of Clojure itself and whatever libraries are shipped in the clojure-tools-*.jar file. Clojure (from 1.9 onward) depends on Spec (clojure/spec) and the Core Specs (clojure/core.specs.alpha on GitHub: specs to describe Clojure core macros and functions) – and the build process uses Maven (mvn) and assumes those dependencies will be downloaded from Maven Central (via mvn itself), so I’m not sure what that looks like in a Guix world where “building” happens in isolation from the Internet.

It sounds like you’d need to “package” those libraries (i.e., everything the clojure-tools-*.jar needed) but in order to AOT them, you’d need an existing Clojure version built already – which is a bit of a circular dependency. I assume the Guix folks have already tackled that since there’s an older version of the clj-tools already packaged?

Agree about the limitations of the scope, but noting that this is something that’d have to change if other Clojure applications got packaged for Guix. There don’t seem to be many of those, so this could be a non-issue, but should someone want to package a Clojure application for Guix, it’s a problem that’d need to be solved. And I think it’s clear that, whatever approach is taken, that would mean not AOTing Guix-packaged Clojure libraries.

I wasn’t sure either, so I went and took a look. It looks like the CLI tooling (the clj and clojure wrapper scripts) isn’t packaged at all. I vaguely remember a mailing list discussion about this, but I’m not sure why that’s the case. The state of Clojure stuff in Guix definitely doesn’t seem like it’s where it needs to be right now — hence this discussion.

Clojure is bootstrapped by creating anonymous origins[1] for the source of the libraries it depends on, putting those into the clojure package’s inputs[2], and copying their source into the Clojure source tree. Then it patches the Ant build.xml to include them in the compile-clojure target’s args, which compiles everything and builds the JAR.

This strikes me as “not great,” but also, within the realm of hackery that seems to be necessary to bootstrap 100% from source, which is a hard requirement for everything in Guix. Other languages have much more complicated stories for this.

[1]: “Origin” is the Guix structure representing (usually) a source tarball downloaded from somewhere, along with its hash — plenty of downloading happens during Guix builds, but only via the Guix tooling, before the package’s build steps execute. “Anonymous origin” means they’re not part of a standalone, first-class package, and aren’t visible outside the clojure package build process.

[2]: “input” is mostly equivalent to a package dependency in other package managers. Guix packages can depend on raw source, output from other builds, or a mix of the two, as is the case here.

Agreed. (Open source) Clojure applications are fairly rare. Even most of the tooling out there that might look like applications are often intended to be run as part of a Clojure REPL or similar process and so they are also typically source-based libraries (as opposed to tooling in many other language stacks that actually are standalone applications).

Interesting… So, it essentially treats those “Clojure libraries” etc as local source code instead, if I’m understanding you correctly? (and it duplicates/mimics the dependency resolution upfront that would otherwise take place during the package build)

The last piece that I’m still having trouble understanding about the way Guix approaches the JVM ecosystem: given that Java developers typically rely on mvn or gradle or some specific, existing build system that does dependency resolution against a remote library repository, how does Guix packaging JVM libraries fit into that? Those tools know where to look (Maven Central by default, or specific repositories via configuration) – how would those standard build processes utilize Guix-packaged libraries at all?

Yes, exactly.

This isn’t something I have deep knowledge about, so take it with a grain of salt; but the Guix documentation says that the maven-build-system creates a temporary local Maven repo and populates it with the JARs of the package’s dependencies, then runs Maven in offline mode to build artifact(s) for the package. It also has some facilities to remove or replace dependencies by rewriting pom.xml files, or to add Maven plugins. I’m not sure of the rationale for those; it looks like maybe only one package uses them.
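
To make the “temporary local Maven repo” idea concrete: Maven’s default repository layout stores each artifact at groupId (with dots turned into directories) / artifactId / version / artifactId-version.jar, so populating a local repo amounts to copying JARs to those paths. A rough Python sketch of that layout only; `populate_local_repo` and its input shape are hypothetical, and this is not how maven-build-system is actually implemented:

```python
import shutil
from pathlib import Path


def maven_repo_path(group_id, artifact_id, version):
    """Relative path of a JAR under Maven's default repository layout."""
    return (
        Path(*group_id.split("."))
        / artifact_id
        / version
        / f"{artifact_id}-{version}.jar"
    )


def populate_local_repo(repo_root, jars):
    """Copy already-built JARs into a local repository directory.

    `jars` maps (group_id, artifact_id, version) to the path of a built JAR.
    Hypothetical helper, for illustration only.
    """
    repo_root = Path(repo_root)
    for coords, jar_path in jars.items():
        target = repo_root / maven_repo_path(*coords)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(jar_path, target)
    return repo_root
```

Maven can then be pointed at such a directory with `-Dmaven.repo.local=<dir>` and run with `--offline`. A real repository would normally also hold the matching .pom files, which this sketch leaves out.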

Thanks for the good discussion! Also, I’m a big fan of next.jdbc, so thank you for that as well.

Sorry for the late reply, have been out of town. I don’t think Linux distributions should be recreating published jars, especially using a different build/assembly process - that has always seemed wrong and confusing to me. I would love to be doing more in the Clojure ecosystem to make published libraries verifiable via key infrastructure/provenance/signing, which should be impossible for anyone but the library creator to make (if it is to have any value).

Re AOT and the ABI compatibility etc, I do want to add some precision to this discussion. Clojure compilation is done by some version of the Clojure compiler, with access to that version of the Clojure core libs. The emitted bytecode will (for the last several Clojure versions) be Java 8 era bytecode and will contain calls into the Clojure runtime (particularly the Reflector, RT, and Util classes). This code cannot be run on JVM < version 8 (just like Java code compiled to Java 8 bytecode cannot run on Java 7).
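
The “Java 8 era bytecode” part can be checked directly: every .class file starts with the magic number 0xCAFEBABE followed by a minor and a major version, and major version 52 corresponds to Java 8. A minimal Python sketch that reads that header; the function names are illustrative, and it ignores details such as preview-feature minor versions:

```python
import struct

CLASS_MAGIC = 0xCAFEBABE


def class_file_version(data):
    """Return (major, minor) from the first 8 bytes of a .class file."""
    if len(data) < 8:
        raise ValueError("too short to be a class file")
    # Big-endian: u4 magic, u2 minor_version, u2 major_version.
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != CLASS_MAGIC:
        raise ValueError("not a class file (bad magic number)")
    return major, minor


def minimum_java_release(major):
    """Map a class-file major version to the oldest Java release that accepts it.

    Major versions have gone up by one per release, with 52 for Java 8.
    """
    return major - 44


def runs_on(data, jvm_release):
    """True if a JVM of the given release accepts this class-file version."""
    major, _minor = class_file_version(data)
    return minimum_java_release(major) <= jvm_release
```

For a header with major version 52, `runs_on(header, 8)` is true and `runs_on(header, 7)` is false, which is the Java 7 example above.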

Over time, as features are added to Clojure, there may be new methods added to the Clojure runtime called by the emitted code. We are disciplined about not removing or changing existing internal methods, so compiling with Clojure version X should produce bytecode that will run on Clojure version X+N. But an application may wish to run with an older Clojure runtime, and that may not work. Compiling libs with AOT necessarily makes a choice of minimum Clojure version for later application developers as they cannot run with an older version than was used at packaging time. There is no difference here from Java - jars compiled to a particular bytecode version cannot run on an older JVM. What is different is the ability to target a particular bytecode (and core libs now!) version from a newer Java compiler - having this support takes a lot of effort which the Java team has resources for, but we do not.

So to address the original question - the ABI is NOT “unstable”; it is “stable but evolves in additive ways”. This makes it backward-compatible (new runtime can run old binary), but not forward-compatible (old runtime cannot run new binary). This is not unusual. Java libs make these same choices when they decide what Java version to build with (both in bytecode version and in JDK library API calls). Clojure is more flexible in that it is a source-first language - precompiling all binaries takes that flexibility away.
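
That “backward-compatible but not forward-compatible” rule boils down to one comparison: AOT-compiled code is expected to run when the runtime Clojure version is at least the version it was compiled with. A toy Python model of the rule; it only handles plain dotted numeric versions such as 1.11.1 (no qualifiers like -alpha1), it treats “may not work” as “not supported”, and it states the policy described here rather than anything the Clojure runtime actually checks:

```python
def parse_version(version):
    """Turn '1.11.1' into (1, 11, 1) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def aot_code_can_run(compiled_with, runtime):
    """AOT output from Clojure X is expected to run on Clojure X+N, not on older versions."""
    return parse_version(runtime) >= parse_version(compiled_with)


def minimum_runtime(compiled_with_versions):
    """Oldest Clojure an application can use if it includes these AOT-compiled libraries.

    `compiled_with_versions` lists the Clojure version each library was compiled with.
    """
    return max(compiled_with_versions, key=parse_version)
```

So an application pulling in libraries AOT-compiled with 1.9.0, 1.10.3, and 1.11.1 would be pinned to a minimum of 1.11.1, whereas source-only libraries would impose no such floor.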

Alex - thanks for commenting and adding precision to the discussion - appreciate a member of the Clojure core team taking the time:

Thank you, that’s clear.

Understood - appreciate you clarifying my ‘rough’ wording! That’s perfectly clear.