Introducing calcit-js: toy language inspired by cljs

Although I don’t use parentheses in calcit, it’s still mostly inspired by ClojureScript. The project is called calcit-runner since at first it was only designed to be an interpreter; now it also emits JavaScript.

Some demos

To give a demo, there’s a Linux binary called cr_once released at http://bin.calcit-lang.org/ , which can be run like:

=>> cr_once -e:"range 10"
Running calcit runner(0.2.34) in CI mode
(0 1 2 3 4 5 6 7 8 9)
=>> cr_once -e:"->> (range 4) $ map $ fn (x) (* x x)"
Running calcit runner(0.2.34) in CI mode
(0 1 4 9)

You may find the functions and macros familiar if you look at the docs at http://apis.calcit-lang.org/ . The docs site itself is built with calcit-js, at roughly 5x the cost compared with its ClojureScript version.

A more complicated demo is to run a script. Calcit uses a snapshot file rather than “files of source”: a single snapshot contains all the namespaces of the package, written in an indentation-based syntax:

{} (:package |app)
  :configs $ {} (:init-fn |app.main/main!) (:reload-fn |app.main/reload!)
  :files $ {}
    |app.main $ {}
      :ns $ quote
        ns app.main $ :require
      :defs $ {}
        |main! $ quote
          defn main! ()
            println "\"Loaded program!"
            ; try-fibo
            echo $ sieve-primes ([] 2 3 5 7 11 13) 17 400

        |reload! $ quote
          defn reload! () nil

        |sieve-primes $ quote
          defn sieve-primes (acc n limit)
            if (&> n limit) acc $ if
              every?
                fn (m)
                  &> (mod n m) 0
                , acc
              recur (conj acc n) (inc n) (, limit)
              recur acc (inc n) limit
=>> cr_once fibo.cirru
Running calcit runner(0.2.34) in CI mode
Runner: specifying filesfibo.cirru
Calcit runner version: 0.2.34
Loaded program!
(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 379 383 389 397)

calcit-runner is written in Nim, which compiles to C, so it boots quite fast:

=>> time cr_once fibo.cirru
Running calcit runner(0.2.34) in CI mode
Runner: specifying filesfibo.cirru
Calcit runner version: 0.2.34
Loaded program!
(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 379 383 389 397)

real	0m0.122s
user	0m0.109s
sys	0m0.010s

By default it uses the file called compact.cirru which is emitted from calcit-editor.

And to emit js:

cr_once --emit-js fibo.cirru

The code generated from the command line looks like this (I formatted it manually with Prettier):

import * as $calcit from "./calcit.core";

export let sieve_DASH_primes = $calcit.wrapTailCall(function sieve_DASH_primes(
  acc,
  n,
  limit
) {
  return $calcit._AND__GT_(n, limit)
    ? acc
    : $calcit.every_QUES_(function f_PCT_(m) {
        return $calcit._AND__GT_($calcit.mod(n, m), 0.0);
      }, acc)
    ? $calcit.recur($calcit.conj(acc, n), $calcit.inc(n), limit)
    : $calcit.recur(acc, $calcit.inc(n), limit);
});
export function main_BANG_() {
  $calcit.println("Loaded program!");
  /* (try-fibo) */ null;
  return $calcit.echo(
    sieve_DASH_primes(
      $calcit._LIST_(2.0, 3.0, 5.0, 7.0, 11.0, 13.0),
      17.0,
      400.0
    )
  );
}
export function reload_BANG_() {
  return null;
}

Also notice that an extra calcit.core.js is generated containing some core functions. And an extra package @calcit/procs is required to run the code.

You may be familiar with macros like ->>:

defmacro ->> (base & xs)
  if (empty? xs)
    quote-replace ~base
    &let
      x0 (first xs)
      if (list? x0)
        recur (append x0 base) & (rest xs)
        recur ([] x0 base) & (rest xs)
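
To make the expansion concrete, here is a rough TypeScript sketch of what the macro above does to a syntax tree. The `Expr` type and the `threadLast` name are made up for this illustration and are not part of calcit-runner; the sketch only mirrors the two branches of the macro (append to a list form, or wrap a bare symbol into a call).

```typescript
// Hypothetical syntax-tree type for this sketch:
// a symbol is a string, a form is a nested array.
type Expr = string | Expr[];

// Mirrors the ->> macro: thread `base` into the last position
// of each following form, one form at a time.
function threadLast(base: Expr, ...xs: Expr[]): Expr {
  let acc: Expr = base;
  for (const x0 of xs) {
    // A list form gets `acc` appended as its last argument;
    // a bare symbol becomes a call with `acc` as its only argument.
    acc = Array.isArray(x0) ? [...x0, acc] : [x0, acc];
  }
  return acc;
}

// (->> (range 4) (map f) count) expands to (count (map f (range 4)))
const expanded = threadLast(["range", "4"], ["map", "f"], "count");
console.log(JSON.stringify(expanded));
```

The loop plays the role of `recur` in the macro: each step consumes one form from `xs` and carries the partially built expression along.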

The binaries are not quite ready. I develop it on macOS, and it probably requires some Nim and Webpack knowledge to run it on other platforms.

Motivations and calcit-editor

It’s called “calcit-runner” because it is based on, and works together with, my previous project Calcit Editor, which was used to generate ClojureScript:

Calcit Editor uses calcit.cirru as its snapshot file, and now supports emitting compact.cirru and .compact-inc.cirru as well. calcit-runner makes use of these 2 files. The file .compact-inc.cirru is for handling incremental changes.

While it’s cool to have ClojureScript with immutable data and macros, I do want to narrow the gap between the “immutable data”-macros combo and the JavaScript ecosystem. I learnt a lot about ClojureScript in the past 5 years, however it still bothers me in some aspects:

  • The JVM compiler starts slowly and eats lots of memory on my laptops. thheller helped a lot but the JVM has its own limitations.
  • ClojureScript inherits its semantics from Clojure, and those can’t be changed just because JavaScript has its own, always changing, ecosystem.

I’m still using ClojureScript. I just want to try more possibilities. And I always thought that in calcit.cirru I already have a lot of information about the program, so why do I have to run it by compiling to another language first? So calcit-runner was the project to run Calcit Editor’s snapshot file.

Calcit Runner features

To make it kind of compatible with ClojureScript, I need some basic features:

  • macros, then I can generate syntax like in Clojure,
  • persistent data, which is at Clojure’s core semantics,
  • hot code replacement, for fast feedback loops,
  • frequently used APIs like in ClojureScript.

And being a JavaScript programmer, there are still many features I can’t handle. So some major features you see in Clojure are missing in calcit-runner. It does not have a REPL, it does not support threads, and it has no try/catch and no async programming. For now it’s more like a calculator.

However since macros are supported, emitting JavaScript is a lot easier after all syntax sugars are expanded.

The experience of using calcit-runner is something in between shadow-cljs and webpack: I modify the code, the code gets replaced in the browser (pure functions and atom states), an after-load function is triggered, and I run snippets in the Chrome Console and use the js debugger.

Meanwhile I can also just use calcit-runner to run code on top of Nim: I save the snapshot file again, it replaces the code and triggers the after-load function as well, but I only get logs of error messages for debugging.

How calcit-js is implemented

Similar to Clojure, when calcit-runner is interpreting a syntax tree, it:

  • first runs preprocessing to resolve symbols and expand macros,
  • then turns the code into core syntax with platform functions,
  • then calls the platform functions.

Those phases can be seen in some of the files:

Roughly, preprocessing(resolving and macro expansion) and evaluating are 2 steps. So the preprocessing step can also be used to support code emitting.

One feature of calcit-runner is that, after preprocessing, the information of the program can be exported into a JSON file (“A example of `cr --emit-ir`” on GitHub). Only a demo though.

Meanwhile, the information is just ready for code generation, generating JavaScript in this case: see emit_js.nim on the master branch of calcit-lang/calcit-runner on GitHub.

Since syntaxes are de-sugared, its core is quite tiny. Although currently calcit-js lacks many features of ClojureScript or JavaScript, it’s still powerful enough for running a virtual DOM library to render simple pages, such as the APIs index above.

Currently, calcit-js only supports 2 simple rules for importing npm packages (or other js modules): :as and :refer. It does not support renaming or handling .default. It just emits files with import/export syntax and relies on an extra bundler for running.

For tail recursion, calcit-js uses an extra function for handling the Recur type at runtime, and so does calcit-runner. It works well, but compared to the code optimized by ClojureScript, which uses while(...){...}, it could be a lot slower. calcit-js has poor compiler optimizations; its let is implemented with function wrappers, so it’s far from optimized.
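
As an illustration of that approach, here is a minimal trampoline sketch in TypeScript. The names `wrapTailCall` and `recur` are borrowed from the emitted code shown earlier, but the bodies below are my own assumption of how such helpers could work, not the actual code in calcit.core.js.

```typescript
// Marker object returned by `recur`. The class name echoes the Recur type
// mentioned above, but its shape here is an assumption for illustration.
class Recur {
  constructor(public args: unknown[]) {}
}

// Build a Recur value instead of calling the function again.
function recur(...args: unknown[]): Recur {
  return new Recur(args);
}

// Wrap a function so that a returned Recur is re-dispatched in a loop,
// which keeps the JavaScript call stack flat.
function wrapTailCall<R>(
  f: (...args: any[]) => R | Recur
): (...args: any[]) => R {
  return (...args: any[]): R => {
    let result = f(...args);
    while (result instanceof Recur) {
      result = f(...result.args);
    }
    return result as R;
  };
}

// Usage: sum 1..n with an accumulator, without growing the stack.
const sumTo = wrapTailCall<number>((n: number, acc: number) =>
  n === 0 ? acc : recur(n - 1, acc + n)
);

console.log(sumTo(100000, 0)); // 5000050000
```

Every iteration allocates a Recur object and goes through an extra function call, which is one plausible reason this style is slower than a plain while loop.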

Expectation of calcit-js

Most of the time in my jobs I have to deal with npm packages, mostly with Webpack. Since React apps rely on both immutable data and packages from npm, I do want my virtual DOM library to integrate well into the JavaScript ecosystem.

This is how I want to use calcit-js:

  • it compiles to js files with ES6 import/export syntax,
  • a bundler, probably Webpack, loads the files and bundles them with npm dependencies,
  • when a file changes, the bundler performs hot module replacement as for a normal js file.

Besides Webpack, it works with Vite too, so I can debug with code that’s less complicated, since browsers can load each *.mjs file without bundling.

It may also work in Node.js without bundling since Node.js can load modules in ES6 import/export syntax. It’s not quite smooth since it currently requires .mjs extensions, and calcit-js needs to handle those file extensions specifically.

The core functions of the runtime (except calcit.core.js, which is emitted together with the program) are maintained in 2 npm packages so that the bundler can simply resolve them:

At first I thought a language with immutable data and macros could hardly be compiled to js, but WISP did it and Bucklescript did it, which proves that it’s possible. ClojureScript has too many factors preventing it from integrating into the JavaScript ecosystem compared to other compile-to-js languages.

After all, calcit-js has only explored a small area of integrating persistent data and macros into the npm ecosystem. I’m fully expecting ClojureScript to find its way to connect with ES modules as well (well, just please make the JVM a little faster in doing that).

Optimizations of persistent data in calcit-js

During the development of calcit-runner, there were several performance bottlenecks (besides my poor experience with systems programming):

  • macro expanding
  • persistent data
  • variadic arguments

Macro expanding is still slow in my current implementation. I’m not going to discuss it since it involves too many factors.

For persistent data, I posted an explanation before(related to List and Map, not Set yet):

To make it work in calcit-js, I finally turned it into a TypeScript project, trying carefully to reduce unnecessary memory usage:

It’s not very optimal but since it’s designed to be a shared list, it has some benefits:

  • just a List, not a List and a Vector. I don’t need to use into [] to convert type in my program.
  • slice, assoc and dissoc operations also share parts of the tree, which is cheaper (the bad part is that get got slower).

Some other issues are related to the “variadic arguments” feature. In Clojure, we have “variadic functions”. But in JavaScript, it’s using “arguments spreading”. And in calcit-runner, it’s more like JavaScript:

defn f1 (x0 & body)
  ; TODO x0 body

println (f1 1 2 3 4 5)

; or
def ys ([] 1 2 3 4 5)
println (f1 & ys)

While it may feel more natural to us JavaScript programmers (lack of static analysis though), it does bring performance costs. I located the costs in the Nim implementation of calcit-runner. And it also brought barriers to using List in calcit-js for arguments spreading.

So to optimize that, I decided to implement CrDataList in two modes, one with Array and one with TernaryTreeList. When a CrDataList is first initialized, it’s an Array, so it’s fast for spreading. In Array mode, get is just JavaScript array access, which is fast. And for slice, there’s virtual slicing that shares the same reference to the original Array, so it’s cheap. It’s only turned into a TernaryTreeList (which is a tree structure of persistent data) when it has to be:

class CrDataList {
  value: TernaryTreeList<CrDataValue>;
  // array mode store bare array for performance
  arrayValue: Array<CrDataValue>;
  arrayMode: boolean;
  arrayStart: number;
  arrayEnd: number;
  cachedHash: Hash;
  constructor(value: Array<CrDataValue> | TernaryTreeList<CrDataValue>) {
    if (Array.isArray(value)) {
      this.arrayMode = true;
      this.arrayValue = value;
      this.arrayStart = 0;
      this.arrayEnd = value.length;
      this.value = null;
    } else {
      this.arrayMode = false;
      this.value = value;
      this.arrayValue = [];
      this.arrayStart = null;
      this.arrayEnd = null;
    }
  }
  turnListMode() {
    if (this.arrayMode) {
      this.value = initTernaryTreeList(
        this.arrayValue.slice(this.arrayStart, this.arrayEnd)
      );
      this.arrayValue = null;
      this.arrayStart = null;
      this.arrayEnd = null;
      this.arrayMode = false;
    }
  }
  len() {
    if (this.arrayMode) {
      return this.arrayEnd - this.arrayStart;
    } else {
      return listLen(this.value);
    }
  }
  get(idx: number) {
    if (this.arrayMode) {
      return this.arrayValue[this.arrayStart + idx];
    } else {
      return listGet(this.value, idx);
    }
  }
 // more....
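
The slice method is not part of the excerpt above, so here is a small stand-alone sketch of the virtual slicing idea. `ArrayView` is a hypothetical name for this example and only covers the array mode; it assumes the shared array is never mutated after construction.

```typescript
// A stripped-down stand-in for array mode only (hypothetical, not CrDataList).
class ArrayView<T> {
  private readonly items: T[];
  private readonly start: number;
  private readonly end: number;

  constructor(items: T[], start: number = 0, end: number = items.length) {
    this.items = items;
    this.start = start;
    this.end = end;
  }

  len(): number {
    return this.end - this.start;
  }

  get(idx: number): T | undefined {
    if (idx < 0 || idx >= this.len()) return undefined;
    return this.items[this.start + idx];
  }

  // No copying: the new view shares `items` and only narrows the window.
  slice(from: number, to: number = this.len()): ArrayView<T> {
    const lo = Math.min(Math.max(from, 0), this.len());
    const hi = Math.min(Math.max(to, lo), this.len());
    return new ArrayView(this.items, this.start + lo, this.start + hi);
  }

  // Copy only when a real array is needed, e.g. for spreading into a call.
  toArray(): T[] {
    return this.items.slice(this.start, this.end);
  }
}

const xs = new ArrayView([10, 20, 30, 40, 50]);
const ys = xs.slice(1, 4).slice(1);
console.log(ys.toArray()); // [30, 40]
```

Slicing costs a constant amount of work regardless of the list size, and the only copy happens in toArray, at the moment the values are actually spread.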

A similar trick is also added to the merge function. When two Maps are being merged, they are not merged into a single TernaryTreeMap directly, but instead saved in a linked list, until they have to be merged:

class CrDataMap {
  cachedHash: Hash;
  chain: MapChain;
  depth: number;
  skipValue: CrDataValue;
  constructor(value: TernaryTreeMap<CrDataValue, CrDataValue>) {
    this.chain = { value: value, next: null };
    this.depth = 1;
    this.skipValue = fakeUniqueSymbol;
  }
  turnSingleMap() {
    if (this.depth === 1) {
      return;
    }
    // squash down to a single level of map
    let ret = this.chain.value;
    let cursor = this.chain.next;
    while (cursor != null) {
      if (!isMapEmpty(cursor.value)) {
        ret = ternaryTree.mergeSkip(cursor.value, ret, this.skipValue);
      }
      cursor = cursor.next;
    }
    this.chain = {
      value: ret,
      next: null,
    };
    this.depth = 1;
  }
  len() {
    this.turnSingleMap();
    return mapLen(this.chain.value);
  }
  get(k: CrDataValue) {
    let cursor = this.chain;
    while (cursor != null) {
      let v = mapGet(cursor.value, k);
      if (v != null && v !== this.skipValue) {
        return v;
      } else {
        cursor = cursor.next;
      }
    }
    return null;
  }

  merge(ys: CrDataMap) {
    return this.mergeSkip(ys, fakeUniqueSymbol);
  }
  mergeSkip(ys: CrDataMap, v: CrDataValue) {
    if (!(ys instanceof CrDataMap)) {
      throw new Error("Expected map");
    }

    let result = new CrDataMap(null);
    result.skipValue = v;
    ys.turnSingleMap();
    result.chain = {
      value: ys.chain.value,
      next: this.chain,
    };
    result.depth = this.depth + 1;
    if (result.depth > 5) {
      // 5 by experience, limit to squash linked list to value
      result.turnSingleMap();
    }
    return result;
  }
 // more code...

The trick for List gave a significant performance boost, especially for arguments spreading in JavaScript. The benefit of the trick in merge is unclear. Accessing values of Maps is slower this way, so this might not be a good solution.

There are also some other tricks (even dirty ones) in the hash function to reduce the cost of the persistent map. However, I don’t understand hashing and RRB trees well yet, so I still need to investigate that.

Other thoughts…

As I said, my main purpose is to run programs created with calcit-editor without compiling to another language first, and meanwhile to use tools from the JavaScript ecosystem. At least calcit-js accomplishes these goals now, despite its lack of features compared to cljs.

A milestone of the project is that I migrated Respo, my own virtual DOM library, to calcit-js. In my tests, the calcit-js version is roughly 5x slower than the optimized cljs one, which is not a very significant slowdown:

My gains are more familiar tools from the JavaScript side and faster boot time. Debugging the code generated by calcit-js is easier than with cljs since its semantics are similar to JavaScript’s, but meanwhile it lacks source maps. Being an experiment, I think it’s good enough.

For the module system, it uses a local folder ~/.config/calcit/modules to locate other packages. A very poor solution, with no version resolution.

It actually recompiles everything and only then finds out which js files changed (macros involved everywhere, hum?). So if one day my project gets large, it would probably be slower than cljs.

I’m not into C programming so I have difficulties porting TCP or other servers. That makes the project rather limited, with just no C extensions to use. There is no VM either. There might be lots of considerations in the future.


In the past weeks I also migrated my own projects:

to calcit-js, and luckily they both work quite well (not used in larger apps yet, though…).

Also I added a cr --emit-ir option to let calcit-runner emit a program-ir.json file, which is the program data after macro expansion but before js code emitting. Meanwhile I built an ir-viewer page with Respo.calcit as a debugging tool for the IR. So here you go:

Updates on calcit-js:

The interpreter/codegen has been rewritten in Rust for the power of ADTs. Performance is roughly the same.
