Although I don’t use parentheses in calcit, it’s still mostly inspired by ClojureScript. The project is called calcit-runner because at first it was only designed to be an interpreter; now it also emits JavaScript.
Some demos
To give a demo, there’s a Linux binary called cr_once released at http://bin.calcit-lang.org/ , which can be run like:
=>> cr_once -e:"range 10"
Running calcit runner(0.2.34) in CI mode
(0 1 2 3 4 5 6 7 8 9)
=>> cr_once -e:"->> (range 4) $ map $ fn (x) (* x x)"
Running calcit runner(0.2.34) in CI mode
(0 1 4 9)
You may find the functions and macros familiar if you look at the docs at http://apis.calcit-lang.org/ . The docs site itself is built with calcit-js, at roughly 5x the cost of its ClojureScript version.
A more complicated demo is to run a script. Calcit uses a snapshot file rather than “files of source”: just one snapshot containing multiple namespaces of the package, written in an indentation-based syntax:
{} (:package |app)
  :configs $ {} (:init-fn |app.main/main!) (:reload-fn |app.main/reload!)
  :files $ {}
    |app.main $ {}
      :ns $ quote
        ns app.main $ :require
      :defs $ {}
        |main! $ quote
          defn main! ()
            println "\"Loaded program!"
            ; try-fibo
            echo $ sieve-primes ([] 2 3 5 7 11 13) 17 400
        |reload! $ quote
          defn reload! () nil
        |sieve-primes $ quote
          defn sieve-primes (acc n limit)
            if (&> n limit) acc $ if
              every?
                fn (m)
                  &> (mod n m) 0
                , acc
              recur (conj acc n) (inc n) (, limit)
              recur acc (inc n) limit
=>> cr_once fibo.cirru
Running calcit runner(0.2.34) in CI mode
Runner: specifying filesfibo.cirru
Calcit runner version: 0.2.34
Loaded program!
(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 379 383 389 397)
Since Nim compiles to C for running, it boots quite fast:
=>> time cr_once fibo.cirru
Running calcit runner(0.2.34) in CI mode
Runner: specifying filesfibo.cirru
Calcit runner version: 0.2.34
Loaded program!
(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 379 383 389 397)
real 0m0.122s
user 0m0.109s
sys 0m0.010s
By default it uses the file called compact.cirru, which is emitted from calcit-editor.
And to emit js:
cr_once --emit-js fibo.cirru
The code generated from the command line looks like this (I formatted it manually with Prettier):
import * as $calcit from "./calcit.core";
export let sieve_DASH_primes = $calcit.wrapTailCall(function sieve_DASH_primes(
  acc,
  n,
  limit
) {
  return $calcit._AND__GT_(n, limit)
    ? acc
    : $calcit.every_QUES_(function f_PCT_(m) {
        return $calcit._AND__GT_($calcit.mod(n, m), 0.0);
      }, acc)
    ? $calcit.recur($calcit.conj(acc, n), $calcit.inc(n), limit)
    : $calcit.recur(acc, $calcit.inc(n), limit);
});
export function main_BANG_() {
  $calcit.println("Loaded program!");
  /* (try-fibo) */ null;
  return $calcit.echo(
    sieve_DASH_primes(
      $calcit._LIST_(2.0, 3.0, 5.0, 7.0, 11.0, 13.0),
      17.0,
      400.0
    )
  );
}
export function reload_BANG_() {
  return null;
}
Also notice that an extra calcit.core.js is generated, containing some core functions, and that an extra package, @calcit/procs, is required to run the code.
You may already be familiar with macros like ->>:
defmacro ->> (base & xs)
  if (empty? xs)
    quote-replace ~base
    &let
      x0 (first xs)
      if (list? x0)
        recur (append x0 base) & (rest xs)
        recur ([] x0 base) & (rest xs)
The binaries are not quite ready. I develop on macOS, and it probably requires some Nim and Webpack knowledge to run it on other platforms.
Motivations and calcit-editor
It’s called “calcit-runner” because it is based on, and works together with, my previous project Calcit Editor, which was used to generate ClojureScript.
Calcit Editor uses calcit.cirru as its snapshot file, and now supports emitting compact.cirru and .compact-inc.cirru as well. calcit-runner makes use of these 2 files. The file .compact-inc.cirru is for handling incremental changes.
While it’s cool to have ClojureScript with immutable data and macros, I do want to narrow the gap between the “immutable data”-macros combo and the JavaScript ecosystem. I have learnt a lot about ClojureScript in the past 5 years, but it still bothers me in some respects:
- The JVM compiler starts slowly and eats lots of memory on my laptops. thheller helped a lot, but the JVM has its own limitations.
- ClojureScript has its own semantics because Clojure has those semantics, and we can’t change that just because JavaScript has its own, always changing, ecosystem.
I’m still using ClojureScript; I just want to try more possibilities. And I always thought that calcit.cirru already holds a lot of information about the program, so why do I have to run it by compiling to another language first? So calcit-runner was the project to run Calcit Editor’s snapshot file.
Calcit Runner features
To make it kind of compatible with ClojureScript, I need some basic features:
- macros, then I can generate syntax like in Clojure,
- persistent data, which is at Clojure’s core semantics,
- hot code replacement, for fast feedback loops,
- frequently used APIs like in ClojureScript.
And since I’m a JavaScript programmer, there are still many features I can’t handle, so some major features you see in Clojure are missing in calcit-runner. It does not have a REPL, and it does not support threads, try/catch, or async programming. Currently it’s more like a calculator.
However since macros are supported, emitting JavaScript is a lot easier after all syntax sugars are expanded.
The experience of using calcit-runner is something in between shadow-cljs and webpack: I modify the code, the code gets replaced in the browser (pure functions and atom states), an after-load function is triggered, and I can run snippets in the Chrome Console and use the js debugger.
Meanwhile I can also just use calcit-runner to run code on top of Nim: I save the snapshot file again, it replaces the code and triggers the after-load function as well, but I only get logs of error messages for debugging.
How calcit-js is implemented
Similar to Clojure, when calcit-runner interprets a syntax tree:
- first, it runs preprocessing to resolve symbols and expand macros,
- then the code is turned into core syntax with platform functions,
- then the platform functions are called.
Those phases can be seen in some of the files:
- Core macros and aliases: calcit-runner.nim/src/includes/calcit-core.cirru at master · calcit-lang/calcit-runner.nim · GitHub
- Syntaxes: calcit-runner.nim/src/calcit_runner/core_syntax.nim at master · calcit-lang/calcit-runner.nim · GitHub
- Functions: calcit-runner.nim/src/calcit_runner/core_func.nim at master · calcit-lang/calcit-runner.nim · GitHub
Roughly, preprocessing(resolving and macro expansion) and evaluating are 2 steps. So the preprocessing step can also be used to support code emitting.
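To make the two steps concrete, here is a minimal TypeScript sketch. It is not the Nim code from calcit-runner: the names Expr, preprocess, evaluate, emitJs and the platform table are all made up for illustration. It expands a ->> macro in a preprocessing pass, then hands the same expanded tree either to an evaluator or to a tiny JavaScript emitter.

```typescript
// Hypothetical miniature of a "preprocess, then evaluate or emit" pipeline.
// None of these names come from calcit-runner itself.
type Expr = number | string | Expr[];

// Step 1: preprocessing. Here it only expands the thread-last macro `->>`.
function preprocess(expr: Expr): Expr {
  if (!Array.isArray(expr)) return expr;
  const [head, ...rest] = expr;
  if (head === "->>") {
    if (rest.length === 0) throw new Error("->> expects a base expression");
    const [base, ...steps] = rest;
    let acc: Expr = base;
    for (const step of steps) {
      // A list step gets the value appended; a bare symbol becomes a call.
      acc = Array.isArray(step) ? [...step, acc] : [step, acc];
    }
    return preprocess(acc);
  }
  return expr.map((child) => preprocess(child));
}

// Assumed "platform functions" that the core syntax bottoms out in.
const platform: Record<string, (...args: number[]) => number> = {
  inc: (a) => a + 1,
  mul: (a, b) => a * b,
};

// Step 2a: evaluate the expanded tree directly.
function evaluate(expr: Expr): number {
  if (typeof expr === "number") return expr;
  if (typeof expr === "string") throw new Error(`unresolved symbol: ${expr}`);
  const [head, ...args] = expr;
  const f = typeof head === "string" ? platform[head] : undefined;
  if (f === undefined) throw new Error(`unknown function: ${String(head)}`);
  return f(...args.map((arg) => evaluate(arg)));
}

// Step 2b: emit JavaScript source text from the same expanded tree.
function emitJs(expr: Expr): string {
  if (!Array.isArray(expr)) return String(expr);
  const [head, ...args] = expr;
  return `${emitJs(head)}(${args.map((arg) => emitJs(arg)).join(", ")})`;
}

const expanded = preprocess(["->>", 3, "inc", ["mul", 2]]);
console.log(JSON.stringify(expanded)); // ["mul",2,["inc",3]]
console.log(evaluate(expanded)); // 8
console.log(emitJs(expanded)); // mul(2, inc(3))
```

The point of the sketch is only that the emitter never sees ->>: once macros are expanded, both back ends work on the same small core.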
One feature of calcit-runner is that, after preprocessing, the information about the program can be exported into a JSON file: https://gist.github.com/jiyinyiyong/b15ab9fa4467b8f8e09b5a291c84f655 . It’s only a demo though.
Meanwhile, the information is just ready for code generation, generating JavaScript in this case: calcit-runner.nim/src/calcit_runner/codegen/emit_js.nim at master · calcit-lang/calcit-runner.nim · GitHub
Since syntaxes are de-sugared, its core is quite tiny. Although currently calcit-js lacks many features of ClojureScript or JavaScript, it’s still powerful enough for running a virtual DOM library to render simple pages, such as the APIs index above.
Currently calcit-js only supports 2 simple rules for importing npm packages (or other js modules), :as and :refer, with no renaming or handling of .default. It just emits files with import/export syntax and relies on an extra bundler for running.
For tail recursion, calcit-js uses an extra function to handle the Recur type at runtime, and so does calcit-runner. It works well, but compared to the optimized code from ClojureScript, which uses while(...){...}, it could be a lot slower. calcit-js has poor compiler optimizations: its let is implemented with function wrappers, so it’s far from optimized.
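As a rough illustration of that approach, here is a TypeScript sketch of a wrapper that handles a Recur value at runtime. The names Recur, recur and wrapTailCall mirror the ones visible in the emitted code earlier, but the bodies below are my guess at the general technique (a trampoline), not the actual runtime from calcit-js.

```typescript
// Assumed shape of the runtime marker returned by a `recur` call.
class Recur {
  constructor(public args: unknown[]) {}
}

const recur = (...args: unknown[]): Recur => new Recur(args);

// Wraps a function so that returning a Recur restarts it with new arguments
// in a loop, instead of growing the JavaScript call stack.
function wrapTailCall<R>(f: (...args: any[]) => R | Recur): (...args: any[]) => R {
  return (...args: any[]): R => {
    let result = f(...args);
    while (result instanceof Recur) {
      result = f(...result.args);
    }
    return result as R;
  };
}

// Sums n + (n - 1) + ... + 1 without deep recursion.
const sumTo = wrapTailCall<number>((n: number, acc: number) =>
  n === 0 ? acc : recur(n - 1, acc + n)
);

console.log(sumTo(100000, 0)); // 5000050000
```

Each iteration here allocates a Recur object and makes a fresh function call, which is one plausible reason this style is slower than a loop compiled directly to while(...){...}.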
Expectation of calcit-js
Most of the time in my jobs I have to deal with npm packages, mostly with Webpack. Since React apps rely on both immutable data and packages from npm, I do want my virtual DOM library to integrate well into the JavaScript ecosystem.
This is how I want to use calcit-js:
- it compiles to js files with ES6 import/export syntax,
- a bundler, probably Webpack, loads the files and bundles them with npm dependencies,
- when a file changes, the bundler handles hot module replacement as it would for a normal js file.
Besides Webpack, it works with Vite too, so I can debug with code that’s less complicated, since browsers can load each *.mjs file without bundling.
It may also work in Node.js without bundling, since Node.js can load modules in ES6 import/export syntax. It’s not quite smooth, since .mjs extensions are currently required and calcit-js needs to handle those file extensions specifically.
The core functions of the runtime (except calcit.core.js, which is emitted together with the program) are maintained in 2 npm packages so that bundlers can simply resolve them.
At first I thought a language with immutable data and macros could hardly be compiled to js, but WISP did it and Bucklescript did it, which proved it possible. ClojureScript just has too many factors preventing it from integrating into the JavaScript ecosystem, compared to other compile-to-js languages.
After all, calcit-js has only experimented with a small area of integrating persistent data and macros into the npm ecosystem. I’m fully expecting ClojureScript to find its way to connect with ES modules as well (well, just please make the JVM a little faster in doing that).
Optimizations of persistent data in calcit-js
During the development of calcit-runner, there were several performance bottlenecks (besides my poor experience with system programming):
- macro expanding
- persistent data
- variadic arguments
Macro expansion is still slow in my current implementation. I’m not going to discuss it since it involves too many factors.
For persistent data, I posted an explanation before (related to List and Map, not Set yet).
To make it work in calcit-js, I finally turned it into a TypeScript project and tried carefully to reduce unnecessary memory usage.
It’s not very optimal, but since it’s designed to be a shared list, it has some benefits:
- just a List, not a List and a Vector. I don’t need to use into [] to convert types in my program.
- slice, assoc and dissoc operations also share parts of the tree, so they are cheaper. (The bad part is that get got slower.)
Some other issues are related to the “variadic arguments” feature. In Clojure, we have “variadic functions”, while JavaScript uses “arguments spreading”. calcit-runner is more like JavaScript:
defn f1 (x0 & body)
  ; TODO x0 body
println (f1 1 2 3 4 5)
; or
def ys ([] 1 2 3 4 5)
println (f1 & ys)
While it may feel more natural to us JavaScript programmers (despite the lack of static analysis), it does bring performance costs. I located the costs in the Nim implementation of calcit-runner. It also brought barriers to using List in calcit-js for arguments spreading.
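For comparison, this is roughly what the same calls look like on the JavaScript side, written as a TypeScript sketch. The Tree type and the toArray helper are hypothetical stand-ins for a tree-shaped persistent list; they only illustrate why spreading such a list costs more than spreading a plain array.

```typescript
// Counterpart of `defn f1 (x0 & body)`: a rest parameter collects the arguments.
function f1(...args: number[]): { x0: number; bodySize: number } {
  const [x0, ...body] = args;
  return { x0, bodySize: body.length };
}

// Hypothetical tree-shaped list, standing in for a persistent list.
type Tree = { leaf: number } | { left: Tree; right: Tree };

// Spreading needs a flat array, so every leaf is visited and copied first.
function toArray(t: Tree, out: number[] = []): number[] {
  if ("leaf" in t) {
    out.push(t.leaf);
  } else {
    toArray(t.left, out);
    toArray(t.right, out);
  }
  return out;
}

const ys = [1, 2, 3, 4, 5];
const tree: Tree = {
  left: { left: { leaf: 1 }, right: { leaf: 2 } },
  right: { left: { leaf: 3 }, right: { left: { leaf: 4 }, right: { leaf: 5 } } },
};

console.log(f1(1, 2, 3, 4, 5)); // x0 is 1, bodySize is 4
console.log(f1(...ys)); // same result, the array is spread directly
console.log(f1(...toArray(tree))); // same result, but only after a full walk
```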
So to optimize that, I decided to implement CrDataList in two modes, one with Array and one with TernaryTreeList. When a CrDataList is first initialized, it’s an Array, so it’s fast for spreading. In Array mode, get is just JavaScript array access, which is fast. And for slice, there’s virtual slicing that shares the same reference to the original Array, so it’s cheap. It’s only turned into a TernaryTreeList (which is a tree structure of persistent data) when it has to be:
class CrDataList {
  value: TernaryTreeList<CrDataValue>;
  // array mode store bare array for performance
  arrayValue: Array<CrDataValue>;
  arrayMode: boolean;
  arrayStart: number;
  arrayEnd: number;
  cachedHash: Hash;
  constructor(value: Array<CrDataValue> | TernaryTreeList<CrDataValue>) {
    if (Array.isArray(value)) {
      this.arrayMode = true;
      this.arrayValue = value;
      this.arrayStart = 0;
      this.arrayEnd = value.length;
      this.value = null;
    } else {
      this.arrayMode = false;
      this.value = value;
      this.arrayValue = [];
      this.arrayStart = null;
      this.arrayEnd = null;
    }
  }
  turnListMode() {
    if (this.arrayMode) {
      this.value = initTernaryTreeList(
        this.arrayValue.slice(this.arrayStart, this.arrayEnd)
      );
      this.arrayValue = null;
      this.arrayStart = null;
      this.arrayEnd = null;
      this.arrayMode = false;
    }
  }
  len() {
    if (this.arrayMode) {
      return this.arrayEnd - this.arrayStart;
    } else {
      return listLen(this.value);
    }
  }
  get(idx: number) {
    if (this.arrayMode) {
      return this.arrayValue[this.arrayStart + idx];
    } else {
      return listGet(this.value, idx);
    }
  }
  // more....
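The virtual slicing part is not shown in the excerpt above, so here is a small standalone sketch of the idea. SlicedArray and sharesStorageWith are hypothetical names and this is not the CrDataList code: a slice just keeps a reference to the same backing array together with a start and end offset, so no elements are copied.

```typescript
// Hypothetical standalone version of "array mode" with virtual slicing.
class SlicedArray<T> {
  constructor(
    private readonly items: ReadonlyArray<T>,
    private readonly start: number = 0,
    private readonly end: number = items.length
  ) {}

  len(): number {
    return this.end - this.start;
  }

  get(idx: number): T | undefined {
    if (idx < 0 || idx >= this.len()) return undefined;
    return this.items[this.start + idx];
  }

  // Returns a new view over the same backing array; only the offsets change.
  slice(from: number, to: number = this.len()): SlicedArray<T> {
    const a = Math.min(Math.max(from, 0), this.len());
    const b = Math.min(Math.max(to, a), this.len());
    return new SlicedArray(this.items, this.start + a, this.start + b);
  }

  sharesStorageWith(other: SlicedArray<T>): boolean {
    return this.items === other.items;
  }
}

const xs = new SlicedArray([10, 20, 30, 40, 50]);
const middle = xs.slice(1, 4); // a view of 20, 30, 40
const tail = middle.slice(1); // a view of 30, 40

console.log(middle.len(), middle.get(0)); // 3 20
console.log(tail.len(), tail.get(0)); // 2 30
console.log(tail.sharesStorageWith(xs)); // true
```

This only stays safe as long as nobody mutates the backing array, which fits a runtime where list values are treated as immutable.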
A similar trick is also added to the merge function. When two Maps are being merged, they are not merged into a single TernaryTreeMap directly, but instead saved in a linked list until they have to be merged:
class CrDataMap {
  cachedHash: Hash;
  chain: MapChain;
  depth: number;
  skipValue: CrDataValue;
  constructor(value: TernaryTreeMap<CrDataValue, CrDataValue>) {
    this.chain = { value: value, next: null };
    this.depth = 1;
    this.skipValue = fakeUniqueSymbol;
  }
  turnSingleMap() {
    if (this.depth === 1) {
      return;
    }
    // squash down to a single level of map
    let ret = this.chain.value;
    let cursor = this.chain.next;
    while (cursor != null) {
      if (!isMapEmpty(cursor.value)) {
        ret = ternaryTree.mergeSkip(cursor.value, ret, this.skipValue);
      }
      cursor = cursor.next;
    }
    this.chain = {
      value: ret,
      next: null,
    };
    this.depth = 1;
  }
  len() {
    this.turnSingleMap();
    return mapLen(this.chain.value);
  }
  get(k: CrDataValue) {
    let cursor = this.chain;
    while (cursor != null) {
      let v = mapGet(cursor.value, k);
      if (v != null && v !== this.skipValue) {
        return v;
      } else {
        cursor = cursor.next;
      }
    }
    return null;
  }
  merge(ys: CrDataMap) {
    return this.mergeSkip(ys, fakeUniqueSymbol);
  }
  mergeSkip(ys: CrDataMap, v: CrDataValue) {
    if (!(ys instanceof CrDataMap)) {
      throw new Error("Expected map");
    }
    let result = new CrDataMap(null);
    result.skipValue = v;
    ys.turnSingleMap();
    result.chain = {
      value: ys.chain.value,
      next: this.chain,
    };
    result.depth = this.depth + 1;
    if (result.depth > 5) {
      // 5 by experience, limit to squash linked list to value
      result.turnSingleMap();
    }
    return result;
  }
  // more code...
The trick for List gave a significant performance boost, especially for arguments spreading in JavaScript. The benefit of the trick in merge is unclear: accessing values of Maps is slower this way, so it might not be a good solution.
There are also some other tricks (even dirty ones) in the hash function to reduce the cost of persistent maps. However, I don’t understand hashing and RRB trees well yet, so I still need to investigate that.
Other thoughts…
As I said, my main purpose is to run programs created with calcit-editor without compiling to another language first and, meanwhile, to use tools from the JavaScript ecosystem. calcit-js has at least accomplished these goals now, despite its lack of features compared to cljs.
A milestone of the project is that I migrated Respo, my own virtual DOM library, to calcit-js. In my tests the calcit-js version is roughly 5x slower than the optimized cljs one, which is not a very significant slowdown.
My gains are more familiar tools from the JavaScript side and faster boot time. Debugging the code generated by calcit-js is easier than debugging cljs output, since it has semantics similar to JavaScript, but it lacks source maps. For an experiment, I think it’s good enough.
For the module system, it uses a local folder, ~/.config/calcit/modules, to locate other packages. It’s a very poor solution, with no version resolution.
It actually recompiles first and only then finds out which js files changed (macros are involved everywhere, hmm?). So if one day my project gets large, it would probably be slower than cljs.
I’m not into C programming, so I have difficulties porting TCP or other servers. That makes the project rather limited: there are just no C extensions to use. There is no VM either. There might be lots of things to consider in the future.