May 15, 2020

New Lazy Encoding in BuckleScript

Highlights of our newest changes to the internal representation of lazy values and how it will benefit our users.

Compiler & Build System

Important: This is an archived blog post, kept for historical reasons. Please note that this information might be outdated.

Introduction

Recently we made some significant improvements with our new encoding for lazy values, and we find it so exciting that we want to highlight the changes. The new encoding generates very idiomatic JS output like hand-written code.

For people who are not familiar with lazy evaluation, it is documented here.

Comparison between the old and new lazy encoding

Let's take an example, and see how the old encoding of a lazy value would look like:

RE
let lazy1 = lazy {
    "Hello, lazy" -> Js.log;
     1
}; // create a lazy value

let lazy2 = lazy 3 ; // artifical lazy values for demo purpose

Js.log2 (lazy1, lazy2); // logging the lazy values

let (lazy la, lazy lb) = (lazy1, lazy2); // pattern match to force evaluation

Js.log2 (la, lb); // logging forced values

When compiled and run in node, the runtime representation of our lazy values will look something like:

SH
lazy_demo$node src/lazy_demo.bs.js
[ [Function], tag: 246 ] 3 # logging the output of two lazy blocks
Hello, lazy # lazy1, laz2 evaluated forced by pattern match, hence logging
1 3 #logging the evaluated lazy block

With the new encoding, the output of the same example code would look like this:

SH
{ RE_LAZY_DONE: false, value: [Function: value] } { RE_LAZY_DONE: true, value: 3 } # logging block one with new encoding
Hello, lazy
1 3

As you can see, with the new encoding, no magic tags like 246 appear, and the lazy status is clearly marked via RE_LAZY_DONE: (true | false) .

In fact, the code quality of our generated bs.js files has also improved. Going back to our old version, the generated JS would look like this:

JS
var lazy1 = Caml_obj.caml_lazy_make((function (param) {
        console.log("Hello, lazy");
        return 1;
      }));

console.log(lazy1, 3);

var la = CamlinternalLazy.force(lazy1);

var lb = CamlinternalLazy.force(3);

console.log(la, lb);

var lazy2 = 3;

In our new version with all the new changes to the lazy encoding, the output is way more simplified:

JS
var lazy1 = {
  RE_LAZY_DONE: false,
  value: (function () { // closure now is uncurried arity-0 function
      console.log("Hello, lazy");
      return 1;
    })
};

var lazy2 = {
  RE_LAZY_DONE: true,
  value: 3
};

console.log(lazy1, lazy2);

var la = CamlinternalLazy.force(lazy1);

var lb = CamlinternalLazy.force(lazy2);

console.log(la, lb);

What changes did we make?

In the native runtime environment, the encoding of lazy values is rather complicated:

It is an array, which is not friendly for debugging in JS context.
It has some special tags which are not meaningful, for example, magic number 246, in JS context.
It tries to unbox lazy values with the help of the native garbage collector (GC). However, this behavior does not make sense in a JS runtime environment since the JSVM does not expose its GC semantics. Keeping that behavior would only introduce more complexity for the JS side.

So in our current master branch, we drastically simplified our lazy encoding scheme to optimize for the JS runtime as much as possible:

The encoding is uniform; it is always an object of two key value pairs. One is RE_LAZY_DONE to mark its status, the other is either a closure or an evaluated value.
The compiler optimization still kicks in at compile time: if it knows a lazy value is already evaluated or does not need to be evaluated, it will promote its status to be 'done'. However, unlike in a native environment, unboxing is not happening. This makes sense since the most interesting unboxing scenarios only happen during runtime and not during compile time (a scenario which is impossible in the JSVM).

With the new encoding, lazy is now way more viable for JS usage, so we encourage our users to use it whenever it is convenient!

Caveats:

Don't rely on the special name RE_LAZY_DONE for JS interop; we may change it to a symbol in the future.

Want to read more?

Back to Overview