Javascript wonkery

Monday, May 25, 2015

Comments: 11   (latest 8 days later)

Tagged: optimization, interactive fiction, if, quixe, javascript, interpreters


Here I will take a break from the ever-burbling stream of IF and Myst news, and talk about Javascript optimization.

You could say it's relevant because it came up in a Quixe bug report. ("I tried loading Emily Short's Counterfeit Monkey.... Load time dropped from 29.4 seconds to 10.4 seconds...") IF interpreter improvements are a high priority for me -- particularly if they could be big speed improvements for a fairly small code change. Double-particularly if they imply I've had crappy Javascript code out there for years.

Whoops.


I usually build my Javascript libraries according to the private namespace pattern. I can't remember where I learned this. I assume it originated in this blog post (Douglas Crockford), but I use the cleaner layout described here (Ben Cherry) or here (Todd Motto).

With this trick, all your internal functions and variables wind up hidden from the outside world. You decide which ones to export, and the rest remain private.

Here's a toy example:
Lib = function() {
    var counter = 0;
    var prefix = "The current value is ";

    function add(val) {
        counter += val;
        return counter;
    };

    function get() {
        return prefix + add(0);
    };

    return {
        add: add,
        get: get
    };
}();

Here counter and prefix are private variables. The add() function increases counter and returns it. The get() function returns it as part of a string. (I've set get() up to call add(0) just to demonstrate that private functions can call each other.) Finally, we return an object which exports the add and get symbols. This becomes the global Lib object, so a user can call Lib.add() and Lib.get(). But there is no Lib.counter; that's private.

Crockford says "This pattern of public, private, and privileged members is possible because JavaScript has closures... This is an extremely powerful property of the language." Okay, that's top-grade hand-waving. What he means is "Javascript is a terrible language, but it has stolen enough features from other languages that you can pull this crap off if you're incredibly careful and don't mind confusing onlookers."

Anyhow. The trick works pretty well until you start constructing Javascript functions on the fly. Let's extend our example:
Lib = function() {
    var counter = 0;
    var prefix = "The current value is ";

    var compiledfuncs = {};

    function add(val) {
        var func = compiledfuncs[val];
        if (func === undefined) {
            var text = "counter += " + val + ";";
            text += "return counter;";
            func = eval("(function func() { " + text + " })");
            compiledfuncs[val] = func;
        }
        return func();
    };

    function get() {
        return prefix + add(0);
    };

    return {
        add: add,
        get: get
    };
}();

What the heck is going on in add()? Imagine that this is an IF virtual machine interpreter, like Quixe. We're doing just-in-time (JIT) compilation. We take a Glulx function -- that is, a string of Glulx opcodes -- and convert it into a Javascript function. The browser's Javascript engine will then convert that function into native machine code and the result will run really fast.

Since this is a toy example, our only opcode is "increase counter by N". When we see a call to add(1), for example, we convert that into this Javascript function:
function func() {
    counter += 1;
    return counter;
}

We eval that source (compile it) and stash the function (in case another add(1) comes along). Then we execute it.

So that's great, and it works. But if you profile this in Chrome, for example, you'll see a couple of little yellow warning flags:
  • Not optimized: Function calls eval
  • Not optimized: Reference to a variable which requires dynamic lookup

You see, Javascript is a terrible language, and its scoping model is a haystack of ideas jammed together without any planning, and it can never be fixed because backwards compatibility. Certain operations in closures are legal, but slow.

(To be fair, every language's scoping model sucks. You know how the only hard problems are naming and cache invalidation? This is naming. But Javascript is worse than most.)

We could get rid of the eval() if we used a Function() constructor:
func = new Function(text);

But then the code would fail, because the function would exist outside the library closure. It wouldn't have access to the private variable counter.

I fussed around with a couple of different solutions. I tried inner and outer closures, I tried using this a couple different ways. None of them worked. (Crockford progresses from hand-waving to passive-aggressive sniping: "...an error in the ECMAScript Language Specification which causes this to be set incorrectly for inner functions." You tell 'em.)

I eventually settled on this:
Lib = function() {
    var self = {};
    self.counter = 0;

    var prefix = "The current value is ";

    var compiledfuncs = {};

    function add(val) {
        var func = compiledfuncs[val];
        if (func === undefined) {
            var text = "var self = this;";
            text += "self.counter += " + val + ";";
            text += "return self.counter;";
            func = new Function(text);
            func = func.bind(self);
            compiledfuncs[val] = func;
        }
        return func();
    };

    function get() {
        return prefix + add(0);
    };

    return {
        add: add,
        get: get
    };
}();

Our compiled function can't get access to the private namespace. We need a second namespace which can be handed in for that purpose. We'll call that self. We create self up top, and put the counter in it. We'll have to write self.counter to access it now.

To hand self in to our generated function, we call bind. This is a confusing Javascript feature which lets us glue any object in as the function's this. Then we compile it this way:
function func() {
    var self = this;
    self.counter += 1;
    return self.counter;
}

This is more verbose, but it works. And if we check it in Chrome's profiler, the yellow warning flags are gone.

Note that if we wanted the generated function to call private methods, we'd have to copy them into the self object too:
function get() {
    return prefix + add(0);
};
self.get = get;

Not too messy.

Now the real question is, does this actually make Quixe run faster? I don't know! I've started converting the code to this new model, but it's going to take more work. (The virtual machine has lots of state to shove into the self object.) The person who filed the original report got good results, but I'm not sure I've snagged the right tricks yet.

The toy example in this post seems to run a little slower with these changes. That's not encouraging, but of course it's not a realistic work model. Hopefully I'll have real results in a few days.


UPDATE, May 29th:

Got an updated Quixe branch working.

Turns out the func.bind(self) plan is terrible. Performance doesn't improve as much as it should; on Firefox it actually gets worse.

Instead, I've given each generated function an explicit self argument, and called them all as func(self). With this plan, Quixe is 65% faster in Safari, 55% faster in Chrome, 10% faster in Firefox. Woot!



Comments imported from Gameshelf