Wednesday, April 17, 2019

What is ZIL anyway?

The Infocom ZIL code dump has kicked off a small whirlwind of news articles and blog posts. A lot of them are somewhat hazy on what ZIL is, and how it relates to MDL, Lisp, Z-code, Inform, and the rest of the Golden-Age IF ecosystem.
So I'm going to talk a lot about it! With examples. But let's go through in chronological order.
The first version of Zork was MDL Zork. This is what Tim Anderson, Marc Blank, Bruce Daniels, and Dave Lebling wrote as MIT hackers around 1977 to 1979. MDL, the MIT Design Language, was a Lisp-like functional language created at MIT. MDL ran on the PDP-10, so that's where this ur-Zork ran.
Zork was ported to Fortran by Bob Supnik in 1980, and then to C. These versions, generally known as "mainframe Zork" (or "Dungeon"), circulated among DEC user groups, wasted years of mainframe user time, and -- along with "mainframe Colossal Cave" -- changed the lives of many. (Including me, age 9 or so.)
At the same time, those MIT hackers formed a company and set out to... well, to make business software, if you look at those early plans. But they figured they'd first make a quick buck by porting Zork to home computers. However, MDL programs couldn't possibly run on an Apple 2 or a TRS-80. So the Infocom folks sat down and designed the Z-machine.
I'm not going to go through this story in depth. For that, I recommend Jimmy Maher's overview at the Digital Antiquarian. Here's the quickie: Infocom would write a game in ZIL, or "Zork Implementation Language" -- a high-level language derived from MDL. A compiler called ZILCH would then compile the game into a binary format.
The binary format, Z-code, was a program for an imaginary computer called the Z-machine. (Today we would say "virtual machine".) Nobody intended to build a Z-machine. But they could write a program that emulated the Z-machine. This program, called ZIP, was compact enough to run on an 8-bit home computer like the Apple 2 or TRS-80. So Infocom could distribute ZIP and the Z-code file on a floppy disk (or cassette tape!) and thus sell a playable game.
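To make the "imaginary computer" idea concrete, here's a minimal sketch of a bytecode interpreter in Python. Everything in it is an assumption for illustration: the opcode names, the stack-only design, and the program format are made up, and none of it matches the real Z-machine instruction set. It only shows the general shape of the trick: a small loop that reads numbers and acts on them.

```python
# A toy stack-based virtual machine. The opcodes below are invented for
# this example; they are NOT real Z-machine instructions.
PUSH, ADD, MUL, PRINT, HALT = range(5)

def run(program):
    """Interpret a list of integers as bytecode; return the lines printed."""
    stack = []
    output = []
    pc = 0  # program counter: index of the next instruction
    while True:
        op = program[pc]
        pc += 1
        if op == PUSH:
            stack.append(program[pc])  # the operand follows the opcode
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == PRINT:
            output.append(str(stack.pop()))
        elif op == HALT:
            return output
        else:
            raise ValueError(f"unknown opcode {op} at {pc - 1}")

# The "game file": computes (2 + 3) * 4 and prints the result.
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, PRINT, HALT]
print("\n".join(run(program)))
```

The list of numbers plays the role of the Z-code file, and run() plays the role of ZIP. Port run() to a new machine and every existing program comes along for free.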
ZIL is not a mystery. We haven't had a lot of ZIL code available before this week. But someone scanned an Infocom ZIL manual years ago. You can read that manual here. (It's dated 1989, and primarily written by Steve Meretzky -- "SEM".)
But what kind of language is ZIL?
Above I said it was "derived from MDL". This is no surprise; the Infocom people wanted to reuse as much of MDL Zork as possible in their new commercial product. The resemblance is obvious, and indeed some of ZIL Zork was nearly identical to MDL Zork. Here's a function from the combat implementation in MDL Zork:
<DEFINE VILLAIN-STRENGTH (VILLAIN
                      "AUX"  (OD <OSTRENGTH .VILLAIN>) WV)
    #DECL ((VILLAIN) OBJECT (WV) <OR FALSE VECTOR>
           (OD VALUE) FIX)
    <COND (<G=? .OD 0>
           <COND (<AND <==? .VILLAIN <SFIND-OBJ "THIEF">>
                       ,THIEF-ENGROSSED!-FLAG>
                  <SET OD <MIN .OD 2>>
                  <SETG THIEF-ENGROSSED!-FLAG <>>)>
           <COND (<AND <NOT <EMPTY? <PRSI>>>
                       <TRNN <PRSI> ,WEAPONBIT>
                       <SET WV <MEMQ .VILLAIN ,BEST-WEAPONS>>
                       <==? <2 .WV> <PRSI>>>
                  <SET OD <MAX 1 <- .OD <3 .WV>>>>)>)>
    .OD>
And here's the same code from the ZIL version:
<ROUTINE VILLAIN-STRENGTH (OO
                       "AUX" (VILLAIN <GET .OO ,V-VILLAIN>)
                       OD TMP)
     <SET OD <GETP .VILLAIN ,P?STRENGTH>>
     <COND (<NOT <L? .OD 0>>
            <COND (<AND <EQUAL? .VILLAIN ,THIEF> ,THIEF-ENGROSSED>
                   <COND (<G? .OD 2> <SET OD 2>)>
                   <SETG THIEF-ENGROSSED <>>)>
            <COND (<AND ,PRSI
                        <FSET? ,PRSI ,WEAPONBIT>
                        <EQUAL? <GET .OO ,V-BEST> ,PRSI>>
                   <SET TMP <- .OD <GET .OO ,V-BEST-ADV>>>
                   <COND (<L? .TMP 1> <SET TMP 1>)>
                   <SET OD .TMP>)>)>
     .OD>
Pretty similar, right? But under the surface, rather different things are going on.
MDL is a functional language in the Lisp vein. Pretty much everything boils down to linked lists. Executing a program means freely constructing and throwing away lists, so there must be a garbage collector behind the scenes. The MDL compiler does what it can to eliminate memory allocation; efficient MDL code might be compiled to static machine code. But if the compiler can't do that, you wind up allocating stuff on the heap.
But the Z-machine doesn't have a garbage collector. It has no primitive operations for constructing lists or allocating objects. There are no Z-machine instructions for finding the head and tail of a list. (The famous car and cdr primitives of Lisp, called 1 and REST in MDL. The Z-machine don't have 'em.)
(You may know that I wrote a small Lisp interpreter for the Z-machine. It was fun, but the Z-machine gave me no help at all! I had to build the Lisp heap and list data structures myself, out of primitive Z-machine byte arrays. Same with the garbage collector. It's all terribly inefficient and janky.)
So how does ZIL perform these operations? Answer: it doesn't! As far as I can tell, the ZIL Zork code doesn't use Lisp-style (linked) lists at all. The 1, NTH, and REST functions appear only rarely, and I believe they always apply to static strings or arrays. MAPF, a basic Lisp tool which transforms one list into another, doesn't appear at all.
In contrast, the MDL source is filled with 1, NTH, REST, MAPF, and other functional-language constructs.
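Here's a rough Python sketch of the two styles, for contrast. The class and function names (Cons, map_list, double_in_place) are mine, invented for the example; this is not how either MDL or the Z-machine is implemented, just the difference in shape.

```python
# Lisp style: a list is a chain of freshly allocated two-slot cells.
class Cons:
    def __init__(self, head, tail):
        self.head = head  # Lisp "car"; MDL writes <1 list>
        self.tail = tail  # Lisp "cdr"; MDL writes <REST list>

def from_items(items):
    """Build a linked list, allocating one cell per item."""
    result = None  # None stands in for the empty list
    for item in reversed(items):
        result = Cons(item, result)
    return result

def map_list(fn, lst):
    """Return a NEW list; the old cells become garbage once unreferenced."""
    if lst is None:
        return None
    return Cons(fn(lst.head), map_list(fn, lst.tail))

def to_items(lst):
    """Walk the chain and collect the values, for easy printing."""
    items = []
    while lst is not None:
        items.append(lst.head)
        lst = lst.tail
    return items

# Fixed-structure style: one table, sized up front, updated in place.
def double_in_place(table):
    for i in range(len(table)):
        table[i] = table[i] * 2

doubled = map_list(lambda x: x * 2, from_items([1, 2, 3]))
table = [1, 2, 3]
double_in_place(table)
print(to_items(doubled), table)
```

Both halves compute the same thing. The first one allocates three new cells and leaves three old ones for a garbage collector; the second one allocates nothing. Only the second style maps directly onto what the Z-machine offers.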
So in one sense, ZIL is a completely different language from MDL. It's a C-like compiled language which operates entirely on fixed data structures. But in another sense, as you can see from the examples above, they're almost identical! How does this make sense?
My answer is that game logic is a fairly narrow sort of programming. The combat example sets some local variables and looks up some object properties (fixed data structures!). It compares numbers; it compares variables to objects. And there's a bunch of "if" statements with "and"s and "or"s. Familiar, right? There are dozens of ad-hoc game scripting languages with similar features. You don't use a language like that to solve graph-theory problems, much less build a compiler.
(The Ancient Terror puzzle in Enchanter does involve some graph theory, but this is implemented with some rather clunky nested loops. Not a Lisp-y solution at all.)
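Going back to the combat routine: to show how ordinary that logic is, here's a Python sketch of the ZIL version of VILLAIN-STRENGTH. The data model is an assumption on my part: objects are dicts, the villain table is a three-slot list, globals live in a "state" dict, and the sample numbers at the bottom are not taken from the game.

```python
# Hypothetical data model: objects are dicts with a "strength" property
# and a set of flags; the per-villain table is a fixed-length list.
V_VILLAIN, V_BEST, V_BEST_ADV = 0, 1, 2  # table slots (layout assumed)
WEAPONBIT = "weapon"

def villain_strength(oo, state):
    """Mirror of the ZIL routine: oo is a villain table, state holds globals."""
    villain = oo[V_VILLAIN]
    od = villain["strength"]
    if od >= 0:
        if villain is state["thief"] and state["thief_engrossed"]:
            if od > 2:
                od = 2
            state["thief_engrossed"] = False
        prsi = state["prsi"]  # the indirect object of the player's command
        if prsi is not None and WEAPONBIT in prsi["flags"] and oo[V_BEST] is prsi:
            # ZIL spells this out with TMP and a COND; max() is the same clamp.
            od = max(1, od - oo[V_BEST_ADV])
    return od

thief = {"strength": 5, "flags": set()}
knife = {"strength": 0, "flags": {WEAPONBIT}}
state = {"thief": thief, "thief_engrossed": True, "prsi": knife}
thief_table = [thief, knife, 1]  # sample numbers, not taken from the game
print(villain_strength(thief_table, state))
```

Nothing here allocates, recurses, or builds a list. It reads a few fixed slots, compares some values, and returns a number.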
It's not that functional methodology is impossible in ZIL. The language clearly imports as much of MDL's functional model as it can. It's just that this isn't very much. ZIL has about as much of MDL as can be compiled into static (non-allocating) code. Which makes sense -- that's exactly what the ZILCH compiler did.
To bring the story up to the present... In 1993, Graham Nelson released Inform. This language (and compiler) had exactly the same goal as ZIL: to efficiently compile source code into Z-machine files. Graham didn't have access to any information about ZIL at that point, so he just made up his own language, which was mostly like C because that's what he was familiar with.
(Of course C itself was designed to run efficiently on fixed data structures. So Graham had an easier job, in some sense. Not to take anything away from his accomplishment!)
Inform (up through version 6) became the cornerstone of the Z-machine ecosystem. But the Z-machine had some hard limits; it had been designed for 8-bit microcomputers, with 16-bit values throughout, and could not easily be expanded. So by 1997 there was a clear need for a new virtual machine. This is where I come into the picture. I designed Glulx as a replacement VM.
Amusingly (or ironically, or inevitably), Glulx had one core goal: to efficiently compile Inform 6 code to a new virtual machine. Every line of I6 code had to work essentially the same as it had before. Glulx used 32-bit values instead of 16-bit values, and I reorganized the memory layout and the instruction table. But the core data structures looked very much like those of the Z-machine.
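A small illustration of what the wider values buy you. Z-machine arithmetic works on 16-bit signed numbers, so anything past 32767 wraps around. The helper name wrap_signed is mine, not part of either VM; it just simulates a fixed-width register in Python.

```python
def wrap_signed(value, bits):
    """Reduce an integer to a two's-complement signed value of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value

score = 32767                      # largest signed 16-bit value
print(wrap_signed(score + 1, 16))  # wraps around to a negative number
print(wrap_signed(score + 1, 32))  # fits comfortably in 32 bits
```

The same squeeze applies to addresses, which is where the real pain was: a 16-bit value can only point at so much memory.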
As a result, it should be possible to compile ZIL to Glulx! I don't think anybody's done it. But Glulx was shaped by I6, I6 was shaped by the Z-machine, and the Z-machine and ZIL were shaped by each other. The chain of influence extends all the way from Joel Berez's coffee table to mine.
Inform 6 was followed by Inform 7, a completely new language which compiles to Inform 6 source code. At least for now. Future versions may compile to a new abstraction. But that's a story for another time.
Footnote: my conclusions about ZIL have to be qualified: I could be wrong. I don't know ZIL or MDL, really. I have an MDL programming guide open as I write this...

EDIT-ADD:
Given the MDL and ZIL code examples above, I thought it was worth adding the Inform 6 equivalent. This is from Allen Garvin's hand-polished I6 port of the ZIL code.
[ VillainStrength oo villain od tmp ;
    villain = oo-->0;
    od = villain.strength;
    if (od >= 0) {
        if (villain == thief && Thief_engrossed) {
            if (od > 2) {
                od = 2;
            }
            Thief_engrossed = false;
        }
        if (second && second has weapon && oo-->1 == second) {
            tmp = od - oo-->2;
            if (tmp < 1) {
                tmp = 1;
            }
            od = tmp;
        }
    }
    return od;
];
If you check, this is line-by-line equivalent to the ZIL version above. But it's much more readable to modern eyes, isn't it?
To some degree, this is just because I6 is part of the C-derived family of languages. (I'm told that "Algol family" is a better term, but I've never touched Algol.) This includes C, C++, C#, JavaScript, Perl, Swift... very different languages, but they all share the basic assumption that if (X) print(Y) is a sensible way to write code. A programmer these days might never have used any other kind of language.
I6 follows C very closely, in this example. The only quirks are:
  • the --> operator (for arrays, where C would have oo[0], oo[1], oo[2]);
  • the has operator (for testing object flags; in ZIL this was FSET?);
  • defining functions in square brackets, which is idiosyncratic.
Everything else is expressed in ways that look completely natural. Mind you, the lower-case text helps a lot. But we're very used to code where identifiers are text, operators are (mostly) punctuation, and one line of code is conventionally one abstract step of procedure.
In 1979, for people brought up on Lisp, MDL probably wasn't as strange. No doubt they would say that it's much simpler to put the operator at the beginning of every call (<SET OD .TMP> rather than od = tmp;). Test operators always end with ?, whereas in the I6 code you have to look for an if to figure out whether you're performing a test. The difference between period and comma as atom prefixes is no doubt a concise way to express something (though I haven't bothered to look up what). And so on.
I still think the lower-case code is easier on the eyes, but hey.

Tuesday, April 16, 2019

All of Infocom's game source code

I can just leave that statement there, right? I don't have to elaborate? Jason describes it better than I could, anyhow:
So, Infocom source code is now uploaded to Github. Most people don't speak or want to speak the language it's written in, ZIL (Zork Implementation Language). You can browse through it and kind of suss out what's being done when and the choices made over the course of time.
In cases where the source code had multiple revisions, and I don't know the story of what revisions came when and came why, I did a reasonable job of layering them out (this came before that, that came after that) and doing multiple "check-ins" of the code so you can see diffs.
Often, there are cases that some games were built up from a previous game, allowing modification of the macros and structures and then making them work in the new game. For example, an NPC partygoer in one game was a thief in a previous one. Dungeons become stores, etc.
--@textfiles (Jason Scott), from a twitter thread
This material has been kicking around for a while now. If you search for articles about "the Infocom drive", you'll see some discussion from years past. Actually, don't do that, it's mostly old arguments that don't need to be rehashed.
The point is that a great deal of historical information about Infocom has been preserved -- but it's not publicly archived. You can't go research it anywhere. Nobody admits to having it, because it's "proprietary IP", and you're not supposed to trade in that stuff because companies like Activision make the rules.
So when Jason puts this information online, he's taking a stance. The stance is: history matters. Copyright is a balance between the rights of the owner to profit and the rights of the public to investigate, discuss, and increase the sphere of culture. Sometimes the balance needs a kick.
Quite possibly all these repositories will be served with takedown requests tomorrow. I'm downloading local copies for myself tonight, just in case.
One other note: Jason's comments say "...there is currently no known way to compile the source code in this repository into a final "Z-machine Interpreter Program" (ZIP) file." This is somewhat out of date. There are long-standing open-source efforts to build a ZIL compiler. ZILF is the most advanced one that I know of. I don't know whether it's rated to compile this historical ZIL code -- but I'm sure that people are already giving it a shot.