We do but teach bloody instructions
Which, being taught, return to plague th' inventor
Shakespeare, Macbeth
A single Z-machine instruction consists of the following sections (and in the order shown):
Opcode               1 or 2 bytes
(Types of operands)  1 or 2 bytes: 4 or 8 2-bit fields
Operands             Between 0 and 8 of these: each 1 or 2 bytes
(Store variable)     1 byte
(Branch offset)      1 or 2 bytes
(Text to print)      An encoded string (of unlimited length)

Bracketed sections are not present in all opcodes. (A few opcodes take both "store" and "branch".)
There are four 'types' of operand. These are often specified by a number stored in 2 binary digits:
$$00  Large constant (0 to 65535)  2 bytes
$$01  Small constant (0 to 255)    1 byte
$$10  Variable                     1 byte
$$11  Omitted altogether           0 bytes
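As an illustration of the four type codes, the following Python sketch reads the 2-bit fields out of one type byte and totals the operand bytes that would follow. The names `decode_type_byte` and `operand_bytes` are hypothetical, and the sketch assumes that the first operand occupies the top two bits and that the first "omitted" field ends the operand list.

```python
# Operand type codes from the table above, with their encoded sizes in bytes.
LARGE_CONSTANT, SMALL_CONSTANT, VARIABLE, OMITTED = 0b00, 0b01, 0b10, 0b11
OPERAND_SIZE = {LARGE_CONSTANT: 2, SMALL_CONSTANT: 1, VARIABLE: 1, OMITTED: 0}


def decode_type_byte(type_byte):
    """Return the operand type codes held in one type byte, first operand first.

    Assumption: the first operand sits in the top two bits, and the first
    'omitted' field terminates the operand list.
    """
    types = []
    for shift in (6, 4, 2, 0):
        code = (type_byte >> shift) & 0b11
        if code == OMITTED:
            break
        types.append(code)
    return types


def operand_bytes(types):
    """Total number of operand bytes that follow for these operand types."""
    return sum(OPERAND_SIZE[t] for t in types)
```

For instance, a type byte of $2f splits into the fields $$00 $$10 $$11 $$11, so it describes a large constant followed by a variable, three operand bytes in all.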
4.2.1
Large constants, like all 2-byte words of data in the Z-machine, are stored with most significant byte first (e.g. $2478 is stored as $24 followed by $78). A 'large constant' may in fact be a small number.
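The byte order can be sketched in Python as follows. The helper names `read_word` and `write_word` are hypothetical, and `memory` is assumed to be any indexable sequence of byte values (for writing, a mutable one such as a `bytearray`).

```python
def read_word(memory, address):
    """Read a 2-byte word stored most significant byte first."""
    return (memory[address] << 8) | memory[address + 1]


def write_word(memory, address, value):
    """Store a 16-bit value most significant byte first."""
    memory[address] = (value >> 8) & 0xFF
    memory[address + 1] = value & 0xFF
```

Reading the bytes $24 $78 this way gives $2478, and a large constant holding a small number such as 5 is simply stored as $00 $05.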
4.2.2
Variable number $00 refers to the top of the stack, $01 to $0f mean the local variables of the current routine and $10 to $ff mean the global variables. It is illegal to refer to local variables which do not exist for the current routine (there may even be none).
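A minimal Python sketch of this numbering follows. The function name `classify_variable` and the parameter `local_count` (the number of locals the current routine has) are hypothetical, and returning zero-based indices is a choice made here for illustration, not something the text prescribes.

```python
def classify_variable(number, local_count):
    """Map a variable number to ("stack" | "local" | "global", index).

    Indices are zero-based. Referring to a local the routine does not have
    raises ValueError, since the text calls that illegal.
    """
    if not 0x00 <= number <= 0xFF:
        raise ValueError("variable number must fit in one byte")
    if number == 0x00:
        return ("stack", None)
    if number <= 0x0F:
        if number > local_count:
            raise ValueError("local variable %d does not exist" % number)
        return ("local", number - 0x01)
    return ("global", number - 0x10)
```

An interpreter would typically make this check when fetching an operand of type 'Variable', with `local_count` taken from the current routine's call frame.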
4.2.3
The type 'Variable' really means "variable by value". Some instructions take as an operand a "variable by reference": for instance, inc has one operand, the reference number of a variable to increment. This operand usually has type 'Small constant' (and Inform automatically assembles a line like @inc turns by writing the operand turns as a small constant with value the reference number of the variable turns).
In long form the operand count is always 2OP. The opcode number is given in the bottom 5 bits.
In extended form, the operand count is VAR. The opcode number is given in a second opcode byte.
Next, the types of the operands are specified.
In short form, bits 4 and 5 of the opcode give the type.
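The form, operand count and opcode number can be sketched in Python as a function of the first opcode byte. This is an illustration under stated assumptions, not the definitive decoder: `classify_opcode` and `long_form_operand_types` are hypothetical names, and the bit layouts not quoted in this passage (the top two bits selecting the form, bit 5 selecting 2OP or VAR in variable form, the bottom 4 bits holding the short-form opcode number, and $be introducing extended form) are taken from the standard's opcode-byte ranges.

```python
def classify_opcode(opcode):
    """Return (form, operand_count, opcode_number) for a first opcode byte."""
    if opcode == 0xBE:
        # Extended form: the opcode number is in the next byte, so it is
        # not known yet. (Assumption: the story file is a version that
        # supports extended opcodes.)
        return ("extended", "VAR", None)
    top = opcode >> 6
    if top == 0b11:
        count = "VAR" if opcode & 0x20 else "2OP"
        return ("variable", count, opcode & 0x1F)
    if top == 0b10:
        # Bits 4 and 5 give the operand type; 'omitted' means no operand.
        operand_type = (opcode >> 4) & 0b11
        count = "0OP" if operand_type == 0b11 else "1OP"
        return ("short", count, opcode & 0x0F)
    return ("long", "2OP", opcode & 0x1F)


def long_form_operand_types(opcode):
    """In long form, bits 6 and 5 pick small constant (0) or variable (1)."""
    first = "variable" if opcode & 0x40 else "small constant"
    second = "variable" if opcode & 0x20 else "small constant"
    return (first, second)
```

For example, the byte $05 classifies as long form, 2OP, opcode number 5, with both operands small constants, while $8f classifies as short form, 1OP, opcode number 15.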
Otherwise, a branch moves execution to the instruction at address
Address after branch data + Offset - 2.
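The calculation can be sketched in Python as follows. The names `decode_branch` and `branch_target` are hypothetical. The layout of the branch data is an assumption drawn from the standard rather than from this passage: bit 7 of the first byte set means "branch on true", bit 6 set means a 1-byte form with an unsigned 6-bit offset, and otherwise the offset is a signed 14-bit number spread over two bytes. Offsets 0 and 1 are treated as "return false" and "return true".

```python
def decode_branch(memory, address):
    """Decode branch data at `address`.

    Returns (branch_on_true, offset, length_in_bytes).
    """
    first = memory[address]
    branch_on_true = bool(first & 0x80)
    if first & 0x40:
        # 1-byte form: unsigned offset in the bottom 6 bits.
        return branch_on_true, first & 0x3F, 1
    # 2-byte form: signed 14-bit offset.
    offset = ((first & 0x3F) << 8) | memory[address + 1]
    if offset & 0x2000:
        offset -= 0x4000
    return branch_on_true, offset, 2


def branch_target(memory, address):
    """Return ("return", 0 or 1) or ("jump", destination address)."""
    _, offset, length = decode_branch(memory, address)
    if offset in (0, 1):
        return ("return", offset)
    address_after_branch_data = address + length
    return ("jump", address_after_branch_data + offset - 2)
```

With the single branch byte $d4 at address 3, the branch is on true with offset 20, so the destination is 4 + 20 - 2 = 22, which is 18 bytes past the end of the instruction.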
The Inform assembler can assemble branches in either form, though the programmer should always use long form unless there's a good reason. Inform automatically optimises branch statements so as to force as many of them as possible into short form. (This optimisation will happen to branches written by hand in assembler as well as to branches compiled by Inform.)
The disassembler Txd numbers locals from 0 to 14 and globals from 0 to 239 in its output (corresponding to variable numbers 1 to 15, and 16 to 255, respectively).
The branch formula is sensible because in the natural implementation, the program counter is at the address after the branch data when the branch takes place: thus it can be regarded as
PC = PC + Offset - 2.
If the rule were simply "add the offset" then, since the offset couldn't be 0 or 1 (because of the return-false and return-true values), we would never be able to skip past a 1-byte instruction (say, a 0OP like quit), or specify the branch "don't branch at all" (sometimes useful to ignore the result of the test altogether). Subtracting 2 means that the only effects we can't achieve are
PC = PC - 1 and PC = PC - 2
and we would never want these anyway, since they would put the program counter somewhere back inside the same instruction, with horrid consequences.
$00 -- $1f  long      2OP  small constant, small constant
$20 -- $3f  long      2OP  small constant, variable
$40 -- $5f  long      2OP  variable, small constant
$60 -- $7f  long      2OP  variable, variable
$80 -- $8f  short     1OP  large constant
$90 -- $9f  short     1OP  small constant
$a0 -- $af  short     1OP  variable
$b0 -- $bf  short     0OP
except $be  extended opcode given in next byte
$c0 -- $df  variable  2OP  (operand types in next byte)
$e0 -- $ff  variable  VAR  (operand types in next byte(s))

Here is an example disassembly:
@inc_chk c 0 label;    05 02 00 d4
                       long form; count 2OP; opcode number 5; operands:
                           02     small constant (referring to variable c)
                           00     small constant 0
                       branch if true: 1-byte offset, 20 (since label is
                       18 bytes forward from here).
@print "Hello.^";      b2 11 aa 46 34 16 45 9c a5
                       short form; count 0OP.
                       literal string, Z-chars: 4 13 10 17 17 20 5 18 5 7 5 5.
@mul 1000 c -> sp;     d6 2f 03 e8 02 00
                       variable form; count 2OP; opcode number 22; operands:
                           03 e8  long constant (1000 decimal)
                           02     variable c
                       store result to stack pointer (var number 00).
@call_1n Message;      8f 01 56
                       short form; count 1OP; opcode number 15; operand:
                           01 56  long constant (packed address of routine)
.label;
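The Z-chars listed for the @print example can be checked with a short Python sketch. The function name `unpack_zchars` is hypothetical, and the packing rule is an assumption taken from the standard's text-encoding rules rather than from this section: each 2-byte word holds three 5-bit Z-characters, and the top bit is set only on the last word of the string.

```python
def unpack_zchars(data):
    """Split encoded text into 5-bit Z-characters.

    Assumption: each big-endian word carries three Z-characters in bits
    14-10, 9-5 and 4-0, and bit 15 marks the final word.
    """
    zchars = []
    for i in range(0, len(data) - 1, 2):
        word = (data[i] << 8) | data[i + 1]
        zchars.append((word >> 10) & 0x1F)
        zchars.append((word >> 5) & 0x1F)
        zchars.append(word & 0x1F)
        if word & 0x8000:
            break
    return zchars
```

Applied to the eight bytes following the opcode $b2 in the example, this yields the twelve Z-chars shown in the disassembly.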