home *** CD-ROM | disk | FTP | other *** search
- .SH
- Code Generation
- .PP
- The grand plan for code-generation is
- independent of any particular machine;
- it depends largely on a set of tables.
- But this fact does not necessarily make it very easy
- to modify the compiler to produce code for other machines,
- both because there is a good deal of machine-dependent structure
- in the tables, and because in any event such tables are non-trivial to
- prepare.
- .PP
- The arguments to the basic code generation routine
- .II rcexpr
- are a pointer to a tree representing an expression,
- the name of a code-generation table,
- and the number of a register in which the value of the
- expression should be placed.
- .II Rcexpr
- returns the number of the register in which the value actually
- ended up;
- its caller
- may need to produce a
- .II mov
- instruction if the value really needs to be in the given register.
- There are four code generation tables.
- .PP
- .II Regtab
- is the basic one, which actually does the job described
- above: namely,
- compile code which places the value represented by the expression
- tree in a register.
- .PP
- .II Cctab
- is used when the value of the expression is not actually needed,
- but instead the value of the condition codes resulting from
- evaluation of the expression.
- This table is used, for example, to evaluate the expression after
- .II if.
- It is clearly silly to
- calculate the value (0 or 1) of the expression
- `a==b' in the context `if (a==b) ... '
- .PP
- The
- .II sptab
- table is used when the value of an expression is to be pushed on the stack,
- for example when it is an actual argument.
- For example in the function call `f(a)' it is a bad idea to
- load
- .II a
- into a register which is then pushed on the stack,
- when there is a single instruction which does the job.
- .PP
- The
- .II efftab
- table is used when an expression is to be evaluated for its side effects,
- not its value.
- This occurs mostly for expressions which are statements, which have no
- value.
- Thus the code for the statement
- `a = b'
- need produce only the approoriate
- .II mov
- instruction, and need not leave the value of
- .II b
- in a register,
- while in the expression `a + (b = c)'
- the value of `b = c' will appear in a register.
- .PP
- All of the tables besides
- .II regtab
- are rather small, and handle only a relatively few special cases.
- If one of these subsidiary tables does not contain
- an entry applicable to the given expression tree,
- .II rcexpr
- uses
- .II regtab
- to put the value of the expression into a register
- and then fixes things up;
- nothing need be done when the table
- was
- .II efftab,
- but a
- .II tst
- instruction is produced when the table called for was
- .II cctab,
- and a
- .II mov
- instruction,
- pushing the register on the stack,
- when the table was
- .II sptab.
- .PP
- The
- .II rcexpr
- routine itself picks off some special
- cases, then calls
- .II cexpr
- to do the real work.
- .II Cexpr
- tries to find an entry applicable
- to the given tree in the given table, and returns \-1 if
- no such entry is found, letting
- .II rcexpr
- try again with a different table.
- A successful match yields a string
- containing both literal characters
- which are written out and pseudo-operations, or macros, which are expanded.
- Before studying the contents
- of these strings we will consider how table entries are matched
- against trees.
- .PP
- Recall that most non-leaf nodes in an expression tree
- contain the name of the operator,
- the type of the value represented, and pointers to the subtrees (operands).
- They also contain an estimate of the number of registers required to evaluate
- the expression, placed there by the expression-optimizer routines.
- The register counts are used to guide the code generation process,
- which is based on the Sethi-Ullman algorithm.
- .PP
- The main code generation
- tables consist of entries
- each containing an operator number and a pointer
- to a subtable for the corresponding operator.
- A subtable consists of a sequence
- of entries, each with a key describing certain properties of the
- operands of the operator involved; associated with the key is a code string.
- Once the subtable corresponding to the operator is found, the subtable
- is searched linearly until a key is found such that the properties demanded
- by the key are compatible with the operands of the tree node.
- A successful match returns the code string;
- an unsuccessful search, either for the operator in the main table
- or a compatble key in the subtable,
- returns a failure indication.
- .PP
- The tables are all contained in a file
- which must be processed to obtain an assembly language program.
- Thus they are written in a special-purpose language.
- To provided definiteness to the following discussion, here is an
- example of a subtable entry.
- .DS
- %n,aw
- F
- add A2,R
- .DE
- The `%' indicates the key;
- the information following (up to a blank line) specifies the code string.
- Very briefly, this entry is in the subtable
- for `+' of
- .II regtab;
- the key specifies that the left operand is any integer, character, or pointer
- expression,
- and the right operand is any word quantity which is directly addressible
- (e.g. a variable or constant).
- The code string calls for the generation of the code
- to compile the left (first) operand into the
- current register (`F')
- and then to produce an `add' instruction which adds the
- second operand (`A2') to the register (`R').
- All of the notation will be explained below.
- .PP
- Only three features of the operands are used in deciding
- whether a match has occurred.
- They are:
- .IP 1.
- Is the type of the operand compatible with that demanded?
- .RT
- .IP 2.
- Is the `degree of difficulty' (in a sense described below) compatible?
- .RT
- .IP 3.
- The table may demand that the operand have a `*'
- (indirection operator) as its highest operator.
- .PP
- As suggested above, the key for a subtable entry
- is indicated by a `%,' and a comma-separated pair
- of specifications for the operands.
- (The second specification is ignored for unary operators).
- A specification indicates
- a type requirement by including one of the following letters.
- If no type letter is present, any integer, character,
- or pointer operand will satisfy the requirement (not float, double, or long).
- .IP b
- A byte (character) operand is required.
- .RT
- .IP w
- A word (integer or pointer) operand is required.
- .RT
- .IP f
- A float or double operand is required.
- .RT
- .IP d
- A double operand is required.
- .RT
- .IP l
- A long (32-bit integer) operand is required.
- .PP
- Before discussing the `degree of difficulty' specification,
- the algorithm has to be explained more completely.
- .II Rcexpr
- (and
- .II cexpr)
- are called with a register number in which to place their result.
- Registers 0, 1, ... are used during evaluation of expressions;
- the maximum register which can be used in this way depends on the
- number of register variables, but in any event only registers
- 0 through 4 are available since r5 is used as a stack frame
- header and r6 (sp) and r7 (pc) have special
- hardware properties.
- The code generation routines assume that when called with register
- .II n
- as argument, they may use
- .II n+1,
- \&...
- (up to the first register variable)
- as temporaries.
- Consider the expression `X+Y', where both
- X and Y are expressions.
- As a first approximation, there are three ways of compiling
- code to put this expression in register
- .II n.
- .IP 1.
- If Y is an addressible cell,
- (recursively) put X into register
- .II n
- and add Y to it.
- .RT
- .IP 2.
- If Y is an expression that can be calculated in
- .II k
- registers, where
- .II k
- smaller than the number of registers available,
- compile X into register
- .II n,
- Y into register
- .II n+1,
- and add register
- .II n+1
- to
- .II n.
- .RT
- .IP 3.
- Otherwise, compile Y into register
- .II n,
- save the result in a temporary (actually, on the stack)
- compile X into register
- .II n,
- then add in the temporary.
- .PP
- The distinction between cases 2 and 3 therefore depends
- on whether the right operand can be compiled in fewer than
- .II k
- registers, where
- .II k
- is the number of free registers left after registers 0 through
- .II n
- are taken:
- 0 through
- .II n\-1
- are presumed to contain already computed temporary results;
- .II n
- will, in case 2,
- contain the value of the left operand while the right
- is being evaluated.
- .PP
- These considerations should make clear
- the specification codes for the degree of difficulty,
- bearing in mind that a number of special cases are also present:
- .IP z
- is satisfied when the operand is zero, so that special code
- can be produced for expressions like `x = 0'.
- .RT
- .IP 1
- is satisfied when the operand is the constant 1, to optimize
- cases like left and right shift by 1, which can be done
- efficiently on the PDP-11.
- .RT
- .IP c
- is satisfied when the operand is a positive (16-bit)
- constant; this takes care of some special cases in long arithmetic.
- .RT
- .IP a
- is satisfied when the operand is addressible;
- this occurs not only for variables and constants, but also for
- some more complicated constructions, such as indirection through
- a simple variable, `*p++' where
- .II p
- is a register variable (because of the PDP-11's auto-increment address
- mode), and `*(p+c)' where
- .II p
- is a register and
- .II c
- is a constant.
- Precisely, the requirement is that the operand refers to a cell
- whose address can be written as a source or destination of a PDP-11
- instruction.
- .RT
- .IP e
- is satisfied by an operand whose value can be generated in a register
- using no more than
- .II k
- registers, where
- .II k
- is the number of registers left (not counting the current register).
- The `e' stands for `easy.'
- .RT
- .IP n
- is satisfied by any operand.
- The `n' stands for `anything.'
- .PP
- These degrees of difficulty are considered to lie in a linear ordering
- and any operand which satisfies an earlier-mentioned requirement
- will satisfy a later one.
- Since the subtables are searched linearly,
- if a `1' specification is included, almost certainly
- a `z' must be written first to prevent
- expressions containing the constant 0 to be compiled
- as if the 0 were 1.
- .PP
- Finally,
- a key specification may contain a `*' which
- requires the operand to have an indirection as its leading operator.
- Examples below should clarify the utility of this specification.
- .PP
- Now let us consider the contents of the code string
- associated with each subtable entry.
- Conventionally, lower-case letters in this string
- represent literal information which is copied directly
- to the output.
- Upper-case letters generally introduce specific
- macro-operations, some of which may be followed
- by modifying information.
- The code strings in the tables are written with tabs and
- new-lines used freely to suggest instructions which will be generated;
- the table-compiling program compresses tabs (using the 0200 bit of the
- next character) and throws away some of the new-lines.
- For example the macro `F' is ordinarily written on a line by itself;
- but since its expansion will end with a new-line, the new-line
- after `F' itself is dispensable.
- This is all to reduce the size of the stored tables.
- .PP
- The first set of macro-operations is concerned with
- compiling subtrees.
- Recall that this is done by the
- .II cexpr
- routine.
- In the following discussion the `current register'
- is generally the argument register to
- .II cexpr;
- that is, the place where the result is desired.
- The `next register' is numbered one
- higher
- than the current register.
- (This explanation isn't fully true
- because of complications, described below, involving
- operations which require even-odd register pairs.)
- .IP F
- causes a recursive call to
- the
- .II rcexpr
- routine to compile code which places the value of the first (left)
- operand of the operator in the current register.
- .RT
- .IP F1
- generates code which places the value of the first operand in the
- next register.
- It is incorrectly used if there might be no next register;
- that is, if the degree of difficulty of the first operand is not `easy;'
- if not, another register might not be available.
- .RT
- .IP FS
- generates code which pushes the value of the first operand on the stack,
- by calling
- .II rcexpr
- specifying
- .II sptab
- as the table.
- .LP
- Analogously,
- .IP "S, S1, SS"
- compile the second (right) operand
- into the current register, the next register, or onto the stack.
- .LP
- To deal with registers, there are
- .IP R
- which expands into the name of the current register.
- .RT
- .IP R1
- which expands into the name of the next register.
- .RT
- .IP R+
- which expands into the the name of the current register plus 1.
- It was suggested above that this is the same as the next register,
- except for complications; here is one of them.
- Long integer variables have
- 32 bits and require 2 registers; in such cases the next register
- is the current register plus 2.
- The code would like to talk about both halves of the
- long quantity, so R refers to the register with the high-order part
- and R+ to the low-order part.
- .RT
- .IP R\-
- This is another complication, involving division and mod.
- These operations involve a pair of registers of which the odd-numbered
- contains the left operand.
- .II Cexpr
- arranges that the current register is odd;
- the R\- notation allows the code to refer to the next lower,
- even-numbered register.
- .LP
- To refer to addressible quantities, there are the notations:
- .IP A1
- causes generation of the address specified by the first operand.
- For this to be legal, the operand must be addressible; its
- key must contain an `a'
- or a more restrictive specification.
- .RT
- .IP A2
- correspondingly generates the address of the second operand
- providing it has one.
- .PP
- We now have enough mechanism to show a complete, if suboptimal,
- table for the + operator on word or byte operands.
- .DS
- %n,z
- F
- .sp 1
- %n,1
- F
- inc R
- .sp 1
- %n,aw
- F
- add A2,R
- .sp 1
- %n,e
- F
- S1
- add R1,R
- .sp 1
- %n,n
- SS
- F
- add (sp)+,R
- .DE
- The first two sequences handle some special cases.
- Actually it turns out that handling a right operand of 0
- is unnecessary since the expression-optimizer
- throws out adds of 0.
- Adding 1 by using the `increment' instruction is done next,
- and then the case where the right operand is addressible.
- It must be a word quantity, since the PDP-11 lacks an `add byte' instruction.
- Finally the cases where the right operand either can, or cannot,
- be done in the available registers are treated.
- .PP
- The next macro-instructions are conveniently
- introduced by noticing that the above table is suitable
- for subtraction as well as addition, since no use is made of the
- commutativity of addition.
- All that is needed is substitution of `sub' for `add'
- and `dec' for 'inc.'
- Considerable saving of space is achieved by factoring out
- several similar operations.
- .IP I
- is replaced by a string from another table indexed by the operator
- in the node being expanded.
- This secondary table actually contains two strings per operator.
- .RT
- .IP I\(fm
- is replaced by the second string in the side table
- entry for the current operator.
- .PP
- Thus, given that the entries for `+' and `\-' in the side table
- (which is called
- .II instab)
- are `add' and `inc,' `sub' and `dec'
- respectively,
- the middle of of the above addition table can be written
- .DS
- %n,1
- F
- I' R
-
- %n,aw
- F
- I A2,R
- .DE
- and it will be suitable for subtraction,
- and several other operators, as well.
- .PP
- Next, there is the question of character and floating-point operations.
- .IP B1
- generates the letter `b' if the first operand is a character,
- `f' if it is float or double, and nothing otherwise.
- It is used in a context like `movB1'
- which generates a `mov', `movb', or `movf'
- instruction according to the type of the operand.
- .RT
- .IP B2
- is just like B1 but applies to the second operand.
- .RT
- .IP BE
- generates `b' if either operand is a character
- and null otherwise.
- .RT
- .IP BF
- generates `f' if the type of the operator node itself is float or double,
- otherwise null.
- .PP
- For example, there is an entry in
- .II efftab
- for the `=' operator
- .DS
- %a,aw
- %ab,a
- IBE A2,A1
- .DE
- Note first that two key specifications
- can be applied to the same code string.
- Next, observe that when a word is assigned to a byte or to a word,
- or a word is assigned to a byte,
- a single instruction,
- a
- .II mov
- or
- .II movb
- as appropriate, does the job.
- However, when a byte is assigned to a word,
- it must pass through a register to implement the sign-extension rules:
- .DS
- %a,n
- S
- IB1 R,A1
- .DE
- .PP
- Next, there is the question of handling indirection properly.
- Consider the expression `X + *Y', where X and Y are expressions,
- Assuming that Y is more complicated than just a variable,
- but on the other hand qualifies as `easy' in the context,
- the expression would be compiled by placing the value of X in a register,
- that of *Y in the next register, and adding the registers.
- It is easy to see that a better job can be done
- by compiling X, then Y (into the next register),
- and producing the
- instruction symbolized by `add (R1),R'.
- This scheme avoids generating
- the instruction `mov (R1),R1'
- required actually to place the value of *Y in a register.
- A related situation occurs
- with the expression `X + *(p+6)', which
- exemplifies a construction
- frequent in structure and array references.
- The addition table shown above would produce
- .DS
- [put X in register R]
- mov p,R1
- add $6,R1
- mov (R1),R1
- add R1,R
- .DE
- when the best code is
- .DS
- [put X in R]
- mov p,R1
- add 6(R1),R
- .DE
- As we said above, a key specification for a code table entry
- may require an operand to have an indirection as its highest operator.
- To make use of the requirement,
- the following macros are provided.
- .IP F*
- the first operand must have the form *X.
- If in particular it has the form *(Y + c), for some constant
- .II c,
- then code is produced which places the value of Y in
- the current register.
- Otherwise, code is produced which loads X into the current register.
- .RT
- .IP F1*
- resembles F* except that the next register is loaded.
- .RT
- .IP S*
- resembles F* except that the second operand is loaded.
- .RT
- .IP S1*
- resembles S* except that the next register is loaded.
- .RT
- .IP FS*
- The first operand must have the form `*X'.
- Push the value of X on the stack.
- .RT
- .IP SS*
- resembles FS* except that it applies to the second operand.
- .LP
- To capture the constant that may have been skipped over
- in the above macros, there are
- .IP #1
- The first operand must have the form *X;
- if in particular it has the form *(Y + c) for
- .II c
- a constant, then the constant is written out,
- otherwise a null string.
- .RT
- .IP #2
- is the same as #1 except that the second operand is used.
- .LP
- Now we can improve the addition table above.
- Just before the `%n,e' entry, put
- .DS
- %n,ew*
- F
- S1*
- add #2(R1),R
- .DE
- and just before the `%n,n' put
- .DS
- %n,nw*
- SS*
- F
- add *(sp)+,R
- .DE
- When using the stacking macros there is no place to use
- the constant
- as an index word, so that particular special case doesn't occur.
- .PP
- The constant mentioned above can actually be more
- general than a number.
- Any quantity acceptable to the assembler as an expression will do,
- in particular the address of a static cell, perhaps with a numeric offset.
- If
- .II x
- is an external character array,
- the expression `x[i+5] = 0' will generate
- the code
- .DS
- mov i,r0
- clrb x+5(r0)
- .DE
- via the table entry (in the `=' part of
- .II efftab)
- .DS
- %e*,z
- F
- I'B1 #1(R)
- .DE
- Some machine operations place restrictions on the registers
- used.
- The divide instruction, used to implement the divide and mod
- operations, requires the dividend to be placed in the odd member
- of an even-odd pair;
- other peculiarities
- of multiplication make it simplest to put the multiplicand
- in an odd-numbered register.
- There is no theory which optimally accounts for
- this kind of requirement.
- .II Cexpr
- handles it by checking for a multiply, divide, or mod operation;
- in these cases, its argument register number is incremented by
- one or two so that it is odd, and if the operation was divide or mod,
- so that it is a member of a free even-odd pair.
- The routine which determines the number of registers required
- estimates, conservatively, that
- at least two registers are required for a multiplication
- and three for the other peculiar operators.
- After the expression is compiled,
- the register where the result actually ended up is returned.
- (Divide and mod are actually the same operation except for the
- location of the result).
- .PP
- These operations are the ones which cause results to end up in
- unexpected places,
- and this possibility adds a further level of complexity.
- The simplest way of handling the problem is always to move the
- result to the place where the caller expected it,
- but this will produce unnecessary register moves in many
- simple cases; `a = b*c' would generate
- .DS
- mov b,r1
- mul c,r1
- mov r1,r0
- mov r0,a
- .DE
- The next thought is used the passed-back
- information as to where the result landed to change the notion of the current
- register.
- While compiling the `=' operation above, which comes from a
- table
- entry
- like
- .DS
- %a,e
- S
- mov R,A1
- .DE
- it is sufficient to redefine the meaning of `R'
- after processing the `S' which does the multiply.
- This technique is in fact used; the tables are written in such a way
- that correct code is produced.
- The trouble is that the technique cannot be used in general,
- because it invalidates the count of the number of registers
- required for an expression.
- Consider just `a*b + X' where X is some expression.
- The algorithm assumes that the value of a*b,
- once computed, requires just one register.
- If there are three registers available, and X requires two registers to
- compute, then this expression will match a key specifying
- `%n,e'.
- If a*b is computed and left in register 1, then there are, contrary
- to expectations, no longer two registers available to compute X,
- but only one, and bad code will be produced.
- To guard against this possibility,
- .II cexpr
- checks the result returned by recursive calls which implement
- F, S and their relatives.
- If the result is not in the expected register, then the number of
- registers required by the other operand is checked;
- if it can be done using those registers which remain even
- after making unavailable the unexpectedly-occupied
- register, then
- the notions of the `next register' and possibly the `current
- register' are redefined.
- Otherwise a register-copy instruction is produced.
- A register-copy is also always produced
- when the current operator is one of those which have odd-even requirements.
- .PP
- Finally, there are a few loose-end macro operations
- and facts about the tables.
- The operators:
- .IP V
- is used for long operations.
- It is written with an address like a machine instruction;
- it expands into `adc' (add carry) if the operation
- is an additive operator,
- `sbc' (subtract carry) if the operation is a subtractive
- operator, and disappears, along with the rest of the line, otherwise.
- Its purpose is to allow common treatment of logical
- operations, which have no carries, and additive and subtractive
- operations, which generate carries.
- .RT
- .IP T
- generates a `tst' instruction if the first operand
- of the tree does not set the condition codes correctly.
- It is used with divide and mod operations,
- which require a sign-extended 32-bit operand.
- The code table for the operations contains an `sxt'
- (sign-extend) instruction to generate the high-order part of the
- dividend.
- .RT
- .IP H
- is analogous to the `F' and `S' macros,
- except that it calls for the generation of code for
- the current tree
- (not one of its operands)
- using
- .II regtab.
- It is used in
- .II cctab
- for all the operators which, when executed normally,
- set the condition codes properly according to the result.
- It prevents a `tst' instruction from being generated for
- constructions like `if (a+b) ...'
- since after calculation of the value of
- `a+b' a conditional branch can be written immediately.
- .PP
- All of the discussion above is in terms of operators with operands.
- Leaves of the expression tree (variables and constants), however,
- are peculiar in that they have no operands.
- In order to regularize the matching process,
- .II cexpr
- examines its operand to determine if it is a leaf;
- if so, it creates a special `load' operator whose operand
- is the leaf, and substitutes it for the argument tree;
- this allows the table entry for the created operator
- to use the `A1' notation to load the leaf into a register.
- .PP
- Purely to save space in the tables,
- pieces of subtables can be labelled and referred to later.
- It turns out, for example,
- that rather large portions of the
- the
- .II efftab
- table for the `=' and `=+' operators are identical.
- Thus `=' has an entry
- .DS
- %[move3:]
- %a,aw
- %ab,a
- IBE A2,A1
- .DE
- while part of the `=+' table is
- .DS
- %aw,aw
- % [move3]
- .DE
- Labels are written as `%[ ... : ]',
- before the key specifications;
- references
- are written
- with `% [ ... ]'
- after the key.
- Peculiarities in the implementation
- make it necessary that labels appear before references to them.
- .PP
- The example illustrates the utility
- of allowing separate keys
- to point to the same code string.
- The assignment code
- works properly if either the right operand is a word, or the left operand
- is a byte;
- but since there is no `add byte' instruction the addition code
- has to be restricted to word operands.
-