ARITH.DOC: About arithmetic in the CSTAR translator
June 25, 1991
NOTE: stuff in brackets [ ] pertains to things that need to be done but
aren't implemented yet.
Arithmetic involving only integers is strictly typed without automatic
conversions. Mixed-mode expressions must be written with casts; this
is good practice anyhow, and since C types are slightly less restrictive
than Pascal types, it is eminently feasible in all cases. Note also that
longer results will *not* be silently truncated and blithely stuffed into
shorter variables.
If you are writing too many casts, it may pay to use longer forms for
some of your variables. The 68000 moves words as easily as bytes,
and the penalties for using long words are fairly modest except
in multiplication or division.
For a mixed-mode expression to need an intermediate result wider than its
final result, it must contain a division or right shift of a quantity
which is wider than the result, since those are the only
operations in which bits can affect other bits to their right. (In
such cases, CSTAR may generate no code for the cast if it is inherent
in the operator; the multiplication of two ints to get a long has to
be written:
long_var = (long)int_1 * (long)int_2;
but it would be silly to extend the integers and call the long
multiply functions.)
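As a rough illustration in portable C (this is not CSTAR output), the
sketch below models the 68000's 16-bit int and 32-bit long with
fixed-width types. The function names are invented for this example. It
shows why the casts belong on the operands rather than on the product.

```c
#include <stdint.h>

/* Model of "long_var = (long)int_1 * (long)int_2;":
   both operands are widened first, so the full product survives. */
int32_t widen_then_multiply(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Model of "(long)(int_1 * int_2)" under strict typing: the product is
   formed at int width (16 bits here) and only then extended. */
int32_t multiply_then_widen(int16_t a, int16_t b)
{
    uint32_t full = (uint32_t)(int32_t)a * (uint32_t)(int32_t)b;
    uint16_t low = (uint16_t)full;          /* keep the low 16 bits */
    /* Sign-extend the 16-bit result by hand to stay portable. */
    return low >= 0x8000u ? (int32_t)low - 65536 : (int32_t)low;
}
```

With both arguments equal to 300, the first function returns 90000 and
the second returns 24464, the low 16 bits of the same product.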
Characters are treated as 8-bit integers and are never automatically
extended except for use as function arguments. A mixed char/int
expression should be written exactly like its int/long analog in
terms of form and of placement of casts. In the absence of casts,
all char results are 8 bits wide. Thus:
(int)(char_1 * char_2)
means multiply char_1 by char_2 and extend the 8-BIT result to 16 bits.
This treatment minimizes the generation of silly code in trivial
comparisons and range checks on characters, which, in most programs,
is the only use of character arithmetic.
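The difference between the two placements of the cast can be sketched in
portable C as follows. Unsigned fixed-width types stand in for char and
int so that the truncation is well defined; that choice, and the function
names, are assumptions of this example rather than anything CSTAR
prescribes.

```c
#include <stdint.h>

/* Model of "(int)(char_1 * char_2)": the product is 8 bits wide
   and is extended to 16 bits only afterwards. */
uint16_t multiply_then_extend(uint8_t a, uint8_t b)
{
    uint8_t product8 = (uint8_t)(a * b);    /* truncated to 8 bits */
    return (uint16_t)product8;
}

/* Model of "(int)char_1 * (int)char_2": operands are extended first. */
uint16_t extend_then_multiply(uint8_t a, uint8_t b)
{
    return (uint16_t)((uint16_t)a * (uint16_t)b);
}
```

For 20 * 20 the first form yields 144 (400 modulo 256) and the second
yields 400.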
On the 68000, I recommend against using single char variables
(aggregates of char are a different matter) to "save space"
if they will appear in mixed-mode expressions, since the code to
extend them will in all cases occupy more space than the one byte
per variable so saved.
[In order to avoid emulating the speed of a Vic-20 with your target 68000
system, FLOAT variables are NOT automatically extended to DOUBLE
everywhere unless the code is being compiled for the 68881 or
some such chip.]
Pointer arithmetic involves the addition of a pointer to an integer
(of any size) or the subtraction of two pointers to yield an integer.
In CSTAR, it takes advantage of the 68000 address modes under most
circumstances, in order to get good code.
Pointer addition involves two operands of different types, so it is
truly mixed mode. In CSTAR, scaling is done to whatever width is
necessary to correctly hold the result of that scaling. In practice,
most scaling produces a long result; the idiosyncrasies of the 68000
A registers assist with that. The result of scaling a char (i.e. an
8-bit integer) will be integer if the scale factor is small enough
for that to be correct.
In a complex expression, the C-language operator which is spelled +
is not associative, and this may lead to surprises. It is not
associative because it is polymorphic, and the form it takes--pointer
result or integer result--depends on its operands and therefore
on its associations.
The surprise is that in the expression:
int_1 + int_2 + pointer;
the first addition is of the integer result form, and it could overflow
and provide an incorrect operand to the pointer addition, while
pointer + int_1 + int_2;
does two additions of the pointer result form, and the analogous overflow
cannot happen. (In both cases, of course, a 24-bit overflow out of the
68000 address space could occur.) To repeat: when pointers are involved,
addition is commutative but not associative.
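A small model of the two associations, written in portable C, may make
the surprise concrete. It assumes a pointer to char (so no scaling), a
16-bit int, and a 32-bit address held in an unsigned integer; the helper
and function names are made up for the illustration.

```c
#include <stdint.h>

/* Wrap a value to 16 bits and sign-extend it, as a 16-bit add would. */
int32_t wrap_to_int16(int32_t value)
{
    uint16_t low = (uint16_t)value;
    return low >= 0x8000u ? (int32_t)low - 65536 : (int32_t)low;
}

/* int_1 + int_2 + pointer: the integer sum is formed first, at int width. */
uint32_t ints_first(uint32_t pointer, int16_t int_1, int16_t int_2)
{
    int32_t sum16 = wrap_to_int16((int32_t)int_1 + (int32_t)int_2);
    return pointer + (uint32_t)sum16;
}

/* pointer + int_1 + int_2: both additions happen at pointer width. */
uint32_t pointer_first(uint32_t pointer, int16_t int_1, int16_t int_2)
{
    return pointer + (uint32_t)(int32_t)int_1 + (uint32_t)(int32_t)int_2;
}
```

With pointer = 0x10000, int_1 = 30000 and int_2 = 10000, the first form
lands at address 40000 because the 16-bit sum wrapped to -25536, while
the second lands at 105536 as intended.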
In the case of array subscripts, the association of the effective additions
is ambiguous. That is,
p [i + j]
could be treated as *(p + i + j) or as *(p + (i + j)). In order to
produce the most conservative results, CSTAR uses the first association.
A command-line switch is available to choose the second form instead.
If it is necessary to rigorously control overflow behavior in an expression
(for example, to maintain a circular buffer in a fast interrupt-service
routine by allowing the pointer to overflow to zero), the expression should
be written out with * and +, eschewing [], in order to avoid any dependency
on how subscript calculations are optimized.
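One way such a buffer might be spelled out is sketched below in portable
C. It is only an approximation of the idea: an 8-bit offset that wraps to
zero stands in for the overflowing pointer, and the type and function
names and the buffer size are hypothetical.

```c
#include <stdint.h>

#define RING_SIZE 256u   /* hypothetical: exactly the range of uint8_t */

typedef struct {
    uint8_t data[RING_SIZE];
    uint8_t head;        /* wraps from 255 back to 0 */
} ring_t;

/* Store one byte, spelling the access as *(base + offset) instead of []. */
void ring_put(ring_t *ring, uint8_t value)
{
    *(ring->data + ring->head) = value;
    ring->head = (uint8_t)(ring->head + 1u);   /* explicit 8-bit wrap */
}
```

The wrap is visible in the source as a cast on a single addition, so it
does not depend on how a subscript expression happens to be associated.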
[The DRI compiler is unpredictable and often sets up to do the first association--
and then truncates the result after scaling! If someone can tell us
that p[i+j] "really means" one form or the other, we can change the
default flag setting.]
The 68000 indexed addressing modes use a pointer and a signed, scaled
integer or constant. The surprise that results is that an unsigned
word-size pointer has to be extended to long to use it correctly in
an address mode or to add it correctly to an address register. If
this is a problem in critical code, we suggest you set up a
distinctive macro that does an int cast (which generates no code
itself but suppresses the extension), and use it only where you are
*absolutely sure* the cast will not overflow. (If you
get that wrong, you will be stung by a nasty pointer bug.)
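The sketch below models the trade-off in portable C. The macro name
UNCHECKED_INDEX is hypothetical, and the helper function exists only
because a plain narrowing cast is not fully portable in standard C; in
CSTAR the macro body would presumably be just an (int) cast.

```c
#include <stdint.h>

/* Reinterpret a 16-bit pattern as signed, which is what a
   sign-extending index register does with it. */
int16_t as_signed_word(uint16_t u)
{
    return (int16_t)(u >= 0x8000u ? (int32_t)u - 65536 : (int32_t)u);
}

/* Hypothetical name; distinctive so that every use is easy to find. */
#define UNCHECKED_INDEX(u) as_signed_word(u)

/* Correct in general: zero-extend the unsigned word to long. */
int32_t index_extended(uint16_t u)
{
    return (int32_t)u;
}

/* Cheaper, but right only while u stays below 32768. */
int32_t index_unchecked(uint16_t u)
{
    return (int32_t)UNCHECKED_INDEX(u);
}
```

For an index of 1000 both functions agree. For 40000 the extended form
gives 40000 and the unchecked form gives -25536, which is the pointer bug
the warning refers to.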
The constant in most addressing modes is limited to 32K, so absolute
(global/external/static) variables cannot be accessed with those modes; nor
can frame variables beyond 32K. On those machines whose operating
systems limit memory chunks to 32K or less anyhow, the latter is not a
problem. Such long-range accesses may take longer than short-frame
accesses, and they may do actual addition that would not occur with an
analogous short-frame access.
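The limit comes from the signed 16-bit displacement field of the
d16(An) mode. A check of the following kind, sketched in portable C with
an invented function name, separates the short-range case from the
long-range one; it is not taken from CSTAR itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if an offset fits the signed 16-bit displacement of d16(An). */
bool fits_word_displacement(int32_t offset)
{
    return offset >= -32768 && offset <= 32767;
}
```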
Before an integer is used in an address mode, it has to be scaled, unless
the address is a pointer to char. CSTAR does scaling by 2 by doubling in
an A register if one is available, or a D register if the variable is
already in one. Scaling by other powers of 2 is by shifting, and other
scaling is by multiplication. Scaling a long variable involves a long
multiplication; be aware that this takes all day (worst case: function
call and return plus the three mul instructions and miscellany.)
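The choice among the scaling methods can be summarized by a sketch like
the one below. It is a portable C model of the rule as stated here, not
CSTAR's actual code; the enum and function names are assumptions, and
unsigned arithmetic is used so that shifting is well defined.

```c
#include <stdint.h>

typedef enum {
    SCALE_NONE,      /* element size 1: pointer to char */
    SCALE_DOUBLE,    /* element size 2: add the index to itself */
    SCALE_SHIFT,     /* other powers of 2: shift left */
    SCALE_MULTIPLY   /* anything else: multiply */
} scale_kind;

scale_kind choose_scaling(uint32_t element_size)
{
    if (element_size == 1) return SCALE_NONE;
    if (element_size == 2) return SCALE_DOUBLE;
    if (element_size != 0 && (element_size & (element_size - 1)) == 0)
        return SCALE_SHIFT;
    return SCALE_MULTIPLY;
}

uint32_t scale_index(uint32_t index, uint32_t element_size)
{
    unsigned shift = 0;

    switch (choose_scaling(element_size)) {
    case SCALE_NONE:
        return index;
    case SCALE_DOUBLE:
        return index + index;
    case SCALE_SHIFT:
        while ((1u << shift) < element_size)
            shift++;                 /* find log2 of the element size */
        return index << shift;
    default:
        return index * element_size;
    }
}
```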
When CSTAR needs an address, or an address has overgrown the complexity
of a 68000 addressing mode, code is generated to do some sort of real
addition with ADDA or LEA. For reasonable expressions, this code is
very close to what you would write in assembler, and occasionally,
it is better when a trick shows up that you might not have noticed.
The cases that are only "very close" involve 0(An, Rn), which
under some circumstances can be replaced by rearranging the surrounding
code; the result would run two clock cycles faster--and possibly be one
word longer. [Maybe there should be a peephole switch.]
[Not implemented yet]
In an associated chain of pointer additions (containing one pointer
operand and two or more integer operands), CSTAR may, at its option,
postpone the scaling operation and perform it just once. The result will
be rigorously guaranteed to be identical to what would have been
obtained had the postponement not taken place. If necessary, extending
casts and/or A register addition will be generated.
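The identity behind the postponement can be sketched in portable C as
below. This is an illustration of the arithmetic only, under the
assumption that the integer operands are extended to 32 bits before they
are added; the function names are invented.

```c
#include <stdint.h>

/* Scale each integer operand separately: p + i*size + j*size. */
uint32_t scale_each(uint32_t pointer, int16_t i, int16_t j, uint16_t size)
{
    return pointer + (uint32_t)(int32_t)i * size
                   + (uint32_t)(int32_t)j * size;
}

/* Postponed form: extend, add once, scale once: p + (i + j)*size. */
uint32_t scale_once(uint32_t pointer, int16_t i, int16_t j, uint16_t size)
{
    uint32_t sum = (uint32_t)(int32_t)i + (uint32_t)(int32_t)j;
    return pointer + sum * size;
}
```

Because the sum is formed at 32 bits, the two functions return the same
address for every input; without the extending casts the 16-bit sum could
wrap and the results would differ.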