- Path: sparky!uunet!olivea!decwrl!sdd.hp.com!zaphod.mps.ohio-state.edu!darwin.sura.net!convex!news.utdallas.edu!corpgate!bnrgate!bmerh85!bcars64a!bqneh23!schow
- From: schow@bqneh23.bnr.ca (Stanley T.H. Chow)
- Newsgroups: comp.arch
- Subject: Re: trapping speculative ops (LONG)
- Message-ID: <1992Sep1.143155.636@bcars64a.bnr.ca>
- Date: 1 Sep 92 14:31:55 GMT
- References: <1992Aug31.224611.5196@odin.diku.dk>
- Sender: news@bcars64a.bnr.ca (Usenet News)
- Organization: Bell Northern Research Ltd, Ottawa
- Lines: 77
-
- In article <1992Aug31.224611.5196@odin.diku.dk> thorinn@diku.dk (Lars Henrik Mathiesen) writes:
- >To summarize: trap bits are proposed to enable a compiler to move a
- >potentially trapping operation outside of a condition. As a special
- >case, this would allow many, if not all, of the same optimizations
- >that can be done if a load through a NULL pointer yields zeroes.
- [...]
- >The main problem seems to be with subroutine calls. To move an
- >instruction across a call, the compiler has to be able to determine
- >whether the result will be used. There are a number of cases where
- >that is possible; when it is not, we are no worse off than before.
-
- How about this as a possible solution: assuming we have a single
- register that contains all the trap-bits, a subroutine call should
- just save and restore this trap-bits register along with whichever
- subset of the normal registers it saves (complication later). This
- just means that we treat each trap-bit as part of its register. We
- can save/restore the whole trap register even though we don't save
- all the normal registers; this just means that upon return, the
- calling routine had better not use any register that it shouldn't,
- a normal requirement. I.e., for callee-save registers, this method
- is precise in that both content and trap condition are preserved;
- for caller-save registers, the subroutine is allowed to walk over
- the register, but we choose to preserve the trap condition anyway.
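
To make the convention above concrete, here is a minimal sketch in Python of a register file with one combined trap-bits word. Everything in it is an assumption for illustration: the names `RegFile`, `spec_load`, `call` and `TrapOnUse`, the choice of registers 4-7 as callee-save, and in particular the assumption that the subroutine's register saves copy raw contents without themselves trapping.

```python
class TrapOnUse(Exception):
    """Raised when a register whose trap bit is set is read."""


class RegFile:
    def __init__(self, nregs=8):
        self.value = [0] * nregs
        self.trap_bits = 0  # bit i set => register i holds a deferred trap

    def spec_load(self, r, value, faulted):
        # Speculative operation: record a fault in the trap bit
        # instead of trapping immediately.
        self.value[r] = value
        if faulted:
            self.trap_bits |= 1 << r
        else:
            self.trap_bits &= ~(1 << r)

    def write(self, r, value):
        # An ordinary write replaces the content and clears the trap bit.
        self.value[r] = value
        self.trap_bits &= ~(1 << r)

    def read(self, r):
        # The trap is taken only when the register is actually used.
        if (self.trap_bits >> r) & 1:
            raise TrapOnUse(r)
        return self.value[r]


CALLEE_SAVE = (4, 5, 6, 7)  # hypothetical calling convention


def call(rf, body):
    # Save the whole trap-bits word plus the callee-save contents.
    # Assumption: the save copies raw contents and does not trap.
    saved_traps = rf.trap_bits
    saved_vals = [rf.value[r] for r in CALLEE_SAVE]
    body(rf)
    for r, v in zip(CALLEE_SAVE, saved_vals):
        rf.value[r] = v
    rf.trap_bits = saved_traps  # restored for ALL registers
```

In this model a callee-save register comes back with both content and trap bit intact, while a caller-save register may come back with new content but its old trap bit, which is harmless as long as the caller does not read it.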
-
- There is a nasty complication: are the trap-bits scoreboarded?
- If they are, they must be scoreboarded individually (possibly on
- the same scoreboard as their registers), and then saving the whole
- trap register will unnecessarily stall the machine, since it must
- wait for every outstanding operation, and eliminate the benefits
- of hoisting the loads, etc. We must also handle the case where
- a speculative load into a caller-save register is not finished
- by the time the subroutine wants to write into that register (I
- assume the old operation is discarded and the new write goes into
- the pipeline right away).
-
- Given that we want to scoreboard the trap-bits, I suggest the
- simple solution is to have a new instruction that saves a single
- register (both content and trap-bit) into two words. (I guess we
- will need the converse to restore.) This cannot introduce more
- stalls than the normal scoreboard (especially if we use the same
- scoreboard :-) and requires only trivial changes in the compiler.
- In this scheme, the trap-bits are distributed and each bit is
- really associated with the register it traps; there is no need
- for a single trap-bits register.
-
- >If it is certain that a potentially trapped register value will be
- >used later, a subroutine can just be allowed to take a trap if it
- >tries to save it. (On such a trap, the debugger must unwind the call
- >stack to find the source instruction; but the compiler can easily
- >construct tables to allow this.) Software convention may define some
- >registers as more likely to be the destinations of slow instructions,
- >so that a subroutine can avoid stalls by saving them late.
-
- If the register is callee-save but speculatively loaded, it really
- is not nice to trap.
-
- In my scheme, the trap unwinding is exactly the same as before (ok,
- the mechanics are the same, but it becomes more important). For each
- access, the compiler knows if it is the first in that routine. If it
- is the first read access, then the debugger should go up the call
- stack and look there.
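
The unwinding rule can be sketched as a table lookup. The table layout here is purely an assumption: for each routine, a compiler-generated map from (pc, register) to the pc of the operation in that routine that last wrote the register, or `None` when the access at that pc is the first one in the routine. The name `find_trap_source` is likewise hypothetical.

```python
def find_trap_source(frames, tables, reg):
    """Locate the operation that set the trap bit of `reg`.

    frames: call stack, innermost first, as (routine, pc) pairs; for
            outer frames the pc is that of the call instruction.
    tables: tables[routine][(pc, reg)] -> pc of the producing op in
            that routine, or None if this is the routine's first
            access to reg (so the producer is further up the stack).
    """
    for routine, pc in frames:
        src = tables[routine].get((pc, reg))
        if src is not None:
            return routine, src
        # First access in this routine: keep walking up the call stack.
    return None
```

For example, if a callee traps on its first read of register 3, the lookup falls through to the caller's table and reports the caller's speculative load.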
-
- >On the other hand, if the value is known to be dead, the calling code
- >can move some dummy value into the register; this will reset the trap
- >bit, and write-write interlock will prevent outstanding operations
- >from setting it later. Again, software convention may define scratch
- >registers that do not have to be detoxed (the subroutine promises to
- >write them before reading them).
-
- Why should we ever need to do this? Traps are user-initiated (by
- accessing the register), and dead registers should never be accessed.
- On the other hand, this is a great debugging aid.
-
- --
- Stanley Chow InterNet: schow@BNR.CA
- Bell Northern Research UUCP: ..!uunet!bnrgate!bqneh3!schow
- (613) 763-2831
- Me? Represent other people? Don't make them laugh so hard.
-