NetNews Usenet Archive 1992 #19

Internet Message Format | 1992-09-01 | 4.5 KB

Path: sparky!uunet!olivea!decwrl!sdd.hp.com!zaphod.mps.ohio-state.edu!darwin.sura.net!convex!news.utdallas.edu!corpgate!bnrgate!bmerh85!bcars64a!bqneh23!schow
From: schow@bqneh23.bnr.ca (Stanley T.H. Chow)
Newsgroups: comp.arch
Subject: Re: trapping speculative ops (LONG)
Message-ID: <1992Sep1.143155.636@bcars64a.bnr.ca>
Date: 1 Sep 92 14:31:55 GMT
References: <1992Aug31.224611.5196@odin.diku.dk>
Sender: news@bcars64a.bnr.ca (Usenet News)
Organization: Bell Northern Research Ltd, Ottawa
Lines: 77

In article <1992Aug31.224611.5196@odin.diku.dk> thorinn@diku.dk (Lars Henrik Mathiesen) writes:
>To summarize: trap bits are proposed to enable a compiler to move a
>potentially trapping operation outside of a condition. As a special
>case, this would allow many, if not all, of the same optimizations
>that can be done if a load through a NULL pointer yields zeroes.
[...]
>The main problem seems to be with subroutine calls. To move an
>instruction across a call, the compiler has to be able to determine
>whether the result will be used. There are a number of cases where
>that is possible; when it is not, we are no worse off than before.

How about this for a possible solution:

Assuming we have a single register that contains all the trap-bits, a subroutine call should just save and restore this trap-bits register along with whichever subset of the normal registers it saves. (Complication later.)

This just means that we treat the trap-bit as part of its register. We can save/restore the whole trap register even though we don't save all the normal registers - this just means that upon return, the calling routine had better not use any register that it shouldn't, a normal requirement.

I.e., for callee-save registers, this method is precise in that both content and trap-condition are preserved; for caller-save registers, the subroutine is allowed to walk over the register but we choose to preserve the trap condition anyway.

There is a nasty complication: are the trap-bits scoreboarded?
If they are, they must be scoreboarded individually (but possibly on the same scoreboard as their registers), and then saving the whole trap-bits register will unnecessarily stall the machine and eliminate the benefits of hoisting the loads, etc. We must handle the cases where a speculative load into a caller-save register is not finished by the time the subroutine wants to write into that register (I assume the old operation is discarded and the new write goes into the pipeline right away).

Given that we want to scoreboard the trap-bits, I suggest the simple solution is to have a new instruction that saves a single register (both content and trap-bit) into two words. (I guess we will need the converse to restore.) This cannot introduce more stall than the normal scoreboard (especially if we use the same scoreboard :-) and requires only trivial changes in the compiler.

In this scheme, the trap-bits are distributed and each bit is really associated with the register it traps; there is no need for a single trap-bits register.

>If it is certain that a potentially trapped register value will be
>used later, a subroutine can just be allowed to take a trap if it
>tries to save it. (On such a trap, the debugger must unwind the call
>stack to find the source instruction; but the compiler can easily
>construct tables to allow this.) Software convention may define some
>registers as more likely to be the destinations of slow instructions,
>so that a subroutine can avoid stalls by saving them late.

If the register is callee-save but speculatively loaded, it really is not nice to trap. In my scheme, the trap unwinding is exactly the same as before (ok, the mechanics are the same, but it becomes more important). For each access, the compiler knows if it is the first in that routine. If it is the first read access, then the debugger should go up the call stack and look there.
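
A minimal sketch of the per-register save/restore idea above, written as a small Python model. All names here (RegisterFile, speculative_load, save, restore, TrapOnUse) are hypothetical and chosen only for illustration; the model covers the trap-bit semantics only and does not attempt to model scoreboarding or pipeline timing.

```python
class TrapOnUse(Exception):
    """Raised when a register whose trap bit is set is read."""


class RegisterFile:
    """Toy register file: each register carries a value and a trap bit."""

    def __init__(self, nregs):
        self.value = [0] * nregs
        self.trap = [False] * nregs

    def write(self, r, v):
        # An ordinary write replaces the content and resets the trap bit.
        self.value[r] = v
        self.trap[r] = False

    def speculative_load(self, r, memory, addr):
        # A faulting speculative load does not trap; it only marks the
        # destination register. "memory" is a plain dict (an assumption).
        if addr in memory:
            self.write(r, memory[addr])
        else:
            self.value[r] = 0
            self.trap[r] = True

    def read(self, r):
        # The deferred trap is taken on the first real use of the value.
        if self.trap[r]:
            raise TrapOnUse(f"r{r}")
        return self.value[r]

    def save(self, r, stack):
        # Hypothetical "save with trap bit" instruction: stores two words
        # (content, trap bit) and never traps.
        stack.append(self.value[r])
        stack.append(int(self.trap[r]))

    def restore(self, r, stack):
        # The converse: reload both content and trap bit.
        trap_word = stack.pop()
        self.value[r] = stack.pop()
        self.trap[r] = bool(trap_word)


def callee(regs, stack):
    # Prologue/epilogue for a callee-save register (r2 here, by assumption).
    regs.save(2, stack)
    regs.write(2, 7)          # the callee uses r2 freely
    result = regs.read(2) + 1
    regs.restore(2, stack)    # content and trap condition both come back
    return result
```

If the caller does a speculative_load into r2 from a bad address and then calls callee, the call completes normally and the trap is taken only if the caller later reads r2, which is the "precise for callee-save registers" behaviour described above.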

>On the other hand, if the value is known to be dead, the calling code
>can move some dummy value into the register; this will reset the trap
>bit, and write-write interlock will prevent outstanding operations
>from setting it later. Again, software convention may define scratch
>registers that do not have to be detoxed (the subroutine promises to
>write them before reading them).

Why should we ever do this? Traps are user-initiated (by accessing the register), and dead registers should never be accessed.

On the other hand, this is a great debugging aid.

--
Stanley Chow            InterNet: schow@BNR.CA
Bell Northern Research  UUCP: ..!uunet!bnrgate!bqneh3!schow
(613) 763-2831
Me? Represent other people? Don't make them laugh so hard.