- Path: sparky!uunet!stanford.edu!bcm!rice!cliffc
- From: cliffc@rice.edu (Cliff Click)
- Newsgroups: comp.arch
- Subject: trapping speculative ops (LONG)
- Message-ID: <CLIFFC.92Aug28085924@antigone.rice.edu>
- Date: 28 Aug 1992 14:59:24 GMT
- References: <BtEzrK.Jso.2@cs.cmu.edu> <GLEW.92Aug25180333@pdx007.intel.com>
- <BtLD9x.IBA@mentor.cc.purdue.edu>
- <1992Aug28.013846.11326@athena.mit.edu>
- Sender: news@rice.edu (News)
- Organization: Center for Research on Parallel Computations
- Lines: 142
- In-Reply-To: jfc@athena.mit.edu's message of 28 Aug 1992 01:38:46 GMT
-
-
- Since no one complained about my digest-reply format, I'll use it again.
-
- Also, here is a challenge to the hardware jocks out there:
-
- Am I blowing smoke? I.e., do your main pipeline(s) simplify enough when
- you trap at the "start" of an instruction instead of the middle to justify
- carrying around all those extra trap bits?
-
- I think that speculative loads would help C-like programs some (lots of loads
- and branches).
-
- I know that prefetch can help scientific codes running on machines with distant
- memories (the compiler can often predict address traces for hundreds of
- cycles).
-
- Here goes...
- ---------------------
- <John Carr>
- | [... add conditional operations to avoid branches... ]
-
- <Cliff Click>
- Sounds good, but orthogonal.
- The goal was to be able to hoist potentially trapping operations above a
- guard test.
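
A minimal sketch of what "hoisting above a guard test" means here, written as a
Python simulation rather than real machine code. The names `Poison`,
`speculative_load`, `use` and `guarded_sum` are hypothetical, and the model
assumes a faulting load marks its result instead of trapping, with the trap
taken only if the result is consumed.

```python
class DeferredTrap(Exception):
    """Raised when a poisoned value is actually consumed."""


class Poison:
    """Stands in for a register whose (hypothetical) trap bit is set."""

    def __init__(self, reason):
        self.reason = reason


def speculative_load(memory, addr):
    # A faulting load does not trap; it marks its destination as poisoned.
    if addr not in memory:
        return Poison(f"bad address {addr!r}")
    return memory[addr]


def use(value):
    # The trap is taken only when the value is really needed.
    if isinstance(value, Poison):
        raise DeferredTrap(value.reason)
    return value


def guarded_sum(memory, ptr, fallback):
    # Source order would be: test ptr, then load. Here the load has been
    # hoisted above the guard so it can overlap with the test.
    loaded = speculative_load(memory, ptr)   # hoisted, may be poisoned
    if ptr is not None:                      # the guard test
        return use(loaded) + 1
    return fallback                          # poisoned value is never touched
```

With a NULL-like `ptr` the hoisted load is poisoned but harmless because the
guard fails; with a wild non-NULL `ptr` the trap still happens, just at the use.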
-
- <John Carr>
- | [... Keep the trap flags invisible... ]
-
- <Cliff Click>
- Keeping the trap flags hidden removes the compiler's option to hoist
- possibly trapping operations above a guard test.
-
- ---------------------
- <Stanley T.H. Chow>
- | But if the prefetch was not speculative, it would be nice to have the load
- | go forward. For page fault, it doesn't make much difference, but a TLB
- | miss could well have been hidden.
-
- <Cliff Click>
- OK, let TLB misses progress until it's not convenient anymore. I.e., if the TLB
- miss can be handled without going to the OS, do it. Otherwise set the
- appropriate "fail" bits in the saved trap-state register and return from the
- TLB handler. If the register is not used (speculative prefetch) there's no
- big loss. If the register IS used, a more full-bodied trap handler can do
- the fixup (which might require growing the stack/process space, or a
- segmentation fault or a page fault).
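
The two-level policy above can be sketched as a toy simulation. Everything in
it is an assumption made for illustration: the `Machine` class, the tiny
`PAGE` size, the dictionary-based page table and backing store, and the
`os_traps` counter are not taken from any real architecture.

```python
PAGE = 4  # words per page, deliberately tiny for illustration


class SegmentationFault(Exception):
    pass


class Machine:
    def __init__(self, page_table, frames, backing_store):
        self.tlb = {}                       # virtual page -> frame
        self.page_table = page_table        # reachable without the OS
        self.frames = frames                # frame -> list of words
        self.backing_store = backing_store  # pages the OS could bring in
        self.regs = {}
        self.fail_bits = {}                 # register -> faulting address
        self.os_traps = 0                   # count of full-bodied traps

    def load(self, reg, vaddr):
        vpage, offset = divmod(vaddr, PAGE)
        if vpage not in self.tlb:
            if vpage not in self.page_table:
                # Not convenient: record the failure and return at once.
                self.fail_bits[reg] = vaddr
                self.regs[reg] = None
                return
            self.tlb[vpage] = self.page_table[vpage]  # cheap TLB refill
        self.fail_bits.pop(reg, None)
        self.regs[reg] = self.frames[self.tlb[vpage]][offset]

    def use(self, reg):
        # Only a real use of a failed register invokes the full handler.
        if reg in self.fail_bits:
            self._full_handler(reg, self.fail_bits[reg])
        return self.regs[reg]

    def _full_handler(self, reg, vaddr):
        self.os_traps += 1
        vpage = vaddr // PAGE
        if vpage not in self.backing_store:
            raise SegmentationFault(hex(vaddr))
        frame = max(self.frames, default=-1) + 1      # "page in"
        self.frames[frame] = self.backing_store[vpage]
        self.page_table[vpage] = frame
        self.load(reg, vaddr)                         # redo the load
```

A speculative load whose result is never passed to `use` costs no OS trap in
this model, which is the "no big loss" case.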
-
- <Stanley T.H. Chow>
- | Conceivably, one may want control over the cache behaviour as well.
-
- <Cliff Click>
- Cache behavior needs to be controlled for good memory performance for many
- scientific programs, regardless of the trapping behavior. Perhaps you
- could marry your favorite cache control technique with the discussed trap
- behavior.
-
- ---------------------
- <David Chase>
- | First, speculative division is probably not an interesting case.
-
- <Cliff Click>
- It was used as the canonical trapping example.
- I'll stick to the load/store trap issue from here on.
-
- <David Chase>
- | As far as loads and stores go, I am similarly mystified by the insistence
- | that each and every trap from the source program be preserved in
- | "very-optimized" code.
-
- <Cliff Click>
- How else do you handle page faults? TLB faults?
- In your "very-optimized" code wild stores and loads can proceed until your
- code and data rot enough to die. I prefer wild stores and loads to stop
- the process at once.
-
- The page 0 hack is a clever technique for hoisting past a NULL test.
- But it prevents the hardware from detecting a common bug (reading from a
- NULL pointer).
-
- If you need "performance at all costs" then by all means do it.
-
- <David Chase>
- | If the OS could be convinced to just not report illegal loads
- | (emulating them as returning zero or NaN, for instance) then loads
- | could be hoisted much more frequently.
-
- <Cliff Click>
- The OS will still handle page faults when the load is issued, instead of
- when the load results are required (if at all).
-
- <David Chase>
- | Note that the hardware support for what I described above is pretty
- | damn cheap. Note that no changes are needed to the architecture.
-
- <Cliff Click>
- "No change" is exceptionally cheap :-).
-
- ---------------------
- <Bob Alverson>
- | Let's see. You'll also probably want special loads and stores that
- | preserve those trap bits.
-
- <Cliff Click>
- Why?
-
- <Bob Alverson>
- | Speaking of context switch, make sure you can restore the poison on *all*
- | the registers or have some that you know aren't poisoned (like the one
- | holding the stack pointer).
-
- <Cliff Click>
- All the trap bits can be read/written via a special control register.
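
A rough sketch of how a context switch might save and restore the poison on
all registers through one control register. The register count, the bit layout
(bit i set means register i is poisoned) and all names below are assumptions
for illustration only.

```python
NUM_REGS = 8  # hypothetical register count


class RegisterFile:
    def __init__(self):
        self.values = [0] * NUM_REGS
        self.trap_bits = 0  # bit i set => register i is poisoned

    def poison(self, reg):
        self.trap_bits |= 1 << reg

    def is_poisoned(self, reg):
        return bool((self.trap_bits >> reg) & 1)

    def read_trap_register(self):
        # The special control register exposes every trap bit at once.
        return self.trap_bits

    def write_trap_register(self, bits):
        # Writing it sets or clears poison on all registers in one step.
        self.trap_bits = bits & ((1 << NUM_REGS) - 1)


def save_context(rf):
    # Ordinary stores save the values; one extra word holds the trap bits.
    return (list(rf.values), rf.read_trap_register())


def restore_context(rf, ctx):
    values, bits = ctx
    rf.values = list(values)
    rf.write_trap_register(bits)
```

Under this model no special poison-preserving loads and stores are needed for
the switch itself: the values travel as plain words and the poison travels in
the saved control-register word.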
-
- <Bob Alverson>
- | Oh, and those folks with conditional moves or conditional operations may
- | want to have the poison only conditionally "tasted," to keep jackpot cases
- | from obfuscating compiler transformation decisions.
-
- <Cliff Click>
- Sounds like conditionals on your conditionals.
- Ask a hardware jock if this is cost effective.
-
- <Bob Alverson>
- | You might also want a register to register move that propagates poison,
- | so the register allocator can shuffle values around with wild abandon.
-
- <Cliff Click>
- Sounds kinky :-). I'm sure the compiler can deal with it, but do the
- hardware jocks want to?
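
For concreteness, a small sketch of the difference between a move that
propagates poison and an ordinary operation that consumes it. The `Regs`
class and its `move` and `add` methods are hypothetical, and the sketch
assumes that any ordinary use of a poisoned register traps.

```python
class Trap(Exception):
    pass


class Regs:
    def __init__(self, n=8):
        self.value = [0] * n
        self.poisoned = [False] * n

    def move(self, dst, src):
        # Propagating move: copies the value and the poison bit together
        # and never traps, so the allocator can shuffle registers freely.
        self.value[dst] = self.value[src]
        self.poisoned[dst] = self.poisoned[src]

    def add(self, dst, a, b):
        # Ordinary consumer: tasting a poisoned operand takes the trap.
        if self.poisoned[a] or self.poisoned[b]:
            raise Trap(f"poisoned operand in add r{a}, r{b}")
        self.value[dst] = self.value[a] + self.value[b]
        self.poisoned[dst] = False
```

The deferred trap then follows the value to whichever register finally feeds
a real use.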
-
- ---------Authors----------
- Bob Alverson bob@tera.com
- John Carr jfc@athena.mit.edu
- David Chase chased@rbbb.Eng.Sun.COM
- Stanley Chow schow@BNR.CA
- Cliff Click cliffc@cs.rice.edu
- --------------------------
- --
- The Sparc ABI had the most brain-damaged calling convention I've ever seen.
- It's probably better now but reminiscing gives me something to complain about.
- Cliff Click (cliffc@cs.rice.edu) | Disclaimer: My lawyer made me say it.
-