- Path: sparky!uunet!cis.ohio-state.edu!sample.eng.ohio-state.edu!purdue!mentor.cc.purdue.edu!pop.stat.purdue.edu!hrubin
- From: hrubin@pop.stat.purdue.edu (Herman Rubin)
- Newsgroups: comp.arch
- Subject: Re: trapping speculative ops
- Message-ID: <BtLD9x.IBA@mentor.cc.purdue.edu>
- Date: 26 Aug 92 12:56:20 GMT
- References: <l8j0qkINNikb@exodus.Eng.Sun.COM> <BtEzrK.Jso.2@cs.cmu.edu> <GLEW.92Aug25180333@pdx007.intel.com>
- Sender: news@mentor.cc.purdue.edu (USENET News)
- Organization: Purdue University Statistics Department
- Lines: 56
-
- I seem to have missed the original article by Mr. Lindsay. But this has
- been a major problem ever since any form of parallelism became possible.
-
-
- > But this is just a specific optimization. The broader question is,
- > what about traps while executing speculatively? There seem to be two
- > answers:
-
- > 1) Let the hardware do all the speculative execution. This puts it on
- > the hardware's head to not trap until it knows it really should have.
-
- > 2) Allow compiled code a way to postpone consequences until it
- > wants to know. This covers e.g.
-
- > if( b != 0 ) c = a/b;
-
- > and an assortment of other cases.
-
- > The problem with (1) is that it makes for complex hardware. The
- > problem with (2) is that instruction sets are generally not set
- > up to allow such leeway.
-
- > Don D.C.Lindsay Carnegie Mellon Computer Science
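As an illustration of option (2) in the quoted text, here is a minimal C sketch. It is not from the thread: the name `guarded_div` is hypothetical. It assumes IEEE 754 doubles with floating-point traps masked (the usual default), so that dividing by zero yields an infinity or NaN instead of trapping. Under that assumption the division can be issued before `b` is tested, and its "consequence" is looked at only when the program commits the result.

```c
/* Hypothetical helper, not from the original thread.
   Assumes IEEE 754 arithmetic with traps masked, so a / b never traps. */
double guarded_div(double a, double b, double c)
{
    double q = a / b;   /* issued early; b == 0 gives inf or NaN, not a trap */

    if (b != 0)         /* the test happens only when the result is wanted */
        c = q;          /* commit the speculative quotient */

    return c;           /* otherwise c is left unchanged */
}
```

This has the same effect as `if( b != 0 ) c = a/b;` under the stated assumption. With integer operands the same rearrangement would not be safe in C, since integer division by zero is undefined behavior.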
-
- The real problem with (1) is that the hardware may have to track a large
- number of outstanding branches, and spend most of its resources managing
- the resulting variety of cases. I believe that attempts to do so have not
- been very successful.
-
- But it is often necessary to put in blocks (interlocks) even when no trap
- is at issue; I recall the following example:
-
- x = a/b;
- if(y<0) x=z;
-
- where the division takes long enough that the result of the first statement
- is produced after the result of the second. Whatever the hardware protocol
- or the rearrangement of instructions, considerable inefficiency can occur.
- This means that alternative (2) causes slow execution. With division as slow
- as it is these days relative to other operations, doing the division first
- essentially puts it in the background, so it does not take up much time.
- But delaying the issuance of branches does not help, either.
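The two orderings discussed for this example can be written out as a rough C sketch. The function names are hypothetical, and the sketch assumes the division cannot trap (IEEE 754 doubles with traps masked), so skipping it does not change the result. Whether a compiler or processor actually overlaps the divide with the compare, or turns the final choice into a conditional move, is not guaranteed by the C source.

```c
/* Divide first: the long-latency divide is started, the cheap compare can
   overlap with it, and the final value of x is chosen only once both are
   known. The late quotient must not overwrite z. */
double divide_then_override(double a, double b, double y, double z)
{
    double q = a / b;          /* long-latency operation issued first */
    int take_z = (y < 0);      /* independent of the divide */
    return take_z ? z : q;     /* explicit ordering of the two writes to x */
}

/* Branch first: the divide is skipped when it would be overwritten,
   at the price of waiting on the branch before starting it. */
double branch_then_divide(double a, double b, double y, double z)
{
    if (y < 0)
        return z;
    return a / b;
}
```

Both functions return the same value as `x = a/b; if(y<0) x=z;` under the stated assumption. They differ only in when the divide is started and whether it is ever executed.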
-
- I see no simple solution. Compiler-programmer interaction may help in many
- cases, but often not even that will. If the code necessarily branches, a
- VCISC approach may often achieve much, as some types of branches are very
- fast in hardware.
-
- One place where this is extremely important is SIMD, where even a fairly
- simple instruction, with its simple branches taken in hardware, becomes a
- major problem because processors have to be turned on and off.
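To make the SIMD point concrete, the following C sketch simulates lanes with an ordinary loop. It is only an illustration under assumptions: the function name `simd_masked_update` and the array layout are hypothetical, and a real SIMD machine would do this with hardware activity bits, not a loop. Every lane executes the same instruction stream; a per-lane bit decides which lanes are "on" for the conditional assignment.

```c
#include <stddef.h>

/* Hypothetical simulation of a lane-masked update of
       x = a/b; if (y < 0) x = z;
   applied across `lanes` processing elements. Assumes the divisions
   cannot trap (IEEE 754 doubles with traps masked). */
void simd_masked_update(double *x, const double *a, const double *b,
                        const double *y, const double *z, size_t lanes)
{
    for (size_t i = 0; i < lanes; i++) {
        double q = a[i] / b[i];   /* every lane performs the divide */
        int on = (y[i] < 0);      /* per-lane activity bit for the branch */
        x[i] = on ? z[i] : q;     /* only lanes that are "on" take the override */
    }
}
```

Lanes whose bit is off still pay for the time the other lanes spend in the conditional part, which is one way to read the cost of turning processors on and off.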
- --
- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
- Phone: (317)494-6054
- hrubin@pop.stat.purdue.edu (Internet, bitnet)
- {purdue,pur-ee}!pop.stat!hrubin(UUCP)
-