Xref: sparky comp.arch:11705 comp.sys.intel:2734
Path: sparky!uunet!pipex!bnr.co.uk!uknet!gdt!aber!aberfa!pcg
From: pcg@aber.ac.uk (Piercarlo Grandi)
Newsgroups: comp.arch,comp.sys.intel
Subject: Re: Superscalar vs. multiple CPUs ?
Message-ID: <PCG.92Dec13164348@aberdb.aber.ac.uk>
Date: 13 Dec 92 16:43:48 GMT
References: <1992Dec7.012026.11482@athena.mit.edu>
	<1992Dec8.000357.26577@newsroom.utas.edu.au>
	<PCG.92Dec9154602@aberdb.aber.ac.uk>
	<1992Dec9.211737.23911@walter.cray.com> <Bz25n0.202@metaflow.com>
Sender: news@aber.ac.uk (USENET news service)
Reply-To: pcg@aber.ac.uk (Piercarlo Grandi)
Organization: Prifysgol Cymru, Aberystwyth
Lines: 47
In-Reply-To: rschnapp@metaflow.com's message of 10 Dec 92 19:18:35 GMT
Nntp-Posting-Host: aberdb

On 10 Dec 92 19:18:35 GMT, rschnapp@metaflow.com (Russ Schnapp) said:

rschnapp> (Bradley R. Carlile) writes:
Bradley> The limit of 4 or 6 may be the limit when programming a
Bradley> superscalar chip by simply letting the chip group instructions
Bradley> together. *However* one can use the technique of software
Bradley> pipelining, like we used to use on the VLIW machines of
Bradley> yesteryear: the FPS-120B (before 1976), FPS-164, and the
Bradley> FPS-264 (1985?). These machines could issue up to 10
Bradley> instructions every cycle {I wrote software for 7 years that
Bradley> used these instructions}.

rschnapp> Don't forget about out-of-order (i.e., dataflow) execution
rschnapp> techniques. You can gain plenty of additional fine-grain
rschnapp> parallelism.

Ah, there are indeed many interesting tricks to support high levels of
micro parallelism *in the implementation*. To me the really interesting
problem is not how to do it, though, but whether it is worth doing at
all, and for which classes of codes.

I have yet to find evidence that *general purpose* codes have an
intrinsic degree of micro-parallelizable operation (let's call it
superscalarity) that makes it worthwhile to have a degree of parallel
*instruction issue* within a single stream greater than 2-4.

The codes that can be easily micro-parallelized are usually *special
purpose*, in that they have a structure (e.g. FIFO reference patterns)
that is best suited to a SIMD/vector processor approach.

Otherwise *general purpose* codes have a degree of macro parallelism
(let's call it multithreading) that is best exploited with multiple
CPUs, and that can be exploited up to a degree of parallel *instruction
streams* with independent contexts not greater than 2-4 (again).

Then there are codes that can be easily macro-parallelized, and these
are again *special purpose*, in that they have a structure (e.g.
LIFO reference patterns) that is best suited to a MIMD approach.

Naturally all this depends on how you define a "code"; for example a
timesharing system might be considered either a single, highly
parallelizable code, or a collection of different codes that are each
fairly hard to parallelize.
--
Piercarlo Grandi, Dept of CS, PC/UW@Aberystwyth <pcg@aber.ac.uk>
And the Italian sang and sang. And his desperate invocations reached
the ears of his divine protector, the god of the joke