- Xref: sparky comp.arch:11955 comp.sys.intel:2842
- Path: sparky!uunet!cs.utexas.edu!qt.cs.utexas.edu!yale.edu!spool.mu.edu!agate!doc.ic.ac.uk!uknet!gdt!aber!fronta.aber.ac.uk!pcg
- From: pcg@aber.ac.uk (Piercarlo Grandi)
- Newsgroups: comp.arch,comp.sys.intel
- Subject: Re: Superscalar vs. multiple CPUs ?
- Message-ID: <PCG.92Dec27201257@decb.aber.ac.uk>
- Date: 27 Dec 92 20:12:57 GMT
- References: <PCG.92Dec11162630@aberdb.aber.ac.uk> <1992Dec21.134531.3253@athena.mit.edu>
- <PCG.92Dec23144916@decb.aber.ac.uk> <Bzpzwq.18q@news.udel.edu>
- Sender: news@aber.ac.uk (USENET news service)
- Reply-To: pcg@aber.ac.uk (Piercarlo Grandi)
- Organization: Prifysgol Cymru, Aberystwyth
- Lines: 41
- In-Reply-To: mccalpin@perelandra.cms.udel.edu's message of 23 Dec 92 16:17:13 GMT
- Nntp-Posting-Host: decb.aber.ac.uk
-
- On 23 Dec 92 16:17:13 GMT, mccalpin@perelandra.cms.udel.edu (John D. McCalpin) said:
- Nntp-Posting-Host: perelandra.cms.udel.edu
-
- mccalpin> (Piercarlo Grandi) writes:
-
- solman> At any rate, I would claim that improvements in processor
- solman> performance should only be concerned with apps which are
- solman> presently processor limited, and many of these general purpose
- solman> codes just aren't.
-
- pcg> Uhm, memory is so cheap that many "general purpose" applications
- pcg> nowadays can be made entirely core resident, and a core resident
- pcg> program is 100% CPU bound. For example, I see most compiles on my
- pcg> machine are CPU bound; except for reading in the source and writing
- pcg> out the code there is no IO activity whatsoever.
-
- mccalpin> I disagree with this distinction. A core resident program can
- mccalpin> be very seriously *memory* bound if the processor is spending
- mccalpin> most of its time locked down waiting for cache misses to be
- mccalpin> filled.
-
- Ah, of course, this is the next problem that a hyperscalar architecture
- has got to solve. Given the less than impressive rate of improvement in
- memory speeds, most fast CPUs today are memory starved, as you comment
- on later.
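-
- To put a rough number on "memory starved", here is a back-of-the-envelope
- sketch. It assumes a simple in-order model in which every cache miss
- stalls the processor for the full refill time; the function name and the
- figures in the usage note are hypothetical, chosen only to illustrate the
- arithmetic, and are not measurements of any machine.

```python
def stall_fraction(misses_per_instr, miss_penalty_cycles, base_cpi=1.0):
    """Fraction of all cycles spent waiting for cache line refills.

    Assumes (illustrative model only) that each miss blocks the CPU for
    the whole refill and that misses do not overlap.
    """
    stall_cycles = misses_per_instr * miss_penalty_cycles
    return stall_cycles / (base_cpi + stall_cycles)


if __name__ == "__main__":
    # Hypothetical figures, not measurements.
    for m, p in [(0.02, 50), (0.09, 100)]:
        print(f"misses/instr={m}, penalty={p} cycles: "
              f"{stall_fraction(m, p):.0%} of time stalled")
```

- Under this model the stalled fraction depends only on the product of the
- miss rate and the refill penalty: a product of 1 cycle per instruction
- gives one half, and a product of 9 gives nine tenths.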
-
- This is another reason why I think hyperscalar is premature: a vector
- instruction has the very nice property that it implies a very definite
- memory access pattern, whereas a loop that does the same thing leaves
- the hardware to guess the pattern. And many important applications have
- FIFO (streaming) data reference patterns, for which predictive memory
- accesses are essential, and adaptive ones, like those implied by a
- cache, are fatal:
-
- mccalpin> Many "cpu-bound" programs spend half or more of their time in
- mccalpin> this state, and it is very easy to accidentally build a
- mccalpin> program which spends 90% of its time waiting for cache line
- mccalpin> refills.
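-
- A toy sketch of the FIFO point above, under loud assumptions: a fully
- associative LRU cache, a data set swept repeatedly from start to end, and
- an idealized next-line predictor whose fetch latency is ignored. All the
- names (lru_hits, sweep_trace, next_line_prefetch_hits) are hypothetical
- and exist only for this illustration.

```python
from collections import OrderedDict


def sweep_trace(n_lines, passes):
    """Line addresses for `passes` sequential sweeps over `n_lines` lines."""
    return [i for _ in range(passes) for i in range(n_lines)]


def lru_hits(trace, capacity):
    """Count hits in a fully associative LRU cache holding `capacity` lines."""
    cache = OrderedDict()
    hits = 0
    for line in trace:
        if line in cache:
            hits += 1
            cache.move_to_end(line)        # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used
            cache[line] = True
    return hits


def next_line_prefetch_hits(trace):
    """Count accesses that an idealized 'fetch line+1 next' predictor covers.

    Assumption: the predicted line always arrives in time (latency ignored).
    """
    return sum(1 for prev, cur in zip(trace, trace[1:]) if cur == prev + 1)


if __name__ == "__main__":
    trace = sweep_trace(n_lines=5, passes=3)   # data set one line too big
    print("LRU hits, 4-line cache:", lru_hits(trace, capacity=4))
    print("Predicted accesses    :", next_line_prefetch_hits(trace))
```

- With a data set just one line larger than the cache, LRU evicts each line
- shortly before it is needed again, so the adaptive policy never hits,
- while the predictor misses only at the start of each sweep.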
- --
- Piercarlo Grandi, Dept of CS, PC/UW@Aberystwyth <pcg@aber.ac.uk>
- And the Italian sang, and sang. And his desperate pleas reached
- the ears of his divine protector, the god of jokes
-