- Xref: sparky comp.arch:12012 comp.sys.intel:2854
- Path: sparky!uunet!pipex!bnr.co.uk!uknet!gdt!aber!fronta.aber.ac.uk!pcg
- From: pcg@aber.ac.uk (Piercarlo Grandi)
- Newsgroups: comp.arch,comp.sys.intel
- Subject: Re: Superscalar vs. multiple CPUs ?
- Message-ID: <PCG.92Dec31115803@decb.aber.ac.uk>
- Date: 31 Dec 92 11:58:03 GMT
- References: <PCG.92Dec11162630@aberdb.aber.ac.uk> <1992Dec21.134531.3253@athena.mit.edu>
- <PCG.92Dec23144916@decb.aber.ac.uk> <Bzpzwq.18q@news.udel.edu>
- <PCG.92Dec27201257@decb.aber.ac.uk> <38194@cbmvax.commodore.com>
- Sender: news@aber.ac.uk (USENET news service)
- Reply-To: pcg@aber.ac.uk (Piercarlo Grandi)
- Organization: Prifysgol Cymru, Aberystwyth
- Lines: 54
- In-Reply-To: jesup@cbmvax.commodore.com's message of 29 Dec 92 20:02:47 GMT
- Nntp-Posting-Host: decb.aber.ac.uk
-
- On 29 Dec 92 20:02:47 GMT, jesup@cbmvax.commodore.com (Randell Jesup) said:
-
- jesup> (Piercarlo Grandi) writes:
-
- pcg> This is another reason for which I think hyperscalar is premature:
- pcg> a vector instruction has the very nice property that it implies a
- pcg> very definite memory access pattern, as compared with a loop that
- pcg> does the same thing. And many important applications have FIFO data
- pcg> reference patterns, for which predictive memory accesses are
- pcg> essential, and adaptive ones, like those implied by a cache, are
- pcg> fatal:
-
- jesup> This is one reason I've been advocating smarter caches,
- jesup> particularly the ability to do predictive pre-fetching. There
- jesup> are a couple of ways to set this up: [ ... ]
-
- But should they be called 'caches' then? If they become memory queues,
- should they not be called 'memory queues'? Or maybe 'vector registers'?
-
- You seem to agree that making a cache double as an implicit set of
- memory queues a la RS/6000 can be fatal; but you still seem to like
- making the cache become an *explicit* set of memory queues.
-
- Why not separate the two? They are used for such different purposes, and
- in such different ways! I find it fairly odd to have a single bit of hw
- be both a vast LRU repository of recently accessed data and a small
- funnel for prefetching.
-
- Maybe you could argue that in this way the queues would not be exposed,
- other than in the instruction set. But the trend, as I understand it, is
- towards exposing the naked hardware ever more to the lusty gaze of the
- compiler :-).
-
- jesup> 1. Instruction sets address, bound, and perhaps amount to
- jesup> fetch. [ ... ]
- jesup> 2. Load instruction encodes prefetch-enable and bounding
- jesup> size in the instruction. [ ... ]
- jesup> 3. Instruction sets register, bound and perhaps size, and
- jesup> any load relative to that register causes a prefetch [ ... ]
- jesup> 4. Prefetch instruction is executed earlier in the
- jesup> instruction stream to fetch a location. [ ... ]
-
- These look like a fairly OK set of ideas, but in some sense they are all
- too "high level". Suppose that there are in effect several memory queues;
- then multiple prefetch streams could be outstanding. With the four ideas
- above the CPU would have to mask this completely, which takes a bit of
- bookkeeping. With explicit queues (or something like vector registers)
- things are a bit simpler. On the other hand maybe the difficulty is less
- than I imagine.
- --
- Piercarlo Grandi, Dept of CS, PC/UW@Aberystwyth <pcg@aber.ac.uk>
- E l'italiano cantava, cantava. E le sue disperate invocazioni giunsero
- alle orecchie del suo divino protettore, il dio della barzelletta