- Xref: sparky comp.arch:12012 comp.sys.intel:2854
- Path: sparky!uunet!pipex!bnr.co.uk!uknet!gdt!aber!fronta.aber.ac.uk!pcg
- From: pcg@aber.ac.uk (Piercarlo Grandi)
- Newsgroups: comp.arch,comp.sys.intel
- Subject: Re: Superscalar vs. multiple CPUs ?
- Message-ID: <PCG.92Dec31115803@decb.aber.ac.uk>
- Date: 31 Dec 92 11:58:03 GMT
- References: <PCG.92Dec11162630@aberdb.aber.ac.uk> <1992Dec21.134531.3253@athena.mit.edu>
- <PCG.92Dec23144916@decb.aber.ac.uk> <Bzpzwq.18q@news.udel.edu>
- <PCG.92Dec27201257@decb.aber.ac.uk> <38194@cbmvax.commodore.com>
- Sender: news@aber.ac.uk (USENET news service)
- Reply-To: pcg@aber.ac.uk (Piercarlo Grandi)
- Organization: Prifysgol Cymru, Aberystwyth
- Lines: 54
- In-Reply-To: jesup@cbmvax.commodore.com's message of 29 Dec 92 20:02:47 GMT
- Nntp-Posting-Host: decb.aber.ac.uk
-
- On 29 Dec 92 20:02:47 GMT, jesup@cbmvax.commodore.com (Randell Jesup) said:
-
- jesup> (Piercarlo Grandi) writes:
-
- pcg> This is another reason for which I think hyperscalar is premature:
- pcg> a vector instruction has the very nice property that it implies a
- pcg> very definite memory access pattern, as compared with a loop that
- pcg> does the same thing. And many important applications have FIFO data
- pcg> reference patterns, for which predictive memory accesses are
- pcg> essential, and adaptive ones, like those implied by a cache, are
- pcg> fatal:
-
- jesup> This is one reason I've been advocating smarter caches,
- jesup> particularly the ability to do predictive pre-fetching. There
- jesup> are a couple of ways to set this up: [ ... ]
-
- But should they be called 'caches' then? If they become memory queues,
- should they not be called 'memory queues'? Or maybe 'vector registers'?
-
- You seem to agree that making a cache double as an implicit set of
- memory queues a la RS/6000 can be fatal; but you still seem to like
- making the cache become an *explicit* set of memory queues.
-
- Why not separate the two? They are used for such different purposes, and
- in such different ways! I find it fairly odd to have a single bit of hw
- be both a vast LRU repository of recently accessed data and a small
- funnel for prefetching.
-
- Maybe you could argue that in this way the queues would not be exposed,
- other than in the instruction set. But the trend, as I understand it, is
- towards exposing the naked hardware ever more to the lusty gaze of the
- compiler :-).
-
- jesup> 1. Instruction sets address, bound, and perhaps amount to
- jesup> fetch. [ ... ]
- jesup> 2. Load instruction encodes prefetch-enable and bounding
- jesup> size in the instruction. [ ... ]
- jesup> 3. Instruction sets register, bound and perhaps size, and
- jesup> any load relative to that register causes a prefetch [ ... ]
- jesup> 4. Prefetch instruction is executed earlier in the
- jesup> instruction stream to fetch a location. [ ... ]
-
- These look like a fairly OK set of ideas, but in some sense they are all
- too "high level". Suppose that there are in effect several memory queues;
- then multiple prefetch streams could be outstanding. With the four ideas
- above the CPU would have to mask this completely, which takes a bit of
- bookkeeping. With explicit queues (or something like vector registers)
- things are a bit simpler. On the other hand maybe the difficulty is less
- than I imagine.
- --
- Piercarlo Grandi, Dept of CS, PC/UW@Aberystwyth <pcg@aber.ac.uk>
- E l'italiano cantava, cantava. E le sue disperate invocazioni giunsero
- alle orecchie del suo divino protettore, il dio della barzelletta