Path: sparky!uunet!littlei!hfglobe!chnews!chnews!doconnor
From: doconnor@sedona.intel.com (Dennis O'Connor)
Newsgroups: comp.arch
Subject: Re: RISC goes CISC?
Message-ID: <DOCONNOR.92Nov6164041@potato.sedona.intel.com>
Date: 6 Nov 92 16:40:41 GMT
References: <1992Nov6.092012.19239@rhein-main.de>
Organization: Intel i960(tm) Architecture
Lines: 61
NNTP-Posting-Host: potato.intel.com
In-reply-to: vhs@rhein-main.de's message of Fri, 6 Nov 92 09:20:12 GMT

vhs@rhein-main.de (Volker Herminghaus-Shirai) writes:
] I was wondering lately...
] Wasn't one of the principles of RISC to move as much work as possible
] from execution time to compile time?

A better idea is to have the compiler do what it can do well, and let
the hardware do what it can do well. For example, compilers can do
register lifetime analysis well, but hardware does a much better
job handling interlocks from operations with variable completion times.

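To make the split concrete: register lifetime analysis is something the
compiler can do with a simple scan over each basic block, entirely at
compile time. A toy sketch in Python follows; the instruction encoding
and register names are invented for illustration, not any real
compiler's internals:

def live_ranges(block):
    """block: list of (dest, sources) per instruction.
    Returns {reg: (first_def, last_use)} -- each register's lifetime.
    Registers whose lifetimes don't overlap can share a hardware reg."""
    ranges = {}
    for i, (dest, srcs) in enumerate(block):
        ranges.setdefault(dest, [i, i])          # definition opens a lifetime
        for s in srcs:
            ranges.setdefault(s, [i, i])[1] = i  # a use extends it
    return {r: tuple(v) for r, v in ranges.items()}

block = [("r1", []),            # r1 = load ...
         ("r2", ["r1"]),        # r2 = r1 + 1
         ("r3", ["r1", "r2"])]  # r3 = r1 * r2
print(live_ranges(block))       # {'r1': (0, 2), 'r2': (1, 2), 'r3': (2, 2)}

The run-time half -- how long that load actually takes on this
particular execution -- is exactly what the compiler can't see, which
is why the interlocks belong in hardware.
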
] Under this assumption, how does e.g. SuperSPARC fit this principle? As far
] as I know the processor does a lot of cross-checking between pipelines,
] squashing instructions depending on non-trivial conditions, etc. Is that done
] just to maintain compatibility with previous implementations?

It's an implementation trying to be a correct interpreter of its ISA
(Instruction Set Architecture). Otherwise, binaries for one Sparc
might not work on another Sparc.

] Could, theoretically, the most recent compilers take care of this
] at compile time, thus eliminating the need for run-time
] checking (again theoretically, since people want to run their old code)?

Depends. Generally, scoreboarding in the hardware produces better
performance (when coupled with compiler reordering of the instructions
to minimize time spent waiting for results) than adding the worst-case
number of no-ops (which will vary from system to system).

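A back-of-the-envelope model of why, in Python. The hit/miss latencies
and the hit ratio are invented numbers, not any real machine's:

import random

random.seed(1)
N = 1000
HIT, MISS = 1, 10    # invented load latencies, in cycles
WORST = MISS         # a static schedule must pad every load for this

lats = [HIT if random.random() < 0.9 else MISS for _ in range(N)]

interlocked = sum(lats)   # hardware stalls only as long as each load takes
padded = N * WORST        # worst-case no-ops burn the maximum every time

print(interlocked, padded)   # roughly 1900 vs 10000 cycles of waiting

And the padded number is also wrong the moment the binary moves to a
system with a different worst case, which the interlocked one never is.
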
] Wasn't the architectural(?) definition of using exactly one delay slot
] extremely short-sighted, taking into account the much longer pipelines in
] current implementations?

Gack. Are we talking branch delays or load result-ready delays? Longer
pipelines often don't affect the branch anyway (if it's done in
its own piece of special hardware), although pipelining the I-cache
or instruction decode will. It all depends. Personally, I don't
like architecturally-defined delay slots anyway: there are usually
ways to get rid of them (speculative issue for branches, load
queues with hardware interlocks for loads).

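For anyone who hasn't met delay slots: the instruction after the branch
executes whether or not the branch is taken, so the compiler tries to
hoist something useful into that slot. A toy interpreter in Python that
shows the effect (the encoding is made up):

def run(prog):
    """prog: list of (op, arg); op 'b' branches to index arg,
    but only after the following instruction (the delay slot) runs."""
    pc, trace = 0, []
    while pc < len(prog):
        op, arg = prog[pc]
        if op == "b":
            trace.append(prog[pc + 1])  # delay slot always executes
            pc = arg                    # ...then the branch takes effect
        else:
            trace.append((op, arg))
            pc += 1
    return trace

prog = [("add", 1),
        ("b", 4), ("add", 2),  # ("add", 2) sits in the delay slot
        ("add", 3),            # skipped by the taken branch
        ("add", 4)]
print(run(prog))  # [('add', 1), ('add', 2), ('add', 4)]

Baking exactly one such slot into the ISA forever is the commitment
being objected to above; the hardware alternatives just listed hide the
delay instead of exposing it.
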
] Aren't pipelines necessary so that instructions that take several cycles
] (i.e. "complicated" instructions) can be issued one per cycle?

Pipelines trade latency for throughput, at the cost of complexity.

] Wouldn't it conform more to the RISC principle to keep instructions so
] simple that they only *need* one cycle to execute and concentrate on a *fast*
] memory interface instead?

Actually, one of the claimed advantages of RISC architectures is that
they are easy to pipeline. A machine with a 5-stage pipeline, if you
took the pipelining out, would probably issue instructions at a third
or so of its original rate, because it would need a much slower clock.

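Rough arithmetic behind that guess, in Python, with invented stage
delays; it also shows the latency-for-throughput trade from a few
paragraphs up:

stage_logic = [8, 10, 6, 7, 9]  # ns of logic per stage -- invented numbers
latch = 2                       # ns of latch/setup overhead per clock

pipelined = max(stage_logic) + latch    # clock is set by the slowest stage
unpipelined = sum(stage_logic) + latch  # clock must cover all the logic

print(pipelined, unpipelined, pipelined / unpipelined)
# 12 42 0.2857... -- the unpipelined machine issues at under 1/3 the rate

# The flip side is the latency cost: an instruction takes 5 * 12 = 60 ns
# to get through the pipe, versus 42 ns straight through without it.
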
] Will RISC go the way that CISC went?

RISC (in micros) went there (pipelined) first, actually.
--
Dennis O'Connor      doconnor@sedona.intel.com