- Xref: sparky comp.sys.sgi:18080 comp.lang.forth:3634 comp.benchmarks:1875
- Newsgroups: comp.sys.sgi,comp.lang.forth,comp.benchmarks
- Path: sparky!uunet!europa.asd.contel.com!emory!sol.ctr.columbia.edu!venezia!penev
- From: penev@venezia (Penio Penev)
- Subject: Re: Comparison of R3000 and Intel 386 and 486's
- References: <1992Dec16.055418.12364@ringer.cs.utsa.edu>
- Sender: nobody@ctr.columbia.edu
- Organization: Rockefeller University
- Date: Wed, 16 Dec 1992 11:08:36 GMT
- X-Newsreader: TIN [version 1.1 PL6]
- Message-ID: <1992Dec16.110836.7376@sol.ctr.columbia.edu>
- Reply-To: penev@venezia.rockefeller.edu
- X-Posted-From: venezia.rockefeller.edu
- NNTP-Posting-Host: sol.ctr.columbia.edu
- Lines: 64
-
- David M. Senseman (senseman@ethel.brainlab.utsa.edu) wrote:
- :
- : Can anyone send me "speed comparisons" between R3000 and Intel's 386/486
- : chips? (yeah, I know, ......).
-
- I just wrote FORTH (assembler, interpreter, compiler) for the R3000
- (actually for the MIPS RISC family). I did this for the i486 (and
- 386) protected mode in April, so my memories are fresh.
-
- I'll compare them in their 33 MHz versions, running out of cache; the
- numbers are in clocks. Numbers in brackets apply if you cannot fill the
- delay slot (which is _rarely_ the case).
-
-                               i486              Rx000
- Memory fetches:               1                 1 (2)
- Register moves:               1                 1
- Arithmetic
-   out of regs:                2                 1
-   out of memory:              3                 1+1
- Jumps:                        3                 1 (2)
- Subroutine call/ret:          6                 6, sometimes 4
- Memory moves:                 (5? 6? 7)         6 (per 32-bit word)
- Loops:                        (5?)              2
- Stack operations:             (1? 2)/word       1/word + 1
- Multiply/divide:              11 mul / ? div    12 mul / 35 div, +1-2 for getting 1-2 results to regs
- FPU +/-:                      ?                 2 single / 3 double (R3000)
- FPU *:                        ?                 4 single / 5 double
- FPU /:                        ?                 12 single / 19 double
- Moves to/from FPU:            (10?)             1, +1-2 if convert is needed
- Cache miss penalty to
-   secondary cache:            3                 no such (R3000)
-                                                 ? (R4000)
- Memory access out
-   of cache:                   memory and bus dependent
-
- The ?s are lapses of memory.
-
- The R3000 can accommodate 32+32 KB of cache, which _greatly_ helps.
- The i486 has 8 KB (which it uses somewhat better, because of its shorter instructions).
-
- The larger cache on the Rx000 allows it to execute out of cache most of
- the time. When the i486 has to go outside its cache, it slows down by a factor of 2-4.
- The Rx000 also has a lot of registers, which act as a very selective and effective cache.
-
- The actual gain depends a lot on the type of task the processor is running.
- One should consider which operations predominate in one's application.
- For mine - FORTH, which is a very structured and stack-intensive language -
- I find the gain to be somewhere between 1.5 and 3 times.
- For one of my scientific tasks (Monte Carlo simulation with lots of pointers and
- tables >8K and <32K), it is about 5-7 times.
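
One way to act on "consider which operations predominate" is to weight the per-operation clock counts from the table by how often each operation occurs. The following is only a rough sketch: the clock figures are copied from the table (delay slots assumed filled, and the i486 loop figure is the uncertain "(5?)"), while the operation mix and all names (`CLOCKS`, `EXAMPLE_MIX`, `clocks_for`, `speed_ratio`) are made up for illustration.

```python
# Clocks per operation, taken from the table in the post
# (33 MHz parts, running out of cache, delay slots assumed filled).
# The i486 "loop" value is marked "(5?)" in the post, so it is uncertain.
CLOCKS = {
    "fetch":    {"i486": 1, "rx000": 1},
    "reg_move": {"i486": 1, "rx000": 1},
    "alu_reg":  {"i486": 2, "rx000": 1},
    "jump":     {"i486": 3, "rx000": 1},
    "call_ret": {"i486": 6, "rx000": 6},
    "loop":     {"i486": 5, "rx000": 2},
}

# Hypothetical operation counts per 100 operations; NOT measured data.
EXAMPLE_MIX = {
    "fetch": 30,
    "reg_move": 20,
    "alu_reg": 25,
    "jump": 10,
    "call_ret": 10,
    "loop": 5,
}


def clocks_for(mix, cpu):
    """Total clocks for an operation mix on the given CPU column."""
    return sum(count * CLOCKS[op][cpu] for op, count in mix.items())


def speed_ratio(mix):
    """i486 clocks divided by Rx000 clocks for the same mix."""
    return clocks_for(mix, "i486") / clocks_for(mix, "rx000")


if __name__ == "__main__":
    print(clocks_for(EXAMPLE_MIX, "i486"), clocks_for(EXAMPLE_MIX, "rx000"))
    print(round(speed_ratio(EXAMPLE_MIX), 2))
```

With these invented counts the sketch gives 215 clocks on the i486 against 155 on the Rx000. It deliberately ignores cache misses, which the post identifies as the larger effect, so it says nothing about the 5-7 times case.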
-
- For integer multiply/divide the i486 is a lot more convenient. It has a 64/32=32
- bit divide mode, which makes 96(128)/32=64(96) division of very high precision
- integers a piece of cake. I love this instruction. On the Rx000 you must, for such
- wide dividends, use the brute-force binary divide algorithm, which is slow as hell (comparatively).
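
To illustrate the difference, here is a small model of both approaches, using Python integers in place of machine registers. `div64by32` is a hypothetical stand-in for a 64/32=32 divide step (quotient and remainder in one operation); `div_limbs` chains it over 32-bit limbs the way schoolbook long division does; `div_shift_subtract` is a generic restoring (shift-and-subtract) binary division. None of this is the code from the FORTH implementations mentioned above, only a sketch of the two techniques.

```python
MASK32 = 0xFFFFFFFF


def div64by32(hi, lo, d):
    """Model of one 64/32 -> 32 divide step: (hi:lo) / d.

    Requires hi < d, which guarantees the quotient fits in 32 bits.
    Returns (quotient, remainder).
    """
    assert 0 < d <= MASK32 and 0 <= hi < d and 0 <= lo <= MASK32
    n = (hi << 32) | lo
    return n // d, n % d


def div_limbs(limbs, d):
    """Divide a multi-limb integer by a 32-bit divisor.

    limbs: 32-bit limbs, most significant first.
    The remainder of each step becomes the high word of the next step,
    so one divide step is spent per limb.
    Returns (quotient limbs, remainder).
    """
    rem = 0
    quotient = []
    for limb in limbs:
        q, rem = div64by32(rem, limb, d)
        quotient.append(q)
    return quotient, rem


def div_shift_subtract(n, d, bits):
    """Restoring binary division: one quotient bit per iteration."""
    q = 0
    rem = 0
    for i in range(bits - 1, -1, -1):
        rem = (rem << 1) | ((n >> i) & 1)  # bring down the next dividend bit
        q <<= 1
        if rem >= d:                       # divisor fits: subtract, set bit
            rem -= d
            q |= 1
    return q, rem


def limbs_to_int(limbs):
    """Join 32-bit limbs (most significant first) into one integer."""
    value = 0
    for limb in limbs:
        value = (value << 32) | limb
    return value
```

For a 96-bit dividend the limb version needs at most three divide steps (the first can be skipped when the top limb is smaller than the divisor, since its quotient limb is then zero), whereas the shift-and-subtract loop runs 96 iterations.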
-
- For floating point, I think the R3000 is a bit faster, but by less than a factor of 2.
-
- I'd be interested to hear other comments, especially from folks who have
- programmed both processors.
-
- -- Penio.
-
-
-