- Path: sparky!uunet!wupost!sdd.hp.com!spool.mu.edu!olivea!sgigate!odin!twilight!zuni!anchor!olson
- From: olson@anchor.esd.sgi.com (Dave Olson)
- Newsgroups: comp.sys.sgi
- Subject: Re: Indigo R4000 vs. R3000 4D/35
- Message-ID: <uk0r1d4@zuni.esd.sgi.com>
- Date: 8 Jan 93 08:50:10 GMT
- References: <1993Jan7.142834.7127@jarvis.csri.toronto.edu> <1993Jan7.213506.6454@Princeton.EDU>
- Sender: news@zuni.esd.sgi.com (Net News)
- Organization: Silicon Graphics, Inc. Mountain View, CA
- Lines: 52
-
- In <1993Jan7.213506.6454@Princeton.EDU> awolfe@moo.Princeton.EDU (Andrew Wolfe) writes:
- | In article <1993Jan7.142834.7127@jarvis.csri.toronto.edu>, corkum@csri.toronto.edu (Brent Thomas Corkum) writes:
- | |> We just installed a R4000 upgrade in an Indigo and after running an in house application
- | |> that is completely cpu bound I found that it was only 1.8 times faster on the Indigo
- | |> than on a R3000 4D/35. Now, both have the same amount of memory and the application
- | |> doesn't use that much. Is this right? Do I need to recompile on the R4000 with some
- | |> magic compiler option (the code is in K&R C)? I'm using the:
- | |>
- | |> -cckr -float -O2
- | |>
- | |> command line options.
- | |>
- | |> Brent
- |
- |
- | Sounds about right to me...
-
- This I more or less agree with.
-
- | Remember - even though the CPU is about 3-4 times faster, other factors come
- | into play.
- |
- | Examples:
- |
- | Main memory is approximately the same speed
-
- Nope, the R4K memory system is much faster, and 64 bits wide vs 32
- (datapath to memory).
-
- | Primary caches are smaller
-
- True, and this can offset some of the other gains.
-
- | This means that loads/stores may not speed up if you have little locality.
-
- They should, particularly stores, because the cache is so much larger,
- and writeback vs writethrough. Of course, a *lot* depends on the
- data access patterns.
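
To illustrate how much the access pattern can matter, here is a minimal sketch in C, not taken from the original application. It sums the same row-major matrix two ways: once walking consecutive addresses, once with a stride of N doubles between accesses. The names `sum_row_order`, `sum_col_order`, and the size `N` are hypothetical, chosen only for this example.

```c
#include <stddef.h>

#define N 512  /* hypothetical matrix dimension, for illustration only */

/* Sum a row-major N x N matrix row by row: consecutive addresses,
   so each cache line that is fetched gets fully used. */
double sum_row_order(const double *a)
{
    double s = 0.0;
    size_t i, j;

    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            s += a[i * N + j];
    return s;
}

/* Sum the same matrix column by column: successive accesses are
   N doubles apart, so there is little spatial locality. */
double sum_col_order(const double *a)
{
    double s = 0.0;
    size_t i, j;

    for (j = 0; j < N; j++)
        for (i = 0; i < N; i++)
            s += a[i * N + j];
    return s;
}
```

Both functions read exactly the same elements; only the order differs. Timing each on the two machines would give a rough idea of how sensitive a given workload is to cache behaviour rather than raw CPU speed.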
-
- | Also - you imply that you have a lot of floating-point.
- | Note that the latency of an FP add only changes from ~55ns to 40ns.
- | double prec. FP mult changes from ~140ns to 80ns.
-
- And it is this that indicates there is a good chance of -mips2 really
- helping (as others indicated), since it does 64 bit loads and stores
- of double-precision floats, rather than a pair of 32-bit loads/stores.
-
- Note that you need the 3.10 compilers (4.1 IDO) to use -mips2.
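
As a rough sketch of the kind of code where this would matter (an assumed example, not the poster's application), consider a double-precision loop in which every iteration loads two doubles and stores one. The function name `daxpy` and its parameters are illustrative only.

```c
#include <stddef.h>

/* y[i] = alpha * x[i] + y[i], all in double precision.
   Each iteration performs two 8-byte loads and one 8-byte store. */
void daxpy(size_t n, double alpha, const double *x, double *y)
{
    size_t i;

    for (i = 0; i < n; i++)
        y[i] = alpha * x[i] + y[i];
}
```

Per the explanation above, building such a loop with -mips2 added to the compiler options should let each of those accesses be a single 64-bit load or store instead of a pair of 32-bit ones. How much that helps in practice depends on how much of the run time goes to memory traffic of this kind.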
- --
- Let no one tell me that silence gives consent, | Dave Olson
- because whoever is silent dissents. | Silicon Graphics, Inc.
- Maria Isabel Barreno | olson@sgi.com
-