- Path: sparky!uunet!olivea!sgigate!odin!fido!knobi.munich.sgi.com!knobi
- From: knobi@knobi.munich.sgi.com (Martin Knoblauch)
- Newsgroups: comp.sys.sgi
- Subject: Re: Indigo R4000 vs. R3000 4D/35
- Message-ID: <1ijjo9INN6mf@fido.asd.sgi.com>
- Date: 8 Jan 93 10:07:05 GMT
- References: <1993Jan7.142834.7127@jarvis.csri.toronto.edu>
- Organization: Silicon Graphics, Inc.
- Lines: 79
- NNTP-Posting-Host: knobi.munich.sgi.com
-
- In article <1993Jan7.142834.7127@jarvis.csri.toronto.edu>,
- corkum@csri.toronto.edu (Brent Thomas Corkum) writes:
- |> We just installed a R4000 upgrade in an Indigo and after running an
- |> in house application
- |> that is completely cpu bound I found that it was only 1.8 times
- |> faster on the Indigo
- |> than on a R3000 4D/35. Now, both have the same amount of memory and
- |> the application
- |> doesn't use that much. Is this right? Do I need to recompile on the
- |> R4000 with some
- |> magic compiler option (the code is in kr C)? I'm using the:
- |>
- |> -cckr -float -O2
- |>
- |> command line options.
- |>
- |> Brent
- Brent,
-
- first of all, I would say that a factor of 1.8 between a 50 MHz R4K
- and a 36 MHz R3K is not bad at all for an image that has not been
- specially optimized for the R4K. Remember that you should look at
- your computer as a whole system and not just the CPU. I/O speed and
- memory access are very important and usually do not double in
- performance when you double the performance of your CPU.
-
- When you compare the different published performance numbers for the
- R4K Indigo and the 4D/35 you will find the following factors:
-
- MIPS (R4K vs. 4D/35):  85 / 33 = 2.58
- MFLOPS:                16 / 6  = 2.67
- SPECmarks (89/92 ?):   70 / 31 = 2.26
-
- That means your speedup of 1.8 is between 67% (FPU) and 80% (Spec ==
- "real application") of what you can achieve in the "best" case. Not too
- bad. You did not mention which compiler version you are using. With
- IRIX 4.0.5 there are two flavours of the SGI/MIPS compilers: 2.20 and
- 3.10. The 3.10 compilers yield better results and were in fact used to
- produce the published specmark results. If you have the 2.20 version
- and want to get the 3.10 version, contact your local SGI office.
-
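The percentages quoted above come from dividing the observed speedup by each published ratio. A minimal sketch of that arithmetic in C follows. The function names are made up for illustration, and the code is written in ANSI style, so it would need adapting for a compile with -cckr:

```c
#include <assert.h>
#include <stdio.h>

/* Ratio of two published figures: new machine divided by old machine. */
double ratio(double r4k, double r3k)
{
    assert(r3k > 0.0);
    return r4k / r3k;
}

/* Observed speedup expressed as a percentage of a published ratio. */
double percent_of_ratio(double observed, double published)
{
    assert(published > 0.0);
    return 100.0 * observed / published;
}

/* Print one line per benchmark, using the figures quoted in the post. */
void print_report(double observed)
{
    double mips   = ratio(85.0, 33.0);
    double mflops = ratio(16.0, 6.0);
    double spec   = ratio(70.0, 31.0);

    printf("MIPS      ratio %.2f  achieved %.1f%%\n",
           mips, percent_of_ratio(observed, mips));
    printf("MFLOPS    ratio %.2f  achieved %.1f%%\n",
           mflops, percent_of_ratio(observed, mflops));
    printf("SPECmark  ratio %.2f  achieved %.1f%%\n",
           spec, percent_of_ratio(observed, spec));
}
```

Calling print_report(1.8) from a main() prints the three ratios next to the share of each that a speedup of 1.8 represents.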
- To get more speedup you can try a few things:
-
- a) use '-mips2'. This makes use of special hardware instructions
- of the R4K processor (double word (8 byte) load/store, sqrt) and
- can give some more speed, especially if you use a lot of sqrt
- operations, or move a lot of double variables around.
- b) The primary cache of the R4K (8K+8K) is considerably smaller than
- the cache of the 4D/35 (64K+64K). You should analyze your program
- for its CPU usage pattern and look for code where you can
- improve the usage of the cache, usually in the inner loop of
- the algorithm. You should reuse a small amount of data as often
- as possible, as opposed to the 'vector strategy', where you try to
- apply a small piece of code to as much data as possible. Programs
- that run excellently on Crays or other vector supercomputers
- usually need optimization for cache-based machines.
- c) to help you with b), you might want to try the '-sopt' options to
- f77 and cc (3.10 only). These options invoke a special source code
- optimizer that can help you with things like 'loop unrolling' and
- 'blocking', techniques that usually improve the performance of
- cache-based architectures. Be careful: the optimizer rearranges
- your source code, and the results are not always correct. Usually
- you should apply it only to the really CPU-critical parts of your
- program, not to the whole source.
-
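To make the 'blocking' idea from b) and c) concrete, here is a hand-written sketch in C. It is not the output of '-sopt', only an illustration of the technique. The matrix size N and the tile size TILE are assumptions picked so that three 16x16 tiles of doubles (6 KB) fit within an 8 KB primary data cache. The code is ANSI C, so it would need adapting for a compile with -cckr:

```c
#include <assert.h>
#include <stddef.h>

#define N    64   /* matrix dimension, chosen only for illustration */
#define TILE 16   /* assumed tile size: 3 tiles * 16*16 * 8 bytes = 6 KB */

/* Plain triple loop: every row of a walks through all of b,
 * so b is evicted from a small cache long before it is reused. */
void matmul_naive(double a[N][N], double b[N][N], double c[N][N])
{
    size_t i, j, k;

    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++) {
            double sum = 0.0;
            for (k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
}

/* Blocked version: the three inner loops touch only one TILE x TILE
 * block each of a, b and c, so the data is reused while still cached. */
void matmul_blocked(double a[N][N], double b[N][N], double c[N][N])
{
    size_t i, j, k, ii, jj, kk;

    assert(N % TILE == 0);   /* the sketch does not handle partial tiles */

    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            c[i][j] = 0.0;

    for (ii = 0; ii < N; ii += TILE)
        for (kk = 0; kk < N; kk += TILE)
            for (jj = 0; jj < N; jj += TILE)
                for (i = ii; i < ii + TILE; i++)
                    for (k = kk; k < kk + TILE; k++) {
                        double aik = a[i][k];
                        for (j = jj; j < jj + TILE; j++)
                            c[i][j] += aik * b[k][j];
                    }
}
```

Both routines compute the same product; only the order in which memory is touched differs. Whether the blocked form is actually faster, and which tile size works best, depends on the machine and the compiler, so it is worth timing both on the real data.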
- I hope this helps to see your speedup of 1.8 in the right
- perspective.
-
- Martin
- --
- +---------------------------------+----------------------------------+
- |Martin Knoblauch | Silicon Graphics GmbH |
- |Application Support | Am Hochacker 3 - Technopark |
- |Silicon Graphics Computer Systems| W-8011 Grasbrunn-Neukeferloh, FRG|
- | | Phone: (+int) 89 46108-179 or -0 |
- | | Fax: (+int) 89 46108-222 |
- +---------------------------------+----------------------------------+
- |Network: <knobi@sgi.com> | V-Mail: 8935 | M/S: IMU-315 |
- +--------------------------------------------------------------------+
-