- Path: sparky!uunet!caen!sdd.hp.com!mips!mash
- From: mash@mips.com (John Mashey)
- Newsgroups: comp.sys.sgi
- Subject: Re: R4000 compiler directive, is there one ???
- Date: 13 Aug 1992 16:39:05 GMT
- Organization: MIPS Computer Systems, Inc.
- Lines: 80
- Message-ID: <l8l419INNj3e@spim.mips.com>
- References: <ob1oc18@zuni.esd.sgi.com> <1992Aug12.195518.12380@megatek.uucp> <ogn3in4@zuni.esd.sgi.com>
- NNTP-Posting-Host: winchester.mips.com
-
- In article <ogn3in4@zuni.esd.sgi.com> olson@anchor.esd.sgi.com (Dave Olson) writes:
- >| I seem to remember seeing in some version or other of the compiler
- >| (sorry, I don't know if it was beta, release, or what) that there
- >| was a switch -r4000. The switch had no effect on the output code,
- >| but was used by the instruction scheduler in the assembler.
- >| To check to see if you have it (other than just trying it) you can
- >| do a strings /usr/bin/cc|grep 4000 and see what shows up.
-
- >It is passed on to the assembler untouched, and the comments there
- >lead me to believe your recollection about instruction scheduling
- >is correct, but I don't know enough about compilers to know what
- >exactly (if anything!) it is doing.
-
- As I noted earlier, the compilers themselves typically know little or
- nothing about the specific chip type they are targeting; they generate
- some pseudo-ops to the assembler, which then converts them to the best code
- for the specific chip if requested. Anyway, there are 2 orthogonal issues,
- neither of which matters much to people producing code to run on every
- machine, as they should avoid these features. People seeking max performance
- on a specific machine, or doing kernel or shared-library work where
- the code can be tuned, have some interest in this.
-
- -mips2 says that the assembler should use the -mips2 instructions, such
- as load/store double floating, square root, branch-likely, etc.
- These instructions are supported by R4000s *and* R6000s.
- If any such instructions are used, the resulting code will not generally
- run on R3000s. (I say, not generally, because occasionally, there are
- binaries that see which CPU they are on, and do the right thing. For example
- a sqrt function can dynamically decide to use the old code or else
- just use the sqrt instruction.)
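-
- The run-time dispatch idea in the parenthesis above can be sketched as
- follows. This is only an illustration under stated assumptions: the
- cpu_has_sqrt_instruction argument stands in for whatever CPU probe a real
- binary would use (hypothetical here), math.sqrt stands in for the hardware
- sqrt instruction, and newton_sqrt stands in for the "old code".

```python
import math


def newton_sqrt(x):
    """Software fallback (stand-in for the 'old code'): Newton's iteration."""
    if x < 0:
        raise ValueError("square root of a negative number")
    if x == 0 or math.isnan(x) or math.isinf(x):
        return float(x)
    guess = max(x, 1.0)           # start at or above the true root
    while True:
        better = 0.5 * (guess + x / guess)
        if better >= guess:       # no further progress: converged
            return guess
        guess = better


def select_sqrt(cpu_has_sqrt_instruction):
    """Pick one implementation, based on a (hypothetical) CPU probe result."""
    return math.sqrt if cpu_has_sqrt_instruction else newton_sqrt


# Decide once at startup, then always call through the chosen function.
fast_sqrt = select_sqrt(cpu_has_sqrt_instruction=False)
print(fast_sqrt(2.0))
```

- The choice is made once rather than on every call, so the check costs
- nothing in the inner loop; callers never need to know which path they got.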
-
- A flag (like -r4000) that specifies the chip does *not* select R4000-specific
- instructions, but simply says to tune the scheduling of instructions
- for best code on an R4000. Quite often, there are minor differences in
- latencies, repeat rates, etc., especially in floating point, that
- the scheduler can exploit. Unless I've missed something, such
- code, for example, would work fine on an R6000 as well.
-
- On a related topic, code built with -mips3, whenever it becomes available,
- will run only on R4000s and later chips (R4000A, R5000, TFP, VRX, T5), as it
- uses the 64-bit model. Note that there are indeed some algorithms (as Dave Olson
- mentioned) whose performance is improved by 64-bit integers; in particular,
- anything dominated by logical operations on long bit vectors, multiprecision
- integer multiplies and divides, etc., will be improved.
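-
- To see why wider integers help these cases, here is a rough sketch that
- counts word-sized operations. It is an illustration only: the function
- names are made up for this example, and Python's big integers are used
- merely to check the result. A long bit vector needs half as many word
- operations with 64-bit words, and a schoolbook multiprecision multiply
- needs a quarter as many partial products.

```python
def words_needed(n_bits, word_bits):
    """Word-sized operations for one logical op over an n_bits bit vector."""
    return -(-n_bits // word_bits)        # ceiling division


def to_limbs(value, limb_bits):
    """Split a non-negative integer into limbs, least significant first."""
    mask = (1 << limb_bits) - 1
    limbs = []
    while value:
        limbs.append(value & mask)
        value >>= limb_bits
    return limbs or [0]


def schoolbook_multiply(a, b, limb_bits):
    """Multiply limb by limb; return (product, number of partial products)."""
    product = 0
    partial_products = 0
    for i, x in enumerate(to_limbs(a, limb_bits)):
        for j, y in enumerate(to_limbs(b, limb_bits)):
            product += (x * y) << (limb_bits * (i + j))
            partial_products += 1
    return product, partial_products


a = (1 << 1024) - 1                       # two 1024-bit operands
b = (1 << 1024) - 3
for limb_bits in (32, 64):
    product, count = schoolbook_multiply(a, b, limb_bits)
    assert product == a * b               # same answer either way
    print(limb_bits, words_needed(1024, limb_bits), count)
```

- For 1024-bit operands this gives 1024 partial products with 32-bit limbs
- and 256 with 64-bit limbs. The count alone overstates the gain, since each
- 64-bit multiply is itself slower than a 32-bit one.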
-
- (The September 1991 BYTE has an article I wrote on this whole topic, so I
- won't repeat it here.)
-
- Programs that are perfectly happy in 32-bit mode, when recompiled into
- 64-bit mode without change, will more often get slower than faster....
- because:
- 1) Some 32-bit multiplies/divides turn into 64-bit ones (slower).
- 2) There's more pressure on the data cache, as pointers, longs,
-    and register save/restores use more space. This won't bother some
-    programs at all; others will have higher data cache miss rates.
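-
- The cache-pressure point in 2) is simple arithmetic, sketched below. The
- numbers are assumptions for illustration: ints are taken to stay 4 bytes
- while pointers and longs grow from 4 to 8, the NODE record is a made-up
- linked-list node, and CACHE_BYTES is an arbitrary example size rather than
- a claim about any particular machine.

```python
# Field sizes in bytes under each model (assumed for this example).
SIZES = {
    "32-bit": {"pointer": 4, "long": 4, "int": 4},
    "64-bit": {"pointer": 8, "long": 8, "int": 4},
}


def record_size(fields, model):
    """Bytes for one record, with each field aligned to its own size."""
    sizes = [SIZES[model][name] for name in fields]
    offset = 0
    for size in sizes:
        offset = (offset + size - 1) // size * size   # pad up to alignment
        offset += size
    align = max(sizes)
    return (offset + align - 1) // align * align      # pad the tail as well


def records_per_cache(fields, model, cache_bytes):
    """How many whole records fit in a cache of the given size."""
    return cache_bytes // record_size(fields, model)


NODE = ["pointer", "pointer", "long"]   # hypothetical node: next, prev, key
CACHE_BYTES = 8 * 1024                  # example data cache size (assumption)

for model in ("32-bit", "64-bit"):
    print(model, record_size(NODE, model),
          records_per_cache(NODE, model, CACHE_BYTES))
```

- Under these assumptions the node doubles from 12 to 24 bytes, so the same
- cache holds 341 nodes instead of 682. A program whose working set is mostly
- floating-point arrays would see no change at all.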
-
- Of course, the compelling reason for this technology is that some programs
- are *not* happy with 32 bits. In particular:
- 1) Some ECAD chip simulations are pressing on our existing 2GB
-    limit, and will rapidly go >4GB virtual in the next few years.
-    (One phase of R4000 simulation takes 1.5GB ... for 1.3M transistors.
-    Want to guess what a 6M transistor chip might take? Want to guess
-    how much fun it is to try to split these things up? People talk about
-    the huge new chips that the industry is going to build ... but only
-    if they can be properly simulated first.)
- 2) DBMS
- 3) MCAD
- 4) G.I.S.
- 5) Scientific codes
- 6) Some financial modeling
-
- It is guaranteed that the same day people get -mips3, there will be
- FORTRAN programs whose array sizes get drastically expanded, yielding
- programs that need 8GB of virtual memory ... and trouble reports that
- these seem to page a bit on their Indigos, and why are they slow? :-)
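-
- The arithmetic behind that scenario, as a small sketch. The array shapes
- and the RAM figure are hypothetical, chosen only to show how quickly a
- "drastically expanded" array outruns both the 2GB limit and physical
- memory; GB here means 2**30 bytes.

```python
GB = 1 << 30
MB = 1 << 20


def array_bytes(dims, element_bytes):
    """Total storage for a dense array with the given dimensions."""
    total = element_bytes
    for extent in dims:
        total *= extent
    return total


def oversubscription(nbytes, ram_bytes):
    """How many times larger the data is than physical memory."""
    return nbytes / ram_bytes


LIMIT_32BIT = 2 * GB                      # the existing 2GB limit
small = array_bytes((512, 512, 512), 8)   # 8-byte reals, fits today
large = array_bytes((1024, 1024, 1024), 8)  # each dimension doubled

print(small // GB, small <= LIMIT_32BIT)
print(large // GB, large <= LIMIT_32BIT)
print(oversubscription(large, 64 * MB))   # 64MB of RAM is a made-up figure
```

- Doubling each of three dimensions multiplies the footprint by eight, which
- takes a 1GB array to 8GB. The larger address space makes the program
- legal, but it does nothing about the ratio of data to physical memory.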
- --
- -john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc>
- UUCP: mash@mips.com [soon to be mash@sgi.com, but not quite moved yet].
- DDD: 408-524-7015, or 524-8253
- USPS: (soon) Silicon Graphics, 2011 N. Shoreline Blvd, Mountain View, CA 94043
-