NetNews Usenet Archive 1992 #18

Internet Message Format | 1992-08-13 | 4.7 KB

Path: sparky!uunet!caen!sdd.hp.com!mips!mash
From: mash@mips.com (John Mashey)
Newsgroups: comp.sys.sgi
Subject: Re: R4000 compiler directive, is there one ???
Date: 13 Aug 1992 16:39:05 GMT
Organization: MIPS Computer Systems, Inc.
Lines: 80
Message-ID: <l8l419INNj3e@spim.mips.com>
References: <ob1oc18@zuni.esd.sgi.com> <1992Aug12.195518.12380@megatek.uucp> <ogn3in4@zuni.esd.sgi.com>
NNTP-Posting-Host: winchester.mips.com

In article <ogn3in4@zuni.esd.sgi.com> olson@anchor.esd.sgi.com (Dave Olson) writes:
>| I seem to remember seeing in some version or other of the compiler
>| (sorry, I don't know if it was beta, release, or what) that there
>| was a switch -r4000.  The switch had no effect on the output code,
>| but was used by the instruction scheduler in the assembler.
>| To check to see if you have it (other then just trying it) you can
>| do a strings /usr/bin/cc|grep 4000 and see what shows up.

>It is passed on to the assembler untouched, and the comments there
>lead me to believe your recollection about instruction scheduling
>is correct, but I don't know enough about compilers to know what
>exactly (if anything!) it is doing.

As I noted earlier, the compilers themselves typically know little or
nothing about the specific chip type they are using, and they generate
some pseudo-ops to the assembler, which then converts them to the best
code for the specific chip if requested.

Anyway, there are 2 orthogonal issues, neither of which matters much
to people producing code to run on every machine, as they should
avoid these features.  People seeking max performance on a specific
machine, or else doing kernel work or shared library work where the
code can be tuned, do have some interest in this.

-mips2 says that the assembler should use the -mips2 instructions, such
as load/store double floating, square root, branch-likely, etc.
These instructions are supported by R4000s *and* R6000s.
If any such instructions are used, the resulting code will not generally
run on R3000s.
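
The -mips2 compatibility rule can be written down as a small table
lookup.  This is only an illustrative sketch in present-day C, not
anything taken from the MIPS compilers or assembler: the enum names,
the table, and can_run() are all made up for the example.  The only
facts assumed are that the R3000 implements the original (MIPS I)
instruction set, the R6000 adds the -mips2 instructions, and the R4000
supports those as well as the 64-bit -mips3 set.

```c
#include <assert.h>
#include <stdbool.h>

/* Instruction-set levels, as selected by -mips1 / -mips2 / -mips3. */
enum isa_level { ISA_MIPS1 = 1, ISA_MIPS2 = 2, ISA_MIPS3 = 3 };

/* Hypothetical chip identifiers, for illustration only. */
enum cpu { CPU_R3000, CPU_R6000, CPU_R4000, CPU_COUNT };

/* Highest instruction-set level each chip implements. */
static const enum isa_level cpu_max_isa[CPU_COUNT] = {
    [CPU_R3000] = ISA_MIPS1,
    [CPU_R6000] = ISA_MIPS2,
    [CPU_R4000] = ISA_MIPS3,
};

/*
 * True if code assembled at `level` uses only instructions that
 * `chip` implements.  Scheduling choices play no part in this check.
 */
static bool can_run(enum isa_level level, enum cpu chip)
{
    assert((unsigned)chip < CPU_COUNT);
    return level <= cpu_max_isa[chip];
}
```

With this table, can_run(ISA_MIPS2, CPU_R4000) and
can_run(ISA_MIPS2, CPU_R6000) both hold, while
can_run(ISA_MIPS2, CPU_R3000) does not.
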
(I say, not generally, because occasionally, there are binaries that
see which CPU they are on, and do the right thing.  For example, a sqrt
function can dynamically decide to use the old code or else just
use the sqrt instruction.)

A flag (like -r4000) that specifies the chip does *not* select
-r4000-specific instructions, but simply says to tune the scheduling
of instructions for best code on an r4000.  Quite often, there are
minor differences in latencies, repeat rates, etc, especially in
floating point, where the scheduler can do better.  Unless I've missed
something, such code, for example, would work fine on an R6000 as well.

In a related topic, code built with -mips3, whenever it becomes
available, will only run on R4000s and later chips (R4000A, R5000, TFP,
VRX, T5) as it uses the 64-bit model.  Note that there are indeed some
algorithms (as Dave Olson mentioned) whose performance is improved by
64-bit integers; in particular, anything dominated by logical operations
on long bit vectors, multiprecision integer multiplies and divides, etc,
will be improved.  (September 1991 BYTE has an article I wrote on this
whole topic, rather than repeat it here).

Programs that are perfectly happy in 32-bit mode, when recompiled into
64-bit mode without change, will more often get slower than faster....
because:
   1) Some 32-bit multiplies/divides turn into 64-bit ones (slower).
   2) There's more pressure on the data cache, as pointers, longs, and
      register save/restores use more space.  This won't bother some
      programs at all; others will have higher data cache miss rates.

Of course, the compelling reason for this technology is that some
programs are *not* happy with 32-bits.  In particular:
   1) Some ECAD chip simulations are pressing on our existing 2GB limit,
      and will rapidly go >4GB virtual in the next few years.
      (One phase of R4000 simulation takes 1.5GB ... for 1.3M
      transistors.  Want to guess what a 6M transistor chip might take?
      Want to guess how much fun it is to try to split these things up?
      People talk about the huge new chips that the industry is going
      to build.... but only if they can be properly simulated first...)
   2) DBMS
   3) MCAD
   4) G.I.S.
   5) Scientific codes
   6) Some financial modeling

It is guaranteed that the same day people get -mips3, there will be
FORTRAN programs whose array sizes get drastically expanded, yielding
8GB of virtual memory-sized programs ... and trouble reports that
these seem to page a bit on their Indigos, and why are they slow? :-)
--
-john mashey    DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP:   mash@mips.com [soon to be mash@sgi.com, but not quite moved yet].
DDD:    408-524-7015, or 524-8253
USPS:   (soon) Silicon Graphics, 2011 N. Shoreline Blvd, Mountain View, CA 94043