NetNews Usenet Archive 1993 #3

Text File | 1993-01-21 | 2.9 KB | 62 lines

Newsgroups: comp.sys.super
Path: sparky!uunet!stanford.edu!eos!data.nas.nasa.gov!wilbur.nas.nasa.gov!fineberg
From: fineberg@wilbur.nas.nasa.gov (Samuel A. Fineberg)
Subject: Re: World's Most Powerful Computing Sites
References: <1993Jan20.232809.29241@nas.nasa.gov> <1993Jan21.165159.10149@meiko.com>
Sender: news@nas.nasa.gov (News Administrator)
Organization: CSC, NASA Ames Research Center, NAS Division
Date: Fri, 22 Jan 93 01:58:27 GMT
Message-ID: <1993Jan22.015827.26653@nas.nasa.gov>
Reply-To: fineberg@nas.nasa.gov
Lines: 49

In article <1993Jan21.165159.10149@meiko.com>, richard@meiko.com (Richard Cownie) writes:
|> fineberg@wilbur.nas.nasa.gov (Samuel A. Fineberg) writes:
|> : In article <1993Jan20.211032.11929@hubcap.clemson.edu>, richard@meiko.com (Richard Cownie) writes:
|> : |> This seems to imply a figure of over 4MFLOPS per T800. Last time
|> : |> I programmed one, it was a real struggle to achieve over 1MFLOPS
|> : |> even for inner loops of vector routines. Not that 4MFLOPS is
|> : |> necessarily a lie, you might manage it if you're adding two 32-bit
|> : |> zeros (and never storing the result). But it's certainly stretching
|> : |> the truth a long way: there are lies, damned lies, and statistics.
|> : |>
|> : |> --
|> : |> Richard Cownie (a.k.a. Tich), Meiko Scientific Corp
|> : |> email: richard@meiko.com
|> : |> phone: 617-890-7676
|> : |> fax: 617-890-5042
|> : |>
|> : Its certainly no less realistic than those for the i860.
|> :
|> : Sam
|>
|> I have to disagree with you there. I know of *some* applications where
|> the i860 can achieve a good fraction of claimed peak speed, e.g. on
|> a double-precision matrix multiply you can do over 35MFLOPS, against
|> a peak rate claimed as 40MFLOPS (or sometimes 60MFLOPS, because you can do
|> 2 adds for each multiply). In any case, it's well over 50% of peak.
|>
|> If a T800 transputer can achieve 50% of 4.4MFLOPS on a matrix multiply,
|> or indeed *anything* useful, I'd be interested to hear about it.
|>
|> Performance on big compiled Fortran programs is another kettle of fish,
|> and here I'd agree that peak performance figures are not much help.
|> --
|> Richard Cownie (a.k.a. Tich), Meiko Scientific Corp
|> email: richard@meiko.com
|> phone: 617-890-7676
|> fax: 617-890-5042

I don't know too many people that write assembly code, and that is what
you need to do to get 35 MFLOPS. As far as I'm concerned, assembly-coded
benchmarks are useless. And if you can't get more than 60% of peak on an
assembly-coded matrix multiply, that is bad (when Intel quotes peak speeds
on its systems it uses the 60 MFLOPS number, or 75 for the Paragon). The
best I have ever seen on an i860 was 10-15 MFLOPS, and that was because the
compiler had stuck some special vector subroutines into my code where it
recognized a daxpy operation. I don't know what the transputer is capable
of, but I would be surprised if it can't do 75-90% of peak for a useless
assembly-coded benchmark.

Sam
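
For reference, the percentages argued over in this thread can be worked out directly from the quoted figures. The sketch below is illustrative only: the function names (`daxpy`, `daxpy_flops`, `percent_of_peak`) are made up for this example, and the daxpy here is a plain Python loop showing what the operation computes, not the vector subroutine the i860 compiler substituted.

```python
def daxpy(a, x, y):
    """Return a*x + y elementwise (the BLAS daxpy operation) as a new list."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [a * xi + yi for xi, yi in zip(x, y)]


def daxpy_flops(n):
    """Floating-point operations in a length-n daxpy: one multiply and one add per element."""
    return 2 * n


def percent_of_peak(achieved_mflops, peak_mflops):
    """Achieved rate as a percentage of the claimed peak rate."""
    return 100.0 * achieved_mflops / peak_mflops


if __name__ == "__main__":
    # Achieved and peak i860 figures are the ones quoted in the thread.
    for achieved in (35, 15, 10):
        for peak in (40, 60):
            pct = percent_of_peak(achieved, peak)
            print(f"{achieved} MFLOPS against a {peak} MFLOPS peak: {pct:.1f}%")
```

Against the 40 MFLOPS peak, the 35 MFLOPS matrix multiply is 87.5% of peak; against the 60 MFLOPS figure it is about 58%, which is where the "60% of peak" complaint comes from. The compiled-code range of 10-15 MFLOPS is roughly 17-25% of the 60 MFLOPS peak.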