Newsgroups: comp.sys.super
Path: sparky!uunet!pgroup!lfm
From: lfm@pgroup.com (Larry Meadows)
Subject: i860 performance again (was:Re: World's Most Powerful...)
Message-ID: <C1C81H.A37@pgroup.com>
Date: Sun, 24 Jan 1993 02:53:41 GMT
Distribution: inet
References: <1993Jan21.165159.10149@meiko.com> <1993Jan22.015827.26653@nas.nasa.gov> <EIJKHOUT.93Jan23164314@cupid.cs.utk.edu>
Organization: The Portland Group, Portland, OR
Lines: 28

In article <EIJKHOUT.93Jan23164314@cupid.cs.utk.edu> eijkhout@cupid.cs.utk.edu (Victor Eijkhout) writes:
>
>Assembly coded benchmarks, maybe. But the BLAS routines (Basic Linear
>Algebra Subroutines) exist on many machines in assembler, and
>precisely because they are 1/ standardized across all platforms
>2/ optimized for each, you can write real programs in Fortran or C,
>with a reasonable speed if you need the BLAS often enough.

This is a good idea, but as I pointed out, you will not get more than
11 or 12 Mflops on daxpy if you stick to BLAS-1 (unless you have some
kind of special daxpy [maybe KAI?] that assumes that vector 1 stays
in cache and vector 2 comes from main memory, or vice versa).

>
>Just interface your program to assembler *kernels*.
>
>I know of people who write distributed memory linear system
>solvers, that get over 50% performance out of *500 i860's*
>using those assembler BLAS.

These people must be using the Delta, huh? I don't know anyone else
that has 500 i860s. Are they calling BLAS-2 routines? If so, I'd
definitely believe you, as I know people who get 90%+ performance
on 80 or so shared-memory i860-XRs using similar techniques.
--
Larry Meadows The Portland Group
lfm@pgroup.com