NetNews Usenet Archive 1992 #16

Internet Message Format | 1992-07-30 | 1.7 KB

Xref: sparky comp.sys.super:872 comp.lang.fortran:2894
Path: sparky!uunet!sun-barr!olivea!decwrl!pa.dec.com!e2big.mko.dec.com!quark.enet.dec.com!lionel
From: lionel@quark.enet.dec.com (Steve Lionel)
Newsgroups: comp.sys.super,comp.lang.fortran
Subject: Re: Inner product / AXPY performance
Message-ID: <1992Jul30.204909.24230@e2big.mko.dec.com>
Date: 30 Jul 92 23:36:50 GMT
References: <l7gi3fINNmsd@utkcs2.cs.utk.edu>
Sender: guest@e2big.mko.dec.com (Guest (DECnet))
Organization: Digital Equipment Corporation
Lines: 34

In article <l7gi3fINNmsd@utkcs2.cs.utk.edu>, eijkhout@cupid.cs.utk.edu (Victor Eijkhout) writes...
>
>I would like to get an idea of the difference in performance
>between inner products
>
>    do i=1,n
>      x = x + a(i)*b(i)
>
>and axpy operations
>
>    do i=1,n
>      x(i) = x(i) + a*b(i)
>
>which both have the same number of operations, but the inner product
>has an accumulation, which traditionally seems to be an
>unvectorizable idea.
>

I tried this with VAX FORTRAN-HPO; as long as one uses the
/ASSUME=NOACCURACY_SENSITIVE qualifier so that the dot product's reduction
transformation can be performed (the default is ACCURACY_SENSITIVE, which
disables transformations that could yield different results than scalar
execution), both forms vectorize very nicely. (The dot product form, of
course, has a final reduction step that the "axpy" form doesn't need.)
The actual vector mul-add sequences are essentially the same between the two.

Of course, one can also use the BLAS SDOT and SAXPY intrinsics, which
VAX FORTRAN-HPO will expand and vectorize (and parallelize, if you like.)

Steve Lionel                    lionel@quark.enet.dec.com
SDT Languages Group
Digital Equipment Corporation
110 Spit Brook Road
Nashua, NH 03062
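
[Editorial illustration, not part of the original post.] The reply above says the dot product only vectorizes once the compiler is allowed to reorder the accumulation, and that the reordered form needs a final reduction step. A minimal Python sketch of that idea follows. It is not what VAX FORTRAN-HPO generates; the function names and the `lanes` parameter (standing in for the number of independent partial sums a vector unit might keep) are hypothetical and chosen only for illustration.

```python
def dot_scalar(a, b):
    """Accumulate strictly left to right, as a plain scalar loop would."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def dot_strips(a, b, lanes=4):
    """Accumulate into `lanes` independent partial sums, then reduce them.

    This reorders the additions, which is why a compiler may refuse to do
    it when results must match scalar execution exactly.
    """
    partial = [0.0] * lanes
    for i, (x, y) in enumerate(zip(a, b)):
        partial[i % lanes] += x * y
    # Final reduction step; the axpy form has no counterpart to this.
    total = 0.0
    for p in partial:
        total += p
    return total


def axpy(alpha, x, y):
    """Return y + alpha*x elementwise; every element is independent."""
    return [yi + alpha * xi for xi, yi in zip(x, y)]


if __name__ == "__main__":
    # Values picked so that the order of additions changes the rounding.
    a = [1e16, 1.0, -1e16, 1.0]
    b = [1.0, 1.0, 1.0, 1.0]
    print(dot_scalar(a, b))           # sequential order
    print(dot_strips(a, b, lanes=2))  # two partial sums, then reduced
    print(axpy(2.0, [1.0, 2.0], [10.0, 20.0]))
```

With these inputs the two dot products disagree in double precision, because the sequential loop loses the small terms next to 1e16 while the two-lane version cancels the large terms first. The axpy loop has no such issue, since no element depends on another.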