- Newsgroups: comp.arch
- Path: sparky!uunet!charon.amdahl.com!pacbell.com!ames!saimiri.primate.wisc.edu!zaphod.mps.ohio-state.edu!caen!hellgate.utah.edu!lanl!cochiti.lanl.gov!jlg
- From: jlg@cochiti.lanl.gov (J. Giles)
- Subject: Re: Hardware Support for Numeric Algorithms
- Message-ID: <1992Nov6.181826.29015@newshost.lanl.gov>
- Sender: news@newshost.lanl.gov
- Organization: Los Alamos National Laboratory
- References: <BwIwEB.J1A@mentor.cc.purdue.edu> <BwJ1rB.pz@rice.edu> <1992Oct22.164414.12708@newshost.lanl.gov> <Bx78zu.395@rice.edu> <1992Nov4.183718.5242@newshost.lanl.gov> <2230@titccy.cc.titech.ac.jp>
- Date: Fri, 6 Nov 1992 18:18:26 GMT
- Lines: 57
-
- In article <2230@titccy.cc.titech.ac.jp>, mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
- |> In article <1992Nov4.183718.5242@newshost.lanl.gov> jlg@cochiti.lanl.gov (J. Giles) writes:
- |> [...]
- |> >The rationale document for ANSI C explicitly
- |> >recognizes the optimization penalty inherent in pointers and suggests
- |> >that remedies to this be a priority in future versions of the standard.
- |> >Yes, if you avoid procedure calls (at least, all those whose arguments
- |> >are pointers),
- |>
- |> Automatic optimization by compilers has little to do with the above
- |> mentioned *HAND* optimization.
- |>
- |> You can transform the following program
- |>
- |> f(a,b,c)
- |> double *a,*b,*c;
- |>
- |> for(i=0;i<4*N;i++)
- |> { a[i]+=b[i]*c[i];
- |> ...
- |>
- |> to
- |>
- |> for(i=0;i<4*N;i+=4)
- |> { b0=b[i]; b1=b[i+1]; b2=b[i+2]; b3=b[i+3];
- |> c0=c[i]; c1=c[i+1]; c2=c[i+2]; c3=c[i+3];
- |> a0=a[i]; a1=a[i+1]; a2=a[i+2]; a3=a[i+3];
- |> a[i]=a0+b0*c0; ...
- |>
- |> by hand, knowing that the area for a, b and c does not overlap.
-
- And the code still does not parallelize. Why? An ANSI C implementation
- must *infer* which things can be parallel and which can't because the
- C language doesn't have explicit vector constructs. There is *NO*
- *HAND* optimization of the above code which will remedy this - without
- using some non-standard extension for explicitly declaring whether
- the variables are aliased or not.
-
- Yes, if you have a primitive, non-pipelined, non-vector, non-parallel,
- purely scalar machine, then perhaps careful hand coding can produce
- as good a result with C as with assembly. If you have such a machine,
- you obviously don't think speed is worth paying for in hardware - why
- do you think spending *more* money for programmer time to hand optimize
- is a better investment?
-
- |> [...]
- |> >It's easier to use assembly directly.
- |>
- |> Thus, it is easier to use C.
-
- Most assembly languages have more legible *syntax* than C, their
- semantics are usually cleaner and better defined, and they
- provide direct control, rather than indirect reliance on the
- inference capabilities of the compiler.
-
- --
- J. Giles
-