Internet Message Format  |  1992-11-08  |  1.5 KB

  1. Path: sparky!uunet!idacrd!desj@ccr-p.ida.org
  2. From: desj@ccr-p.ida.org (David desJardins)
  3. Newsgroups: comp.arch
  4. Subject: Re: Hardware Support for Numeric Algorithms
  5. Message-ID: <1739@idacrd.UUCP>
  6. Date: 7 Nov 92 06:59:09 GMT
  7. References: <1992Nov4.183718.5242@newshost.lanl.gov> <2230@titccy.cc.titech.ac.jp> <1992Nov6.181826.29015@newshost.lanl.gov>
  8. Sender: news@idacrd.UUCP
  9. Followup-To: comp.programming
  10. Organization: IDA Center for Communications Research, Princeton
  11. Lines: 23
  12.  
  13. J. Giles <jlg@cochiti.lanl.gov> writes:
  14. >>>     for(i=0;i<N;i+=4)
  15. >>>     {    b0=b[i]; b1=b[i+1]; b2=b[i+2]; b3=b[i+3];
  16. >>>         c0=c[i]; c1=c[i+1]; c2=c[i+2]; c3=c[i+3];
  17. >>>         a0=a[i]; a1=a[i+1]; a2=a[i+2]; a3=a[i+3];
  18. >>>         a[i]=a0+b0*c0; ...
  19.  
  20. > And the code still does not parallelize.  Why?  An ANSI C
  21. > implementation must *infer* which things can be parallel and which
  22. > can't because the C language doesn't have explicit vector constructs.
  23. > There is *NO* *HAND* optimization of the above code which will remedy
  24. > this - without using some non-standard extension for explicitly
  25. > declaring whether the variables are aliased or not.
  26.  
  27. You should read before you reply.  The point of the above transformation
  28. is that there is no aliasing problem, and the compiler can know without
  29. any extensions that parallel execution is valid.  Executing the four
  30. multiply-adds in parallel is valid regardless of any overlap between a,
  31. b, and c.
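
The load ordering that makes this work can be sketched in C as follows. This is a minimal sketch, not the code from the quoted post: the function name madd_unrolled4, the double element type, and the requirement that n be a multiple of 4 are all assumptions made here, and the three statements elided by "..." in the quote are filled in by analogy with the first. Every load in an iteration happens before any store, so the four multiply-adds read only local copies and cannot depend on one another, whatever the overlap between a, b, and c.

```c
#include <stddef.h>

/* Hypothetical helper (name and signature are illustrative only).
 * Assumes n is a multiple of 4; a, b, and c may overlap freely. */
void madd_unrolled4(double *a, const double *b, const double *c, size_t n)
{
    for (size_t i = 0; i < n; i += 4) {
        /* Phase 1: all loads for this group of four happen first. */
        double b0 = b[i], b1 = b[i + 1], b2 = b[i + 2], b3 = b[i + 3];
        double c0 = c[i], c1 = c[i + 1], c2 = c[i + 2], c3 = c[i + 3];
        double a0 = a[i], a1 = a[i + 1], a2 = a[i + 2], a3 = a[i + 3];

        /* Phase 2: four multiply-adds on local copies only. None of them
         * reads memory, so no store below can feed into any of them. */
        double r0 = a0 + b0 * c0;
        double r1 = a1 + b1 * c1;
        double r2 = a2 + b2 * c2;
        double r3 = a3 + b3 * c3;

        /* Phase 3: stores happen only after all four results exist. */
        a[i]     = r0;
        a[i + 1] = r1;
        a[i + 2] = r2;
        a[i + 3] = r3;
    }
}
```

A call with overlapping arrays, such as madd_unrolled4(buf + 1, buf, c, 4), is well defined: each group of four results is computed from the values loaded at the top of that iteration. When the arrays overlap, this need not match what a one-element-at-a-time loop would produce; the claim is only about the unrolled code as written.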
  32.  
  33. Followups to someplace that isn't comp.arch.
  34.  
  35.                                         David desJardins
  36.