- Xref: sparky comp.sys.dec:6788 comp.dsp:2963
- Newsgroups: comp.sys.dec,comp.dsp
- Path: sparky!uunet!haven.umd.edu!decuac!pa.dec.com!engage.pko.dec.com!nntpd.lkg.dec.com!news!news.crl.dec.com!payne
- From: payne@crl.dec.com (Andrew Payne)
- Subject: Re: Alpha fft performance
- Message-ID: <1993Jan10.160508.18798@crl.dec.com>
- Sender: news@crl.dec.com (USENET News System)
- Organization: DEC Cambridge Research Lab
- References: <1992Dec31.164221.27734@aplcen.apl.jhu.edu> <1993Jan4.154245.13258@crl.dec.com> <1993Jan7.121038.4845@odin.corp.sgi.com>
- Date: Sun, 10 Jan 1993 16:05:08 GMT
- Lines: 29
-
- In article <1993Jan7.121038.4845@odin.corp.sgi.com> jpp@sgi.com writes:
- > An array of 1024 complex numbers indeed uses 8 Kbytes. However, to compute an
- > FFT you need an extra array of sines and cosines of the same size (8 Kbytes).
- > The total space required for this FFT is then at least 8+8 = 16 Kbytes.
- > So the assumption "just fits in the on-chip 8K D cache" seems abusive. ???
-
- Not at all. As has already been pointed out, you can take advantage of
- symmetry to cut the table size down to 2K bytes (512 coefficients). Also,
- you don't use all the coefficients at each stage (except the last), so the
- _effective_ table size is even smaller.
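
A minimal sketch of the symmetry argument, not the code that was timed. It assumes 4-byte (single-precision) coefficients, a half-wave sine table, and a radix-2 decimation-in-time stage layout; none of these are stated in the post, and the names `make_sine_table`, `twiddle` and `twiddles_used` are hypothetical.

```python
import math

N = 1024  # transform length discussed in the thread

def make_sine_table(n):
    """Half-wave sine table: sin(2*pi*k/n) for k = 0 .. n/2 - 1."""
    return [math.sin(2.0 * math.pi * k / n) for k in range(n // 2)]

def twiddle(table, n, k):
    """Return (cos, sin) of 2*pi*k/n for 0 <= k < n/2 from the sine table alone.

    cos(x) = sin(x + pi/2); when the shifted index leaves the stored half
    wave, sin(x + pi) = -sin(x) folds it back in.
    """
    j = k + n // 4                      # quarter-period shift: sine -> cosine
    c = table[j] if j < n // 2 else -table[j - n // 2]
    return c, table[k]

def twiddles_used(n, stage):
    """Table indices touched by radix-2 stage `stage` (1 .. log2(n))."""
    half = 1 << (stage - 1)             # distinct twiddles in this stage
    stride = n // (2 * half)            # step through the full-size table
    return [j * stride for j in range(half)]

table = make_sine_table(N)
table_bytes = len(table) * 4            # assumption: 4 bytes per coefficient
per_stage = [len(twiddles_used(N, s)) for s in range(1, 11)]
```

Under these assumptions the table holds 512 real values (2048 bytes), and the ten stages touch 1, 2, 4, ..., 512 distinct entries, so only the last stage walks the whole table.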
-
- > I am not familiar with "Alpha's process cycle counter". Is this a simulator?
- > Does this tool take into account possible cache misses?
- > How do real benchmarks compare with your simulation?
-
- No, it isn't a simulator. The cycle counter is just a high-resolution timer
- that counts CPU cycles, and is available on all Alpha implementations.
- Our FFT times were measured on a real system with the cycle counter. The
- times given are "wall clock times", which take into account all aspects of the
- execution, including time spent in the memory system.
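
A small sketch of how two cycle-counter readings might be turned into an elapsed time. It assumes the count is a 32-bit value that wraps around, and the clock rate is a parameter the caller supplies; the function names and the sample numbers are hypothetical and are not measurements from this thread.

```python
COUNTER_BITS = 32  # assumption: the cycle count is a wrapping 32-bit value

def elapsed_cycles(start, end, bits=COUNTER_BITS):
    """Cycles between two counter reads, tolerating a single wrap-around."""
    return (end - start) % (1 << bits)

def cycles_to_microseconds(cycles, clock_hz):
    """Convert a cycle count to microseconds for a given clock rate in Hz."""
    return cycles * 1e6 / clock_hz

# Illustrative values only: a read just before the wrap and one just after.
delta = elapsed_cycles(0xFFFFFF00, 0x00000100)

# Illustrative values only: 150,000 cycles at a hypothetical 150 MHz clock.
micros = cycles_to_microseconds(150_000, 150e6)
```

Because the counter runs while the code waits on memory, a delta taken this way includes cache-miss stalls, which is what makes it a wall-clock figure and not an instruction count.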
-
- > Are the results of your transform ordered, or are they "bit reversed" ?
-
- The times given are for ordered results. In our code, the digit-reversal
- stage takes about 10,000 cycles (about 10% of the total time).
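
For readers unfamiliar with the reordering step, here is a minimal sketch of a digit-reversal permutation. The post does not say which radix the timed code uses, so the radix is left as a parameter; radix 2 gives the usual bit reversal. The names `digit_reverse` and `reorder` are hypothetical, and this is an illustration, not the implementation that was measured.

```python
def digit_reverse(index, radix, ndigits):
    """Reverse the base-`radix` digits of `index`, read as `ndigits` digits."""
    result = 0
    for _ in range(ndigits):
        index, digit = divmod(index, radix)   # peel off the lowest digit
        result = result * radix + digit       # push it onto the reversed value
    return result

def reorder(data, radix, ndigits):
    """Return a copy of `data` with element i taken from its digit-reversed slot."""
    n = radix ** ndigits
    if len(data) != n:
        raise ValueError("data length must equal radix ** ndigits")
    return [data[digit_reverse(i, radix, ndigits)] for i in range(n)]

# 1024 points as ten base-2 digits, or as five base-4 digits.
bit_reversed_one = digit_reverse(1, 2, 10)
digit_reversed_one = digit_reverse(1, 4, 5)
```

The permutation is its own inverse, so applying `reorder` twice returns the original order. Taking the figures in the post at face value, 10,000 cycles at about 10% puts the whole transform on the order of 100,000 cycles.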
-
- --
- Andrew C. Payne
- DEC Cambridge Research Lab
-