home *** CD-ROM | disk | FTP | other *** search
- Document: FFT.ZIP
- File Group: General Resource
- Creation Date: 07/27/95
- Revision Date: 07/31/95
-
- Title: FFT Processor Benchmark, Scott Taylor
- Keywords: BENCHMARK PERFORMANCE FFT PROCESSOR CPU TAYLOR
-
- Abstract: FFT is a common benchmark used for digital signal
- processors. It allows comparisons of relative performance
- between similar and entirely unlike platforms (CPU types).
- The FFT routine uses many instructions including indirect
- addressing and floating point plus test branch at the end of
- the loops.
-
- Submitted by: Scott D. Taylor, DSP Systems Inc, 75022,3707
-
-
- Revisions: 07/31/95 Added "Unable to Allocate" error definition
-
- =========================================================================
-
- Performing an FFT is a common benchmark used for digital signal processors.
- It allows comparisons of relative performance between similar and entirely
- unlike platforms (CPU types). The FFT routine uses many instructions
- including indirect addressing and floating point plus test brach at the end
- of the loops. The number of floating point calculations performed can be
- determined by the following equation:
-
- T * n/2 * 10
-
- n is the number of points in the FFT, which must be a power of 2. T is
- the number to which 2 is raised to equal n, ie. 2**T = n. The *10 is
- because the floating point calculation is a complex number which takes 6
- of the 10 and then the two adds and subtracts which adds four more.
-
- So, for a 1024 point FFT, there are 10 * 1024/2 * 10 = 51200 floating
- point instructions executed. There is also one indirect addressing
- operation involved in the fetch of each floating point number in an array,
- of which 4 are involved each time through the loop plus hte loop overhead
- and the local variables. The over all result is that there are more
- integer instructions executed, but there is a high degree of floating
- point, probably 30-40% depending upon the compiler and CPU instruction
- set.
-
- Also, in order to get the FFT to fit in the 640k DOS allows, some changes
- have been made to the standard algorithm. A SIN and COS are done in the rapid
- execution loop. If an error message of the nature "Unable to allocate ..." or
- something similar occurs, taht is an indication that there is not enough free
- memory to run the entire suite of tests. The results from teh portion that
- does finish are accurate however. Use MEM at the DOS prompt to determine how
- much free memory there is. The program requires around 612kB. Using DOS 6.2
- and the supplied Config.Sys file there should be sufficient memory to run the
- program. If special drivers for disk drives of video adapters are required,
- running from a standalone bootable floppy may help.
-
- The FFT program has been optimized for size to allow the largest possible
- array size to be used. As a result the table is generated by using a batch
- file to pass the parameters to the program. A sample batch file has been
- included as well as a Config.sys file which provides enough memory for the
- program to run. Most systems can run with up to a 32k point FFT without
- modifying the Config.Sys. The program will run under Windows NT in a DOS
- window. It will probably run under OS/2 but has not been tested.
-
- Sample batch file:
-
- fft 5 2048 h >result.txt
- fft 6 1024 >>result.txt
- fft 7 512 >>result.txt
- fft 8 256 >>result.txt
- fft 9 128 >>result.txt
- fft 10 64 >>result.txt
- fft 11 32 >>result.txt
- fft 12 16 >>result.txt
- fft 13 8 >>result.txt
- fft 14 4 >>result.txt
- fft 15 2 >>result.txt
- fft 16 1 >>result.txt
-
-
- Config.Sys for DOS 6.2. A similar file must be used with other versions.
-
- DEVICE=C:\DOS\HIMEM.SYS
- DEVICE=C:\DOS\EMM386.EXE AUTO
- dos=UMB
- dos=HIGH
-
-
- For those interested here is the loop doing 90% of the operations:
-
- d = 1;
- index = 1;
- while (index <= t)
- {
- d2 = d + d;
- for (j=0; j<d; j++)
- {
- REAL32 temp_real, temp_imag;
- INT k, jk, jkd, jnd2;
- k = 0;
- jnd2 = (j * n) / d2;
-
- while (k < n)
- {
- jk = j + k;
- jkd = jk + d;
-
- [a SIN and COS call go here (I do not have actual code)]
- exp_real = SIN(...)
- exp_imag = COS(...)
-
- temp_real = (kreal[jkd] * exp_real) -
- (kimag[jkd] * exp_imag);
- temp_imag = (kimag[jkd] * exp_real) +
- (kreal[jkd] * exp_imag);
-
- kreal[jkd] = kreal[jk] - temp_real;
- kimag[jkd] = kimag[jk] - temp_imag;
-
- kreal[jk] = kreal[jk] + temp_real;
- kimag[jk] = kimag[jk] + temp_imag;
-
- k += d2;
- }
- }
-
- index++;
- d = d2;
-
- Comments and questions can be forwarded to...
-
- Scott D. Taylor
- DSP Systems, Inc.
- CIS 75022,3707
-
-
-