PC-Test Pro

home *** CD-ROM | disk | FTP | other *** search

/ PC-Test Pro / PCTESTPRO.iso / benchmrk / fft / entp / readme.txt < prev

Wrap

Text File | 1995-07-31 | 4.8 KB | 137 lines

Document: FFT.ZIP File Group: General Resource Creation Date: 07/27/95 Revision Date: 07/31/95 Title: FFT Processor Benchmark, Scott Taylor Keywords: BENCHMARK PERFORMANCE FFT PROCESSOR CPU TAYLOR Abstract: FFT is a common benchmark used for digital signal processors. It allows comparisons of relative performance between similar and entirely unlike platforms (CPU types). The FFT routine uses many instructions including indirect addressing and floating point plus test branch at the end of the loops. Submitted by: Scott D. Taylor, DSP Systems Inc, 75022,3707 Revisions: 07/31/95 Added "Unable to Allocate" error definition ========================================================================= Performing an FFT is a common benchmark used for digital signal processors. It allows comparisons of relative performance between similar and entirely unlike platforms (CPU types). The FFT routine uses many instructions including indirect addressing and floating point plus test brach at the end of the loops. The number of floating point calculations performed can be determined by the following equation: T * n/2 * 10 n is the number of points in the FFT, which must be a power of 2. T is the number to which 2 is raised to equal n, ie. 2**T = n. The *10 is because the floating point calculation is a complex number which takes 6 of the 10 and then the two adds and subtracts which adds four more. So, for a 1024 point FFT, there are 10 * 1024/2 * 10 = 51200 floating point instructions executed. There is also one indirect addressing operation involved in the fetch of each floating point number in an array, of which 4 are involved each time through the loop plus hte loop overhead and the local variables. The over all result is that there are more integer instructions executed, but there is a high degree of floating point, probably 30-40% depending upon the compiler and CPU instruction set. Also, in order to get the FFT to fit in the 640k DOS allows, some changes have been made to the standard algorithm. A SIN and COS are done in the rapid execution loop. If an error message of the nature "Unable to allocate ..." or something similar occurs, taht is an indication that there is not enough free memory to run the entire suite of tests. The results from teh portion that does finish are accurate however. Use MEM at the DOS prompt to determine how much free memory there is. The program requires around 612kB. Using DOS 6.2 and the supplied Config.Sys file there should be sufficient memory to run the program. If special drivers for disk drives of video adapters are required, running from a standalone bootable floppy may help. The FFT program has been optimized for size to allow the largest possible array size to be used. As a result the table is generated by using a batch file to pass the parameters to the program. A sample batch file has been included as well as a Config.sys file which provides enough memory for the program to run. Most systems can run with up to a 32k point FFT without modifying the Config.Sys. The program will run under Windows NT in a DOS window. It will probably run under OS/2 but has not been tested. Sample batch file: fft 5 2048 h >result.txt fft 6 1024 >>result.txt fft 7 512 >>result.txt fft 8 256 >>result.txt fft 9 128 >>result.txt fft 10 64 >>result.txt fft 11 32 >>result.txt fft 12 16 >>result.txt fft 13 8 >>result.txt fft 14 4 >>result.txt fft 15 2 >>result.txt fft 16 1 >>result.txt Config.Sys for DOS 6.2. A similar file must be used with other versions. DEVICE=C:\DOS\HIMEM.SYS DEVICE=C:\DOS\EMM386.EXE AUTO dos=UMB dos=HIGH For those interested here is the loop doing 90% of the operations: d = 1; index = 1; while (index <= t) { d2 = d + d; for (j=0; j<d; j++) { REAL32 temp_real, temp_imag; INT k, jk, jkd, jnd2; k = 0; jnd2 = (j * n) / d2; while (k < n) { jk = j + k; jkd = jk + d; [a SIN and COS call go here (I do not have actual code)] exp_real = SIN(...) exp_imag = COS(...) temp_real = (kreal[jkd] * exp_real) - (kimag[jkd] * exp_imag); temp_imag = (kimag[jkd] * exp_real) + (kreal[jkd] * exp_imag); kreal[jkd] = kreal[jk] - temp_real; kimag[jkd] = kimag[jk] - temp_imag; kreal[jk] = kreal[jk] + temp_real; kimag[jk] = kimag[jk] + temp_imag; k += d2; } } index++; d = d2; Comments and questions can be forwarded to... Scott D. Taylor DSP Systems, Inc. CIS 75022,3707