Abstract
Modern Single Instruction Multiple Data (SIMD) microprocessor architectures allow parallel floating point operations over four contiguous elements in memory. The radix-2 FFT algorithm is well suited for modern SIMD architectures after the second stage (decimation-in-time case). In this paper, a general radix-2 FFT algorithm is developed for the modern SIMD architectures. This algorithm (SIMD-FFT) is implemented on the Intel Pentium and Motorola PowerPC architecture for 1D and 2D. The results are compared against Intel's implementation of the split-radix FFT for the SIMD architecture [2] and the FFTW [3]. Overall, the SIMD-FFT was found to be faster than the other two implementations for complex 1D input data (ranging from 95.9% up to 372%), and for complex 2D input data (ranging from 68.8% up to 343%) as well.
Original language | English |
---|---|
Pages (from-to) | III/3220-III/3223 |
Journal | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
Volume | 3 |
State | Published - 2002 |
Externally published | Yes |
Event | 2002 IEEE International Conference on Acoustic, Speech, and Signal Processing - Orlando, FL, United States Duration: 13 May 2002 → 17 May 2002 |