IIR = Infinite Impulse Response
LMS = Least Mean Squares Adaptive FIR
FFT = Fast Fourier Transform
DFT = Discrete Fourier Transform
DCT = Discrete Cosine Transform
FFTInitAlloc_* allocates and initializes an FFT “specification structure” to contain the tables needed for executing an optimized Fourier transform and sets the variable pFFTSpec to point to this “spec” structure.
- The memory containing these tables is dynamically allocated and must be freed by ippsFFTFree_R_32f.
- There are numerous flavors of the initialization function. This version deals with FFTs whose input array is real (hence the “_R”) and 32-bit floating point (“_32f”). For each flavor there is a corresponding function to release the memory.
- The same “spec” structure is used for both forward and inverse transforms.
- The variable “order” is the base-2 logarithm of the signal length. FFT in IPP is defined only for lengths that are powers of two, so this order is always an integer. The DFT function group (shown in a later slide) handles lengths that are not powers of two.
- Since the order of the data and the operation is embedded in the pFFTSpec structure, it will not later be passed into the function ippsFFTFwd or ippsFFTInv.
- The last argument advises the library whether to use the fastest, most accurate, or best overall version. The effect of this flag is platform-dependent.
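The relationship between signal length and “order” can be sketched in plain C. The helper name fft_order_from_length is hypothetical and is not part of IPP; it only illustrates that a power-of-two length maps to an integer order, while any other length has no valid order and belongs to the DFT function group.

```c
#include <stddef.h>

/* Hypothetical helper (not an IPP function): returns log2(len) when len is
   a power of two, or -1 when it is not, in which case the DFT functions
   would be the ones to use. */
int fft_order_from_length(size_t len)
{
    int order = 0;

    if (len == 0 || (len & (len - 1)) != 0) {
        return -1;              /* zero or not a power of two */
    }
    while (len > 1) {           /* count how many halvings reach 1 */
        len >>= 1;
        ++order;
    }
    return order;
}
```

For a 1024-sample signal this returns 10, which is the value that would be passed as “order”.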
A forward FFT is performed by ippsFFTFwd_RToCCS_32f().
- For simplicity, 0 is passed in as the temporary buffer pointer (strictly, a NULL pointer should be used), which tells the function to allocate the buffer itself. Optimized code should allocate a single buffer of the size reported by ippsFFTGetBufSize() and keep that buffer and the FFTSpec structure across multiple calls to the FFT function, avoiding repeated memory allocations, which are very expensive. In this example the buffer is allocated and freed by FFTFwd and the “spec” structure is allocated and freed by myFFT_RT0C().
Also, DFT functions will use the FFT implementation if the signal length is a power of two. The only advantage to using the FFT implementation directly is the removal of a single branch to accommodate that one optimization.
If code size is important, the FFT functions are smaller.
The ippg domain (“gen” domain) overlaps many of these functions with “machine built unrolled” functions. These implementations are generally the fastest at the expense of substantially larger code size.
- For simplicity, 0 is passed in as the temporary buffer pointer (strictly, a NULL pointer should be used), which tells ippsDFTFwd*() to allocate the buffer itself. Optimized code should allocate a single buffer of the size reported by ippsDFTGetBufSize() and keep that buffer and the DFTSpec structure across multiple calls to the DFT function, avoiding repeated memory allocations, which are very expensive. In this example the buffer is allocated and freed by DFTFwd and the “spec” structure is allocated and freed by myDFT_RT0C().
- “taps” defines the tap coefficients to be used in the FIR filter.
- “delayLine” represents the inputs, or historical samples.
- ippsFIRInitAlloc() initializes the filter.
- ippsFIR() performs the actual filtering on the input data.
ippsFIR() performs “len” iterations of filtering of the source and places the results in the destination. During each iteration, one value is taken from pSrc[n] and placed in the delay line. Then the dot product of the filter and the delay line is taken and the result written to pDst[n]. pFIRState holds the last “tapslen” samples in its internal delay line (in this case three samples).
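The per-sample behavior described above can be sketched as a plain C reference filter. This is not the IPP implementation. The name fir_reference, the caller-owned delay_line array, and the newest-sample-first ordering of the delay line are assumptions made for illustration, and the ordering IPP uses internally may differ.

```c
#include <stddef.h>

/* Reference FIR filter. delay_line holds the most recent taps_len samples,
   newest first, and persists between calls much as the filter state does.
   Illustrative plain C only. */
void fir_reference(const float *src, float *dst, size_t len,
                   const float *taps, size_t taps_len, float *delay_line)
{
    if (taps_len == 0) {
        return;                 /* nothing to filter with */
    }

    for (size_t n = 0; n < len; ++n) {
        /* Shift the history by one and place the new sample at the front. */
        for (size_t k = taps_len - 1; k > 0; --k) {
            delay_line[k] = delay_line[k - 1];
        }
        delay_line[0] = src[n];

        /* Dot product of the taps with the delay line. */
        float acc = 0.0f;
        for (size_t k = 0; k < taps_len; ++k) {
            acc += taps[k] * delay_line[k];
        }
        dst[n] = acc;
    }
}
```

With three taps {1, 2, 3}, a zeroed delay line, and the impulse input {1, 0, 0, 0}, the output is {1, 2, 3, 0}: the filter reproduces its own taps, which is a quick sanity check for any FIR implementation.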
Product and Performance Information
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.