How to use Intel® IPP’s 1D Fourier Transform Functions

Intel® IPP provides several functions to compute the forward and reverse fast Fourier transform algorithm for real or complex data.  While there are 1D and 2D versions in the library, this article will focus on 1D.

There are two sets of functions: DFT and FFT.  For a general single interface, use DFT.  The DFT functions implement the FFT algorithm for arbitrary array sizes, including powers of 2.  The Intel® IPP FFT functions are limited to powers of 2.  Internally, the DFT functions check for a power of 2 input size and call the FFT functions for that special case.  If you only use power of 2 sizes you can skip these checks and call the FFT functions directly.

In previous versions of Intel® IPP buffers for internal state, twiddle factors, etc. were allocated internally. As a general Intel® IPP strategy, allowing your application to handle these buffers directly gives you more options and control.  Internal allocations are now deprecated.

Here is a basic description of the  external buffer  initialization sequence for Intel® IPP FFT.  This pattern is a familiar one, used for many Intel® IPP functions.

1) Query the implementation to get the sizes of required buffers and allocate them.  For FFT this is

  • A "spec" to hold configuration details, coefficients/twiddle factors, etc. 
  • A working buffer for initialization (used in init only -- can be freed immediately after)
  • A working buffer for  the FFT operation

2) Do the  init  step. This is separate from the main function so that anything that can be precalculated for a specific signal size, normalization, and accuracy hint can be done outside of a loop.  This precalculated information is referenced later through the spec structure.

3) do the FFT.  The necessary inputs are

  • Input and output buffer.  In most cases these can be the same.  That is, the destination buffer can be the source buffer for in-place operation.
  • The spec (holds anything where there is a speed advantage to precalculate).
  • The working buffer (holds any temporary values).

Example code implementing all of these steps with external allocation is below.  There are many similarities between FFT and DFT, with just a few differences to note.  One is that DFT uses a signed integer length while FFT uses a power of 2 order.   The other is that  ippsFFTInit returns an aligned pointer to the spec.  When buffers are allocated with Intel® IPP malloc, they are already aligned so this will be the same address as the original spec buffer.   

DFT: (Use this for a general implementation, power of 2 and any other size.)

    //Set the size
    const int N=128;

    // Spec and working buffers
    IppsDFTSpec_C_32fc *pDFTSpec=0;
    Ipp8u  *pDFTInitBuf, *pDFTWorkBuf;

    // Allocate complex buffers
    Ipp32fc *pSrc=ippsMalloc_32fc(N);
    Ipp32fc *pDst=ippsMalloc_32fc(N); 

    // Query to get buffer sizes
    int sizeDFTSpec,sizeDFTInitBuf,sizeDFTWorkBuf;
    ippsDFTGetSize_C_32fc(N, IPP_FFT_NODIV_BY_ANY, 
        ippAlgHintAccurate, &sizeDFTSpec, &sizeDFTInitBuf, &sizeDFTWorkBuf);

    // Alloc DFT buffers
    pDFTSpec    = (IppsDFTSpec_C_32fc*)ippsMalloc_8u(sizeDFTSpec);
    pDFTInitBuf = ippsMalloc_8u(sizeDFTInitBuf);
    pDFTWorkBuf = ippsMalloc_8u(sizeDFTWorkBuf);

    // Initialize DFT
    ippsDFTInit_C_32fc(N, IPP_FFT_NODIV_BY_ANY, 
        ippAlgHintAccurate, pDFTSpec, pDFTInitBuf);
    if (pDFTInitBuf) ippFree(pDFTInitBuf);

    // Do the DFT
    ippsDFTFwd_CToC_32fc(pSrc,pDst,pDFTSpec,pDFTWorkBuf);


    //check results
    ippsDFTInv_CToC_32fc(pDst,pDst,pDFTSpec,pDFTWorkBuf);
    int OK=1;
    for (int i=0;i<N;i++) {
        pDst[i].re/=(Ipp32f)N;
        pDst[i].im/=(Ipp32f)N;
        if ((abs(pSrc[i].re-pDst[i].re)>.001) || 
            (abs(pSrc[i].im-pDst[i].im)>.001) ) 
        {
            OK=0;break;
        }
    }
    puts(OK==1?"DFT OK":"DFT Fail");

    if (pDFTWorkBuf) ippFree(pDFTWorkBuf);
    if (pDFTSpec) ippFree(pDFTSpec);

    ippFree(pSrc);
    ippFree(pDst);
...
 

FFT: (Special power of 2 only case, skips a few checks so may be minimally faster than calling DFT.)

    //Set the size
    const int N=128;
    const int order=(int)(log((double)N)/log(2.0));

    // Spec and working buffers
    IppsFFTSpec_C_32fc *pFFTSpec=0;
    Ipp8u *pFFTSpecBuf, *pFFTInitBuf, *pFFTWorkBuf;

    // Allocate complex buffers
    Ipp32fc *pSrc=ippsMalloc_32fc(N);
    Ipp32fc *pDst=ippsMalloc_32fc(N); 

    // Query to get buffer sizes
    int sizeFFTSpec,sizeFFTInitBuf,sizeFFTWorkBuf;
    ippsFFTGetSize_C_32fc(order, IPP_FFT_NODIV_BY_ANY, 
        ippAlgHintAccurate, &sizeFFTSpec, &sizeFFTInitBuf, &sizeFFTWorkBuf);

    // Alloc FFT buffers
    pFFTSpecBuf = ippsMalloc_8u(sizeFFTSpec);
    pFFTInitBuf = ippsMalloc_8u(sizeFFTInitBuf);
    pFFTWorkBuf = ippsMalloc_8u(sizeFFTWorkBuf);

    // Initialize FFT
    ippsFFTInit_C_32fc(&pFFTSpec, order, IPP_FFT_NODIV_BY_ANY, 
        ippAlgHintAccurate, pFFTSpecBuf, pFFTInitBuf);
    if (pFFTInitBuf) ippFree(pFFTInitBuf);

    // Do the FFT
    ippsFFTFwd_CToC_32fc(pSrc,pDst,pFFTSpec,pFFTWorkBuf);


    //check results
    ippsFFTInv_CToC_32fc(pDst,pDst,pFFTSpec,pFFTWorkBuf);
    int OK=1;
    for (int i=0;i<N;i++) {
        pDst[i].re/=(Ipp32f)N;
        pDst[i].im/=(Ipp32f)N;
        if ((abs(pSrc[i].re-pDst[i].re)>.001) || 
            (abs(pSrc[i].im-pDst[i].im)>.001) ) 
        {
            OK=0;break;
        }
    }
    puts(OK==1?"FFT OK":"FFT Fail");

    if (pFFTWorkBuf) ippFree(pFFTWorkBuf);
    if (pFFTSpecBuf) ippFree(pFFTSpecBuf);

    ippFree(pSrc);
    ippFree(pDst);
...
 

 

Error/status checks are omitted above for shorter presentation.  However, especially for larger signal sizes, checking the status of the GetSize and Init steps can be important.  There is no simple formula to determine the largest supported size. 

The goal of this article is to provide a starting point for developing applications using IPP's 1D Fourier transform functions.  For more please see the IPP reference manual.  

有关编译器优化的更完整信息,请参阅优化通知