FFT interface versus DFT

FFT interface versus DFT

Hello everybody !

I'm evaluating the MKL library, and I'm wondering if there is any performance benefits from using the FFT interface for 'power of two' arrays instead of using the advanced DFT interface.

The manual advice the use of the second one, but I couldn't find any performance related references.

Thank you,

6 post / 0 nuovi
Ultimo contenuto
Per informazioni complete sulle ottimizzazioni del compilatore, consultare l'Avviso sull'ottimizzazione
Ritratto di Community Admin

Underneath both the legacy FFT and the new DFT use the same kernels. However, that will not necessarily be true in the future. Except for bugs, the FFTs will not likely be updated to use new processor architecture features. It is likely that in time performance differences will develop between the two sets of functions, with the DFT running better.

Of course there are many advantages using the newer interface since there are so many additional features with it. Just a few of these include the ability to perform transforms on submatrices, setting your own scaling factor, performing not-in-place transforms, and so on. These additional features can not only make life easier for the developer, they can also improve performance by reducing the number of operations that have to be performed on the data outside of the transforms.

Thank you !

I hoped this will be the case - just wanted to make sure.

I'll try a second question on the DFT performance (I'm wondering if I should start another topic):

Running some benchmarks I realized that MKL DFT spent no time on what I called the preparation phase - that is calling DftiCreateDescriptor / DftiSetValue / DftiCommitDescriptor.

Other DFT libraries usually perform some computation on the preparation phase (ex. computing twiddling factors), to gain performance for the transformation.

If I want to apply the same transformation (same dimensions, scale and sign) couples of times on different data, it is reasonable to wait a bit more at the beginning if this improves the overall performances.

So I'm wandering if there is any steps I missed that will reduce the computation time during DftiCompute* routines. As an example, I tried to compute a 2D, 1/N scaled, forward transformation. Here are the steps I followed:
One time - preparation phase:
DftiSetValue for DFTI_FORWARD_SCALE -> 1.0/n1/n2

Many times - execution phase:


Ritratto di Community Admin

Certainly with the older FFTs (radix-2) the user had responsibility to create the twiddle factor table to be used by the 1D transforms. (You had to call the function with the dimension of the transform, provide a vector for the twiddle factors and then call the transform with the supplied twiddle factor table.) On 2D transforms, we computed the twiddle factors at run time. Of course there are two reasons for this. First, it was a more complex arrangement because of the two dimensionality of the problem, and the relative cost of computing the factors was much lower since they would be reused m and n times.

I believe we still follow that practice with the DFTs. However, I will specifically check on this and follow up on this question.

Thank you ! I really appreciate this.


Ritratto di Community Admin


The deveoper says that when you commit, the twiddle factors are created and reused on all subsequent calls for that set of DFT parameters. Here, specifically was the response I got:

Twiddle factors are created once on initialization stage (in DftiCreateDescriptor() and DftiCommitDescriptor() ).

After that user can call the computation functions with this descriptor many time.

Twiddle factors are destroyed in DftiFreeDescriptor().


Accedere per lasciare un commento.