optimize 1D FFT performance


I am trying to apply a 1D FFT to a 3D matrix along a single direction. Below is the code I am currently using. It uses a nested loop to iterate over the other two dimensions. It works, but I am wondering whether there is any way to speed it up. The size of the FFT is typically under 1024 points.

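! One descriptor for a 1D double-precision complex FFT of length nFFT
status = DftiCreateDescriptor(hFFT,DFTI_DOUBLE,DFTI_COMPLEX,1,nFFT)
status = DftiCommitDescriptor(hFFT)

! Transform along the first dimension for every (i,j) pair
do j = 1,nz
    do i = 1,ny
        status = DftiComputeForward(hFFT,datarel(:,i,j),dataimg(:,i,j))
    end do
end do

status = DftiFreeDescriptor(hFFT)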



Best Reply


The nested loop looks OK to me. As you can see from https://software.intel.com/en-us/node/433474#FFT, a 1024-point 1D complex FFT is not multithreaded.

So if you are working on a multi-core machine, you can try multithreading the batch of 1D 1024-point FFTs yourself, by any method. One example is in the MKL User's Guide:

Examples of Using Multi-Threading for FFT Computation => Using Parallel Mode with a Common Descriptor
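
As a rough sketch of what the common-descriptor approach could look like for the loop in the question: the descriptor is created and committed once, then shared by all OpenMP threads, and each thread transforms its own (i,j) columns. The subroutine name fft_dim1_parallel and its argument list are made up for illustration. The DftiSetValue call for DFTI_REAL_REAL storage is an assumption, since the original snippet does not show how the split real/imaginary storage was configured. This is not the code from the User's Guide, only an adaptation of the idea.

```fortran
! Sketch only: assumes Intel MKL (module MKL_DFTI, from mkl_dfti.f90)
! and a build with OpenMP enabled.
! fft_dim1_parallel is a hypothetical name, not an MKL routine.
subroutine fft_dim1_parallel(datarel, dataimg, nFFT, ny, nz)
    use MKL_DFTI
    implicit none
    integer, intent(in)    :: nFFT, ny, nz
    real(8), intent(inout) :: datarel(nFFT, ny, nz), dataimg(nFFT, ny, nz)
    type(DFTI_DESCRIPTOR), pointer :: hFFT
    integer :: status, i, j

    ! Create one descriptor that every thread will share.
    status = DftiCreateDescriptor(hFFT, DFTI_DOUBLE, DFTI_COMPLEX, 1, nFFT)
    ! Assumption: real and imaginary parts live in separate arrays,
    ! so split (real/real) complex storage is requested.
    status = DftiSetValue(hFFT, DFTI_COMPLEX_STORAGE, DFTI_REAL_REAL)
    status = DftiCommitDescriptor(hFFT)

    ! Each (i,j) column is an independent in-place transform, so the two
    ! loops can be collapsed and divided among the threads.
    !$omp parallel do collapse(2) private(status) shared(hFFT, datarel, dataimg)
    do j = 1, nz
        do i = 1, ny
            status = DftiComputeForward(hFFT, datarel(:, i, j), dataimg(:, i, j))
        end do
    end do
    !$omp end parallel do

    status = DftiFreeDescriptor(hFFT)
end subroutine fft_dim1_parallel
```

The status values are ignored here to keep the sketch short; real code should check them. Older MKL releases also documented a DFTI_NUMBER_OF_USER_THREADS setting for sharing a descriptor between threads, so it is worth checking the User's Guide for your version.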



Best Regards,

