Slow FFT performance on OS X (but not Windows)

Greetings everybody.  I'm experiencing performance problems with MKL's FFT, but only on OS X.

I have a C++ project in use on both Windows and OS X.  Initially, I had built it to use FFTW, but several months ago, I switched to using MKL for the FFT calculations on the Windows build, still using the FFTW3 interface.  On Windows with a Core 2 Duo processor, the performance difference between MKL and FFTW is within 10%, which is fine.  I purchased a license for MKL on Windows and have been happy.

However, when the same code is built on OS X, using the eval versions of ICC and MKL, FFTs using MKL are about 2.2 times slower than when linked to FFTW (again using an Intel processor, this time an i7).  For example, when linked to FFTW, execution time is 6 seconds; after switching the link to MKL (with no other changes), execution time is 14 seconds.

I've tried various combinations of static linking, dynamic linking, enabling threading in MKL, setting MKL to sequential mode, and so on.  (The project uses pthreads for its threading, if that matters.) The only thing that seems to make a difference is the compiler. If I use LLVM instead of ICC to build, overall performance is about 20% worse, but the difference between MKL and FFTW remains.

To reiterate, the same code with the same input data runs at roughly the same speed on Windows when using either MKL or FFTW, but on OS X, with the same code and input data as on Windows, FFTs with MKL are about half as fast as with FFTW.

Anybody have any ideas why MKL's FFT would be so much slower on OS X?

Thanks!

Jeff

Hi Jeff,

Could you help us reproduce the issue you report? Specifically, could you tell us exactly which CPU you have under OS X, which version of MKL you use, and what a representative FFT problem looks like (precision, type, rank, dimensions, placement)?

Thanks
Dima
 

Hi Dima,

Sure. The CPU is an i7-2677M at 1.8 GHz, the version of OS X is 10.7.2, the version of MKL is 10.3.11, and a representative FFT problem is: single-precision, complex, rank-2, in-place, 3000x2000 elements.

Thanks,

Jeff

Best Reply

Hi Jeff,

I am very sorry to tell you that MKL 10.3.11 for OS X is missing the optimized non-power-of-2 FFTs for AVX.
MKL 10.3.12 fixes this issue. In MKL 10.3.10 the issue is less severe.

Thanks
Dima

Hi Dima,

Thanks for identifying the cause.  I'll watch for 10.3.12, and in the meantime I'll give 10.3.10 a try.

Thanks,

Jeff
