Consider the case when you:
- Create an FFTW3 plan and use that plan for sequential DFT computation on each thread in your parallel region
- Use Intel Math Kernel Library (Intel MKL) FFTW3 wrappers
- Want the best performance
Intel MKL FFTW3 wrappers are thread safe by default. However, to get the best performance you should also set one Intel MKL variable, number_of_user_threads, to the number of threads that share the plan. C programs set it through the fftw3_mkl global structure:
#include "fftw3.h"
#include "fftw3_mkl.h"  /* declares the fftw3_mkl global structure */
/* Added for Intel MKL wrappers: set the number of user threads. */
/* nthreads is the number of threads sharing the same plan; set it before the plan is created. */
fftw3_mkl.number_of_user_threads = nthreads;
plan = fftw_plan_dft(...);
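The setting above could be combined with an OpenMP parallel region roughly as follows. This is a minimal sketch, not one of the attached examples. The helpers thread_range and transform_all are hypothetical names introduced for illustration. The USE_MKL_FFTW3 macro is an assumed build switch (for instance -DUSE_MKL_FFTW3 together with your OpenMP and Intel MKL link options) that keeps the FFTW-dependent part out of builds without the wrappers. The sketch also assumes that the wrappers accept the standard FFTW3 new-array call fftw_execute_dft, so that each thread can apply the shared plan to its own slice of the data.

```c
#include <assert.h>
#include <stddef.h>

#ifdef USE_MKL_FFTW3            /* assumed build switch, not an Intel MKL macro */
#include <omp.h>
#include "fftw3.h"
#include "fftw3_mkl.h"
#endif

/* Hypothetical helper: compute the half-open range [*begin, *end) of
 * transforms handled by thread tid when howmany transforms are split
 * as evenly as possible across nthreads threads. */
static void thread_range(int howmany, int nthreads, int tid, int *begin, int *end)
{
    assert(nthreads > 0 && tid >= 0 && tid < nthreads);
    int base = howmany / nthreads;    /* transforms every thread gets */
    int extra = howmany % nthreads;   /* the first `extra` threads get one more */
    *begin = tid * base + (tid < extra ? tid : extra);
    *end = *begin + base + (tid < extra ? 1 : 0);
}

#ifdef USE_MKL_FFTW3
/* Hypothetical driver: transform howmany contiguous length-n signals in
 * place, with all threads sharing a single plan. */
static void transform_all(fftw_complex *data, int n, int howmany, int nthreads)
{
    /* Must be set before the plan is created. */
    fftw3_mkl.number_of_user_threads = nthreads;
    fftw_plan plan = fftw_plan_dft_1d(n, data, data, FFTW_FORWARD, FFTW_ESTIMATE);

    #pragma omp parallel num_threads(nthreads)
    {
        int begin, end;
        thread_range(howmany, omp_get_num_threads(), omp_get_thread_num(),
                     &begin, &end);
        for (int i = begin; i < end; ++i) {
            /* Each thread runs the shared plan sequentially on its own signals. */
            fftw_complex *signal = data + (size_t)i * (size_t)n;
            fftw_execute_dft(plan, signal, signal);
        }
    }
    fftw_destroy_plan(plan);
}
#endif
```

A caller would allocate howmany * n elements (for example with fftw_malloc), fill them, and call transform_all(data, n, howmany, nthreads). Creating the plan once outside the parallel region is what makes the number_of_user_threads hint relevant: the value tells the wrappers how many threads will use that one plan concurrently.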
1. Fortran programs should declare the global structure that is defined in mkl/include/fftw/fftw3_mkl.h (your compiler must support the BIND statement):
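! The /fftw3_mkl/ common block bound below must also be declared with a COMMON
! statement whose layout matches the C structure in fftw3_mkl.h.
!DIR$ ATTRIBUTES ALIGN : 8 :: fftw3_mkl
INTEGER*4 :: ignore, mkl_dft_number_of_user_threads, ignore2
BIND (c) :: /fftw3_mkl/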
2. After the declaration, set the number of threads that will concurrently share an FFTW plan. Do this before the plan is created with any of the fftw_plan_* functions:
mkl_dft_number_of_user_threads = nthreads
The attached examples demonstrate setting number_of_user_threads in both C and Fortran.
Note that this hint applies to FFTW3 wrappers only, not to FFTW2 wrappers. To get a performance advantage with FFTW2 wrappers, create a separate plan for each thread.