Setting number_of_user_threads for Intel® Math Kernel Library FFTW3 wrappers

Consider the case when you:

  • Create an FFTW3 plan and use the plan for sequential DFT computation on each thread in your parallel region
  • Use Intel Math Kernel Library (Intel MKL) FFTW3 wrappers
  • Want the best performance

Intel MKL FFTW3 wrappers are thread safe by default. However, to get the best performance when several threads share one plan, you should set one additional Intel MKL variable, number_of_user_threads, as described below.

In C:

#include "fftw3.h"

/* Added for Intel MKL wrappers to set number of user threads */
#include "fftw3_mkl.h"

/*nthreads -- number of threads sharing the same plan; should be set before the plan is created*/
fftw3_mkl.number_of_user_threads = nthreads;
plan = fftw_plan_dft(...);
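
The fragment above can be expanded into a fuller sketch of the shared-plan pattern. This is an illustration under stated assumptions, not the code from the attached examples: the function name run_shared_plan, the sizes N and NTHREADS, the OpenMP loop, and the __has_include guard (which only lets the file compile where the MKL headers are absent) are all hypothetical choices. Each thread runs the same plan on its own slice of the buffers through the new-array execute function fftw_execute_dft.

```c
#include <math.h>
#include <stdio.h>

/* Hypothetical guard: build the MKL path only when the wrapper header is visible. */
#if defined(__has_include)
#  if __has_include("fftw3_mkl.h")
#    define HAVE_MKL_FFTW3 1
#  endif
#endif

#ifdef HAVE_MKL_FFTW3
#include "fftw3.h"
#include "fftw3_mkl.h"   /* declares the global structure fftw3_mkl */
#endif

enum { N = 1024, NTHREADS = 4 };   /* illustrative sizes, not from the article */

/* Returns 0 on success, 1 on failure, -1 when built without the wrappers. */
int run_shared_plan(void)
{
#ifdef HAVE_MKL_FFTW3
    fftw_complex *in  = fftw_malloc(sizeof(fftw_complex) * N * NTHREADS);
    fftw_complex *out = fftw_malloc(sizeof(fftw_complex) * N * NTHREADS);
    if (!in || !out) {
        if (in)  fftw_free(in);
        if (out) fftw_free(out);
        return 1;
    }

    /* Constant input: every transform should produce N in bin 0. */
    for (int i = 0; i < N * NTHREADS; ++i) {
        in[i][0] = 1.0;
        in[i][1] = 0.0;
    }

    /* Must be set before the plan is created. */
    fftw3_mkl.number_of_user_threads = NTHREADS;
    fftw_plan plan = fftw_plan_dft_1d(N, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
    if (!plan) {
        fftw_free(in);
        fftw_free(out);
        return 1;
    }

    /* One shared plan; each iteration (thread) works on its own slice.
       Without OpenMP enabled the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for num_threads(NTHREADS)
    for (int t = 0; t < NTHREADS; ++t) {
        fftw_execute_dft(plan, in + t * N, out + t * N);
    }

    int ok = 1;
    for (int t = 0; t < NTHREADS; ++t) {
        if (fabs(out[t * N][0] - (double)N) > 1e-9) ok = 0;
    }

    fftw_destroy_plan(plan);
    fftw_free(in);
    fftw_free(out);
    return ok ? 0 : 1;
#else
    return -1;
#endif
}
```

Calling run_shared_plan() from main and checking for a zero return is enough to exercise the sketch. The value assigned to number_of_user_threads should match the number of threads that actually share the plan.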

In Fortran:

1. Fortran programs should declare the global structure defined in mkl/include/fftw/fftw3_mkl.h (your compiler must support the BIND statement):

!DIR$ ATTRIBUTES ALIGN : 8 :: fftw3_mkl
COMMON/fftw3_mkl/ignore(4),mkl_dft_number_of_user_threads,ignore2(7)
INTEGER*4 :: ignore, mkl_dft_number_of_user_threads, ignore2
BIND (c) :: /fftw3_mkl/     

2. After the declaration, set the number of threads that will concurrently share an FFTW plan. This must be done before the plan is created with any of the *fftw_plan_* functions:

mkl_dft_number_of_user_threads = nthreads

call dfftw_plan_dft_1d(...)

The attached examples demonstrate setting number_of_user_threads in both C and Fortran.

Note that this hint applies to FFTW3 wrappers only, not to FFTW2 wrappers. To get a performance advantage with FFTW2 wrappers, you should create a separate plan for each thread.

Attachment    Size
dp.c          6.07 KB
sp.c          6.09 KB
dp.f90        5.9 KB
sp.f90        5.89 KB

2 comments

Vinutha V (Intel)

Hi,

I have an issue with the FFTW MKL functions, which do not seem to be threaded. I compiled natively for Xeon Phi.

How can I get these MKL functions (FFTW_EXECUTE_DFT_C2R and FFTW_EXECUTE_DFT_R2C) threaded?

Thanks,

Vinutha

Maciej O.

Thanks for the article. It would be great if you could update the MKL reference manual too, as it currently says:

FFTW3 wrappers are not fully thread safe. If the new-array execute functions, such as fftw_execute_dft(), share the same plan from parallel user threads, set the number of the sharing threads before creation of the plan. For this purpose, the FFTW3 wrappers provide a header file fftw3_mkl.h, which defines a global structure fftw3_mkl with a field to be set to the number of sharing threads. Below is an example of setting the number of sharing threads
