I am a newbie to MKL and am trying out the 10.0.011 FFT routines with gcc as my compiler. My PC is an Intel Core 2 machine, and MKL does report a maximum of 2 threads. My test code itself is not threaded.
I've run FFTs ranging from 8192 points to 262144 points. When the batch size is 1 and I use mkl_set_num_threads to change the number of threads MKL may use, I do not see any performance change. I've tried settings of 1, 2 and 4 threads.
If I change the batch size to 2, 4, 8 or 16, I see the best performance with the 2-thread setting. I am not surprised by this, as there are only 2 cores on my PC. However, if I monitor CPU usage with gnome-system-monitor, I only see one core at a time being used at or close to 100%. The other core only very occasionally shows high usage.
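To cross-check what gnome-system-monitor shows, one option is to compare the process CPU time against the wall-clock time around the FFT calls. A ratio near 2 would suggest both cores were busy, and a ratio near 1 that only one was. Below is a minimal sketch in C, assuming a POSIX system with clock_gettime. The names placeholder_workload and cpu_to_wall_ratio are hypothetical, and the loop only stands in for the actual MKL FFT calls.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Convert a timespec to seconds as a double. */
double ts_seconds(struct timespec t)
{
    return (double)t.tv_sec + (double)t.tv_nsec * 1e-9;
}

/* Hypothetical stand-in for the FFT batch (an assumption, not MKL code).
 * Replace the body with the real FFT calls being measured. */
void placeholder_workload(void)
{
    volatile double acc = 0.0;
    for (long i = 0; i < 20000000L; ++i)
        acc += (double)i * 0.5;
}

/* Run work() once and return (process CPU time) / (wall-clock time).
 * CLOCK_PROCESS_CPUTIME_ID accumulates CPU time over all threads of the
 * process, so a fully used second core pushes the ratio towards 2.
 * If wall_out is non-NULL, the elapsed wall-clock seconds are stored there. */
double cpu_to_wall_ratio(void (*work)(void), double *wall_out)
{
    struct timespec w0, w1, c0, c1;

    assert(work != NULL);

    clock_gettime(CLOCK_MONOTONIC, &w0);
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &c0);
    work();
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &c1);
    clock_gettime(CLOCK_MONOTONIC, &w1);

    double wall = ts_seconds(w1) - ts_seconds(w0);
    double cpu  = ts_seconds(c1) - ts_seconds(c0);

    if (wall_out != NULL)
        *wall_out = wall;

    /* Guard against a zero-length interval on very coarse clocks. */
    return wall > 0.0 ? cpu / wall : 0.0;
}
```

Calling cpu_to_wall_ratio(placeholder_workload, &wall) from main and printing the result for each thread setting would give a number to compare against the monitor. On older glibc versions, clock_gettime may require linking with -lrt.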
First, can someone tell me whether a batch size of 1 should also experience some mkl threading? From the manual I assumed that the only time that a 1D FFT of batch size 1 would not thread is if its size is not a power of 2.
Second, can you tell me why one of my CPU cores is barely being utilized? Do I need to link against the OpenMP libraries to get both cores going with MKL? I do not set any thread-related environment variables, as I call mkl_set_num_threads directly from my code.