I'm evaluating the MKL 7.0 library. My project is in MS VC6.
I've replaced my own matrix-matrix multiplication (used in neural net computation) with SGEMM in our DLL (it's an SDK). On a P4, I get a gain of nearly 30% :-)
As it's an SDK, its user can choose to create many threads, each using one of our objects. I also can't make any assumptions about the multithreading library the user will choose (Win32, OMP...).
I've seen that there is a default limit of 32 threads in SGEMM (controlled by the ENV variable KMP_ALL_THREADS).
I've raised this limit to 64 with a 'putenv' in the initialization portion of my DLL.
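For reference, this is roughly what the initialization code does. It is only a minimal sketch: the helper name sdk_set_thread_limit and the SDK_PUTENV macro are made up for this post, and I'm assuming the variable has to be set before the first MKL call for the threading runtime to pick it up.

```c
#include <stdlib.h>

/* MSVC's CRT spells the function _putenv; POSIX systems use putenv. */
#ifdef _WIN32
#define SDK_PUTENV _putenv
#else
#define SDK_PUTENV putenv
#endif

/* Hypothetical helper, meant to be called once from the DLL's
 * initialization code, before any SGEMM call is made.
 * Returns 0 on success, non-zero on failure. */
static int sdk_set_thread_limit(void)
{
    /* Static storage: POSIX putenv keeps a pointer to this buffer
     * rather than copying it, so it must outlive the call. */
    static char setting[] = "KMP_ALL_THREADS=64";
    return SDK_PUTENV(setting);
}
```

The value 64 is hard-coded here only because that is what I'm currently testing with.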
But I wonder: does setting this limit higher (let's say 64, or 128 at most) cause a performance penalty when that many threads are not required? Let's say I create 16 threads: will changing KMP_ALL_THREADS to 128 cause any harm compared to leaving it at 32?
Also, on a multi-CPU machine, what are the best MKL/KMP/OMP environment variable settings? Again, I can't make any assumptions about the multithreading library our user will finally choose.