BLAS call inside OMP parallel block


Suppose I have a machine on which I can run, say, 8 simultaneous threads. Now suppose I have 2 distinct "omp parallel" blocks. I want to use all 8 threads in both blocks.

However, one parallel block has BLAS calls. The other does not. If I have 8 threads running in these 2 blocks, will MKL try to parallelize the BLAS call when it is reached? I'm afraid that this will slow things down, because there are no more threads available on my machine. I would still like to take advantage of the MKL BLAS speedup, but I do not want it to multi-thread, because I am already multi-threading at a higher level.

How do I control this? I understand that I have the environment variables OMP_NUM_THREADS and MKL_NUM_THREADS. However, the BLAS routines are not called in every parallel block, and it is my understanding that these environment variables are only read once.

Can somebody comment on how this threading can be controlled?




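If you don't set OMP_NESTED or call omp_set_nested, nested parallelism is disabled by default: a parallel region encountered inside another parallel region is executed by the encountering (parent) thread alone, so the lower level of nested OpenMP calls runs serially. To change this, call omp_set_nested before the omp parallel region.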
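A minimal sketch of that behavior, assuming plain OpenMP and no MKL: the inner "omp parallel" stands in for a threaded library call made from inside the outer region. The function name max_inner_team_size is hypothetical, and the sketch uses omp_set_max_active_levels(1) rather than omp_set_nested(0), since newer OpenMP versions deprecate the latter.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallbacks so the sketch still builds when OpenMP is not enabled. */
static int omp_get_num_threads(void) { return 1; }
static void omp_set_max_active_levels(int n) { (void)n; }
#endif

/* Hypothetical helper: returns the largest team size observed by any
 * inner (nested) parallel region started from the outer region. */
int max_inner_team_size(int outer_threads)
{
    int max_inner = 0;
    (void)outer_threads;  /* unused if OpenMP is not enabled */

    /* Allow only one active level of parallelism, so nested regions
     * are executed by the encountering thread alone. */
    omp_set_max_active_levels(1);

    #pragma omp parallel num_threads(outer_threads)
    {
        int inner = 0;  /* private to each outer thread */

        /* Stand-in for a threaded library call inside the outer region. */
        #pragma omp parallel
        {
            #pragma omp single
            inner = omp_get_num_threads();
        }

        #pragma omp critical
        if (inner > max_inner) max_inner = inner;
    }
    return max_inner;
}
```

Calling max_inner_team_size(8) from main and printing the result should show that each inner region ran on a single thread, whatever the size of the outer team. For MKL itself, mkl_set_num_threads(1) is another way to request sequential BLAS calls explicitly; whether it is needed depends on the MKL threading layer linked in, which this thread does not specify.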

Just for clarification: is the default behavior the same as setting OMP_NESTED=0?


Matara Ma

Yes. If OMP_NESTED is not set, nested parallelism is disabled by default.
