Behaviour of omp_set_num_threads/omp_get_max_threads

In my main thread, omp_get_max_threads() returns 4, as expected.

I call omp_set_num_threads(1).

omp_get_max_threads() now returns 1, as expected.

If I launch a separate "worker thread" and call omp_get_max_threads() inside that thread, I get 4. This seems unexpected.

If you call omp_set_num_threads in a parallel region, the change would not appear outside that region until the parallel region ends. You would need to give more detail about your expectation.

You are misunderstanding what I am doing. (This is on Windows.)

Here is a skeleton of my code. Assume OMP_NUM_THREADS=4 is set in the environment, on a 4-core machine.

DWORD WINAPI ThreadProc(LPVOID lpThreadParameter)
{
    int n = omp_get_max_threads(); // <--- this returns 4. Not what I would expect.
    // do some omp stuff....
    return 0;
}

// In the main thread:
int n = omp_get_max_threads();  // <--- this returns 4
omp_set_num_threads(1);
n = omp_get_max_threads();      // <--- this returns 1... as expected
CreateThread(NULL,0,ThreadProc,....); // <--- start a new thread.

I found that I had to modify my code to call omp_set_num_threads(1) again inside ThreadProc().

If I were to guess, OpenMP uses thread-local storage for "num_threads". Each thread can set this to whatever it wants. The thread initialization code apparently obtains the initial copy from the environment (just as the main thread does). I also assume the thread-local storage contains a one-shot flag telling the omp_... routines to initialize the OpenMP portion of the TLS on first use.

What you need to do then is:

	   int myMaxThreads = 1; // whatever
	   #pragma omp parallel
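
A minimal sketch of that fix, under some assumptions: run_worker_region is a hypothetical helper (not from the code above) meant to be called at the top of a user-created thread's procedure such as ThreadProc, and the serial fallbacks in the #else branch are only there so the sketch also builds when OpenMP is not enabled. It sets the thread count for the calling thread's own task before opening a parallel region and reports the team size that resulted.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (an assumption of this sketch) so it builds without OpenMP. */
static void omp_set_num_threads(int n) { (void)n; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: call it from the thread procedure (e.g. ThreadProc). */
int run_worker_region(int my_max_threads)
{
    int team_size = 0;

    /* Set the thread count for the calling thread's own task. A value set
       earlier in the main thread is not visible in a user-created thread. */
    omp_set_num_threads(my_max_threads);

    #pragma omp parallel
    {
        /* One thread records how many threads the team actually has. */
        #pragma omp single
        team_size = omp_get_num_threads();

        /* ... per-thread OpenMP work would go here ... */
    }
    return team_size;
}
```

An alternative that avoids touching the setting at all would be to put a num_threads(my_max_threads) clause on the parallel directive itself.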

Yup, that's what I figured. Still, it's a trap that's easy enough to fall into.

Another "trap" for the newbie is that omp_get_thread_num() returns the team member number for the current parallel region (not necessarily a globally unique number). If you use nested parallelism, each new team starts out with omp_get_thread_num() == 0, then 1, 2, ... for each additional team member. In other words, depending on the context, you may have several (many) threads with the same omp_get_thread_num() value.

Jim Dempsey
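
The nested-parallelism point can be sketched roughly as follows. The names count_inner_zero_ids and struct nested_ids are hypothetical, the team sizes of 2 are arbitrary, and the #else fallbacks are an assumption so the sketch builds without OpenMP. It opens an outer team, lets every outer thread open its own inner team, and counts how many inner threads report omp_get_thread_num() == 0.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (an assumption of this sketch) so it builds without OpenMP. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
static void omp_set_max_active_levels(int levels) { (void)levels; }
#endif

/* Hypothetical result type for this sketch. */
struct nested_ids { int outer_team_size; int inner_zero_count; };

struct nested_ids count_inner_zero_ids(void)
{
    int outer = 0;
    int zeros = 0;
    struct nested_ids r;

    /* Allow two active levels so the inner regions may get their own teams. */
    omp_set_max_active_levels(2);

    #pragma omp parallel num_threads(2)
    {
        #pragma omp single
        outer = omp_get_num_threads();

        /* Every outer thread becomes thread 0 of its own inner team. */
        #pragma omp parallel num_threads(2)
        {
            if (omp_get_thread_num() == 0) {
                #pragma omp atomic
                zeros++;
            }
        }
    }

    r.outer_team_size = outer;
    r.inner_zero_count = zeros;
    return r;
}
```

Each inner team has exactly one thread numbered 0, so the count equals the number of outer threads: the id alone does not identify a thread once teams are nested.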

I would add a couple of comments on the issue.

According to the OpenMP specification, the behavior of an OpenMP program is controlled by internal control variables (ICVs). Many of them, including nthreads-var, are task-specific (in other words, there is one copy per data environment). Changing the value in one task does not affect other OpenMP tasks.

Next, the CreateThread routine has nothing to do with OpenMP, so the OpenMP task initialized in the newly created thread knows nothing about the thread's origin: who created it, when, or where. Thus the initial implicit task is initialized using the default set of ICVs, which was not affected by the call to omp_set_num_threads in one of the existing tasks. By contrast, when a parallel region is encountered, the new threads are created by the OpenMP implementation, and their ICVs are inherited from the parent task according to the rules described in the section "How the Per-Data Environment ICVs Work" of the OpenMP 4.0 specification. That is where the difference between user-created threads and OpenMP implementation-created threads comes from.

To summarize - the behavior of the Intel compiler is perfectly legal in this case.
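
For threads created by the OpenMP implementation, the inheritance rule described above can be checked with a small sketch. check_icv_inheritance and struct icv_report are hypothetical names, the values 2 and 3 are arbitrary, and the #else fallbacks are an assumption so the sketch builds without OpenMP. It verifies that every implicit task in a parallel region starts with the parent's value, and that a change made inside the region does not leak back to the parent task.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (an assumption of this sketch) so it builds without OpenMP. */
static void omp_set_num_threads(int n) { (void)n; }
static int omp_get_max_threads(void) { return 1; }
#endif

/* Hypothetical result type for this sketch. */
struct icv_report { int inherited_ok; int parent_unchanged; };

struct icv_report check_icv_inheritance(void)
{
    struct icv_report r;
    int mismatches = 0;
    int before;

    omp_set_num_threads(2);
    before = omp_get_max_threads();   /* value held by the encountering task */

    #pragma omp parallel
    {
        /* Each implicit task starts with a copy inherited from the parent. */
        if (omp_get_max_threads() != before) {
            #pragma omp atomic
            mismatches++;
        }
        /* This changes only the calling implicit task's own copy. */
        omp_set_num_threads(3);
    }

    r.inherited_ok = (mismatches == 0);
    r.parent_unchanged = (omp_get_max_threads() == before);
    return r;
}
```

A thread started with CreateThread gets no such inherited copy, which is why the value has to be set again inside the thread procedure.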

