Separate monitor thread

rmf166

Hello,

First off, I think I mistakenly posted this under "Open source OpenMP":

http://software.intel.com/en-us/forums/topic/497456

I am using the Intel Composer Fortran Compiler 14.0.0.

What is the purpose of the separate monitor thread OpenMP creates?

See http://software.intel.com/en-us/articles/threading-fortran-applications-...

In my Fortran application, the additional thread is always spawned, even after calling OMP_SET_NUM_THREADS(1).

Granted, it doesn't look like it does much, per the Linux "ps -L" command, but I haven't seen any easily accessible information describing the purpose of the additional thread at a high level.

Thanks in advance.

Andrey Churbanov (Intel)

Hi!

Could you please share a test case that demonstrates the problem (a small one, if possible)? As I already replied to you in the previous forum, this is unexpected behavior that might be caused by a bug in the OpenMP runtime. Alternatively, you may be observing some other thread rather than the monitor thread launched by the OpenMP runtime. It is hard to say without a test case.

Thanks,
Andrey

rmf166

Andrey,

Okay, I guess the following code might work

      program parallel

!$    use omp_lib

      implicit none

      integer(4)            :: i, j, k
      integer(4), parameter :: nmax=500
!$    integer(4)            :: nthreads
      real(8)               :: a(2,nmax,nmax,nmax)

      ! Initialize
      a=1.0d0

!$    nthreads=omp_get_max_threads()

!$    call omp_set_num_threads(nthreads)

!$    write(*,*) 'NTHREADS= ', nthreads

!$omp parallel do private(i,j,k) reduction(+:a)
      do i=1, nmax
        do j=1, nmax
          do k=1, nmax
            a(2,i,j,k)=a(2,i,j,k)+a(1,i,j,k)
          end do
        end do
      end do
!$omp end parallel do

      end program parallel

I compiled as follows

$ make
ifort -O -openmp -openmp-link static -o test.exe main.f90

and ran the code

$ ./test.exe &

NTHREADS=            4

$ ps -L
  PID   LWP TTY          TIME CMD
 9517  9517 pts/8    00:00:00 csh
 9783  9783 pts/8    00:00:01 test.exe
 9783  9784 pts/8    00:00:00 test.exe
 9783  9785 pts/8    00:00:01 test.exe
 9783  9786 pts/8    00:00:01 test.exe
 9783  9787 pts/8    00:00:01 test.exe
 9788  9788 pts/8    00:00:00 ps

There is a single process ID (PID) and 4 threads were requested, but the ps threads option (-L) shows 5 LWPs.
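To check a count like this without reading the ps -L listing by eye, a small script can list the entries under /proc/<pid>/task, which on Linux holds one entry per thread (LWP). This is a minimal sketch, assuming a Linux /proc filesystem; the name count_lwps is hypothetical and not part of any tool mentioned in this thread.

```python
import os
import threading


def count_lwps(pid="self"):
    """Return the number of threads (LWPs) of a process on Linux.

    Each thread appears as one directory under /proc/<pid>/task.
    Pass a numeric PID to inspect another process, e.g. count_lwps(9783).
    """
    return len(os.listdir(f"/proc/{pid}/task"))


if __name__ == "__main__":
    before = count_lwps()
    stop = threading.Event()
    # Start three extra threads that simply block until released.
    workers = [threading.Thread(target=stop.wait) for _ in range(3)]
    for w in workers:
        w.start()
    print("LWPs before:", before, "after:", count_lwps())
    stop.set()
    for w in workers:
        w.join()
```

Run against the PID of the OpenMP program while it is still executing, this should report the same number of LWPs that ps -L lists for that PID.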

 

Andrey Churbanov (Intel)

Hi,

I tried your example with the following result:

$ OMP_NUM_THREADS=4 ./a.out &
 NTHREADS=            4

$ ps -L
   PID    LWP TTY          TIME CMD
 63182  63182 pts/2    00:00:01 a.out
 63182  63183 pts/2    00:00:00 a.out
 63182  63184 pts/2    00:00:00 a.out
 63182  63185 pts/2    00:00:00 a.out
 63182  63186 pts/2    00:00:00 a.out
 63187  63187 pts/2    00:00:00 ps

$ OMP_NUM_THREADS=1 ./a.out &
 NTHREADS=            1

$ ps -L
   PID    LWP TTY          TIME CMD
 63190  63190 pts/2    00:00:01 a.out
 63191  63191 pts/2    00:00:00 ps

So I see the expected behavior of the OpenMP runtime: it creates 4 working threads plus the monitor thread for parallel execution, and it creates no additional threads for serial execution.

The purpose of the monitor thread is time bookkeeping, which the working threads use at barriers.

Regards,
Andrey
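The reply above does not describe how the time bookkeeping works, so the following is only an illustration of one general pattern, not the Intel runtime's actual implementation. It assumes a dedicated thread that advances a shared tick counter, and waiting threads that compare against that counter to decide when to stop spinning and block. The names TickMonitor, wait_at_barrier, and spin_ticks are all hypothetical.

```python
import threading
import time


class TickMonitor:
    """Hypothetical monitor thread: it alone advances a shared tick counter."""

    def __init__(self, interval=0.01):
        self.interval = interval  # seconds between ticks
        self.ticks = 0            # written only by the monitor thread
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        # Event.wait returns False on timeout, so this adds one tick per
        # interval until stop() is called.
        while not self._stop.wait(self.interval):
            self.ticks += 1


def wait_at_barrier(monitor, released, spin_ticks):
    """Spin for up to spin_ticks monitor ticks, then block until released.

    Returns "spun" if the release was seen while spinning, "slept" otherwise.
    """
    deadline = monitor.ticks + spin_ticks
    while monitor.ticks < deadline:
        if released.is_set():
            return "spun"
        time.sleep(0)  # yield so other threads, including the monitor, can run
    released.wait()    # spin budget used up: block until released
    return "slept"


if __name__ == "__main__":
    monitor = TickMonitor(interval=0.005)
    monitor.start()
    released = threading.Event()
    threading.Timer(0.5, released.set).start()  # release arrives late
    print(wait_at_barrier(monitor, released, spin_ticks=2))
    monitor.stop()
```

In this sketch the waiting threads only read a counter, so none of them has to be taken away from other work to keep time, which is the trade-off discussed in the replies that follow.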

jimdempseyatthecove

Andrey

Why can't the first thread to reach the barrier perform any desired bookkeeping?
(This would save a context switch.)

Jim Dempsey

www.quickthreadprogramming.com
Andrey Churbanov (Intel)

Jim,

The problem is that when OMP tasking is involved, all working threads waiting at a barrier execute tasks. It is probably possible to combine task execution with time bookkeeping, but it does not look like an easy project. If we dedicated one of the working threads exclusively to time bookkeeping, this would have a significant performance impact.

Regards,
Andrey

jimdempseyatthecove

Presumably, when OMP is tasking, you do not preempt a task, so barrier bookkeeping can be done by any thread before or after each task steal. The problem (resulting from tasking) then becomes that you cannot get all the threads entering the barrier to resume at approximately the same time if any of them are off performing a task. For algorithms requiring synchronicity, task stealing is bad news. This means that if you are using the OMP task model, you probably should NOT use barriers. Or, if you require barriers, consider the implications of adding tasking.

Jim Dempsey

www.quickthreadprogramming.com
