OpenMP reduction problem

Hi,

I am using Intel(R) Visual Fortran Compiler XE 13.1.1.171.

Attached is a very simple program that sums up 1000 values.  If I do not use OpenMP, I always get the expected output of x = 1000.

If I compile with OpenMP (/Qopenmp) in Debug mode, the output is unpredictable.  Sometimes it is x=1000, but many times it is completely different.

In Release mode, I get unpredictable output only if I disable inlining (/Ob0).

Does anyone know if I am doing something wrong?  The only unusual thing is that the reduction variable x is incremented inside a contained subroutine.  The program works if x is incremented inside the main loop.

Roman

Attachment: test-reduction.f90 (425 bytes)

>>... The only unusual thing is that the reduction variable x is incremented inside a contained subroutine...

There is another issue: when OpenMP is used, the x seen by the contained subroutine is the global variable, so every OpenMP thread sees it and can change its value. If no synchronization is applied, the result is unpredictable (which is what you are seeing).
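One way to apply the synchronization mentioned above is to leave x shared and protect the update with an atomic construct. This is only a minimal sketch under that assumption: the program name test_atomic is made up for illustration, and the original attachment is not reproduced in the thread.

```fortran
program test_atomic
   implicit none
   integer :: i, x

   x = 0
   ! No reduction clause here: x is deliberately shared by all threads.
   !$omp parallel do default(none) private(i) shared(x)
   do i = 1, 1000
      call increment()
   end do
   !$omp end parallel do
   write(*,*) 'Expecting: x= 1000.  x=', x

contains

   subroutine increment()
      ! x is the host program's variable (host association), so every
      ! thread updates the same memory location. The atomic construct
      ! makes each read-modify-write indivisible.
      !$omp atomic
      x = x + 1
   end subroutine increment

end program test_atomic
```

The atomic update serializes the increments, so for a real workload a reduction on a properly passed variable is usually the cheaper choice.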

Yes, as Sergey hinted, the reduction clause applies only to the private copies of x that are used inside the parallel region itself. It does not apply to the x outside that region, which is the one visible to the internal subroutine and which is shared by all threads.
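For readers without the attachment, the pattern under discussion probably looks something like the sketch below. This is a reconstruction from the description in the question, not the actual test-reduction.f90, so the program name and layout are assumptions.

```fortran
program test_reduction_race
   implicit none
   integer :: i, x

   x = 0
   !$omp parallel do default(none) private(i) reduction(+:x)
   do i = 1, 1000
      ! The reduction clause gives each thread a private copy of x, but
      ! that copy is only used where x is referenced inside this loop.
      call increment()
   end do
   !$omp end parallel do
   write(*,*) 'Expecting: x= 1000.  x=', x

contains

   subroutine increment()
      ! Host association: this x is the program's original variable,
      ! not a thread's reduction copy. All threads update it without
      ! synchronization, which is a data race.
      x = x + 1
   end subroutine increment

end program test_reduction_race
```

Because the loop body never touches the private copies, they stay at zero, and the final value of x depends on how the unsynchronized updates happen to interleave.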

If you want increment() to use the x in the context of the reduction variable x inside the parallel region, then pass x as an argument to increment(x). This will pass the reference to the surrogate for x.
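A minimal sketch of that suggestion, keeping increment as a contained subroutine, might look like the following. The dummy argument name n and the program name are illustrative assumptions; the dummy is deliberately named differently from x so it is obvious that the routine no longer touches the host's x.

```fortran
program test_reduction_arg
   implicit none
   integer :: i, x

   x = 0
   !$omp parallel do default(none) private(i) reduction(+:x)
   do i = 1, 1000
      ! x is referenced inside the parallel loop, so the actual argument
      ! is the calling thread's private reduction copy.
      call increment(x)
   end do
   !$omp end parallel do
   write(*,*) 'Expecting: x= 1000.  x=', x

contains

   subroutine increment(n)
      integer, intent(inout) :: n
      ! Only the dummy argument is modified; the host's x is not
      ! referenced here.
      n = n + 1
   end subroutine increment

end program test_reduction_arg
```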

TimP (or more likely Steve if he reads this)

Consider the case in this program where the contained routine is inlined into the context of the OpenMP section. Which x would (should) be used then?
Jim Dempsey

www.quickthreadprogramming.com

Looks like another case for Nick Maclaren's misgivings about OpenMP.

In-lining optimization even without OpenMP already raised questions about data locality.  If in-lining affects this, it's likely to raise some bugs, as well as exposing others.

Thanks for the answers, it all makes sense now!  The part that initially confused me was that when I compiled in Release mode with inlining enabled, I always got the expected output.

Roman

Roman,

So your tests show that when the routine is not inlined, the scope of x is global (expected), but when it is inlined, the scope of x is that of the point at which it is inlined.

My position is this is a bug. The code should behave from the context of the source (not the context when inlining occurs).

Can you run a test where x and the contained routine are inside a module (and where the contained routine is inlined)? This test is slightly different from the case where x and the contained routine are in the scope of "PROGRAM".

Jim Dempsey

Jim,

I did what you suggested.  The new code is attached.  The results are the same as before.  The inlined code produces consistent output, and the output from the non-inlined code is random.

Roman

Attachment: test-reduction2.f90 (613 bytes)


Therefore the bug is consistent between a contained routine in a module and a contained routine in a procedure.

Your likely workaround is to pass the reduction variable(s) into the called subroutine:

module increment_mod
   contains 
   ! Increments whatever variable the caller passes in. When called from the
   ! parallel loop below, that is the calling thread's reduction copy of x.
   subroutine increment(x)
   implicit none
   integer x
      x = x + 1
   return
   end subroutine increment
end module increment_mod
   
!--------------------------------------------------   
   
program test_reduction
use increment_mod   ! needed so that increment from the module is visible here
implicit none
integer i, x
x = 0
!$OMP parallel do default(none), private(i), &
!$OMP&         reduction(+:x)
do i = 1, 1000
   call increment(x)
end do  ! i
!$OMP end parallel do
write(*,*) 'Expecting: x= 1000.  x=', x
stop
end program test_reduction

Jim Dempsey
