OpenMP reduction problem



I am using Intel(R) Visual Fortran Compiler XE

Attached is a very simple program that sums up 1000 values.  If I do not use OpenMP, I always get the expected output of x= 1000.

If I compile with OpenMP (/Qopenmp) in Debug mode, the output is unpredictable.  Sometimes it is x=1000, but many times it is completely different.

In Release mode, I get unpredictable output only if I disable inlining (/Ob0).

Does anyone know if I am doing something wrong?  The only unusual thing is that the reduction variable x is incremented inside a contained subroutine.  The program works if x is incremented inside the main loop.


Attachment: test-reduction.f90 (425 bytes)

>>... The only unusual thing is that the reduction variable x is incremented inside a contained subroutine...

There is another issue: when OpenMP is used, x is a global variable, so every OpenMP thread sees it and can change its value. If no synchronization is applied, then the result is unpredictable (which is what you are seeing).

Yes, as Sergey hinted, the reduction clause applies only to the per-thread copies of x used inside the parallel region, not to the x outside that region, which is the one visible to the internal subroutine and is shared by all threads.
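The attached program is not reproduced in this thread, so the following is only a minimal sketch of the pattern being described, assuming the original looks roughly like this. The program name test_reduction_sketch and the exact layout are hypothetical; the point is only that increment() reaches x by host association rather than through an argument.

```fortran
program test_reduction_sketch
   implicit none
   integer :: i, x

   x = 0
   !$OMP parallel do default(none) private(i) reduction(+:x)
   do i = 1, 1000
      ! No argument is passed: the subroutine reaches x by host association.
      call increment()
   end do
   !$OMP end parallel do
   write(*,*) 'Expecting: x= 1000.  x=', x

contains

   subroutine increment()
      ! This x is the host program's shared x, not the per-thread
      ! reduction copy, so the threads update it without synchronization.
      x = x + 1
   end subroutine increment

end program test_reduction_sketch
```

Under that assumption, the updates inside increment() race with each other, which would be consistent with the varying results reported when the routine is not inlined.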


If you want increment() to use the x in the context of the reduction variable x inside the parallel region, then pass x as an argument to increment(x). This will pass the reference to the surrogate for x.

TimP (or more likely Steve if he reads this)

Consider the case in this program where the contained routine is inlined into the context of the OpenMP section. Which x would (should) be used then?
Jim Dempsey

Looks like another case for Nick Maclaren's misgivings about OpenMP.

In-lining optimization even without OpenMP already raised questions about data locality.  If in-lining affects this, it's likely to raise some bugs, as well as exposing others.

Thanks for the answers, it all makes sense now!  The part that initially confused me was that when I compiled in Release mode with inlining enabled, I always got the expected output.




So your tests show that when the routine is not inlined, x refers to the global (host) variable, as expected, and when it is inlined, x refers to whatever x is in scope at the point of inlining.

My position is that this is a bug. The code should behave according to the context of the source, not the context in which inlining occurs.

Can you run a test where x and the contained routine are inside a module (and where the contained routine is inlined)? This test is slightly different from the case where x and the contained routine are in the scope of "PROGRAM".

Jim Dempsey


I did what you suggested.  The new code is attached.  The results are the same as before.  The inlined code produces consistent output, and the output from the non-inlined code is random.



Attachment: test-reduction2.f90 (613 bytes)

Therefore the bug is consistent between a contained routine in a module and a contained routine in a program.

Your likely workaround is to pass the reduction variable(s) into the called subroutine:

module increment_mod
   implicit none
contains
   subroutine increment(x)
      ! x is the caller's variable; inside the parallel loop that is
      ! the thread's private reduction copy.
      integer, intent(inout) :: x
      x = x + 1
   end subroutine increment
end module increment_mod

program test_reduction
use increment_mod
implicit none
integer i, x
x = 0
!$OMP parallel do default(none), private(i), &
!$OMP&         reduction(+:x)
do i = 1, 1000
   call increment(x)
end do  ! i
!$OMP end parallel do
write(*,*) 'Expecting: x= 1000.  x=', x
end program test_reduction

Jim Dempsey
