Strange behaviour of private variable with OpenMP

Dear Intel users,

I am seeing strange behaviour with a PRIVATE variable. I'm using Intel ifort 14.01.

My program is long, so I am posting a snippet that summarizes the situation. In the main program I have a global integer variable, a parallel region that sets that variable, and a subroutine, always called from inside the parallel region, that uses it. If I declare the variable PRIVATE, the code dies. If I pass the same variable to the subroutine as an argument, the code works, although the results are not the same as those of the serial code. This is the snippet that fails:

PROGRAM my_program

integer samplmin_newscal
integer samplmax, sampl0, deltaT
REAL(stnd), DIMENSION(:), allocatable :: t0

samplmax = some_value
sampl0 = some_value
deltaT = some_value

...allocate and set t0

!$OMP PARALLEL DO PRIVATE (i, samplmin_newscal)
DO i

  samplmin_newscal = some_value
  call my_subroutine()

END DO


contains

subroutine my_subroutine()

real t0_tmp

! samplmin_newscal is reached here through host association, not as an argument
DO ij = samplmax, samplmin_newscal, -1

  t0_tmp = t0(ij + sampl0)+deltaT  ! here the code dies with
                                   ! Subscript #1 of the array T0 has value 0 which
                                   ! is less than the lower bound of 1

END DO

end subroutine my_subroutine

END program my_program

 

This is the snippet that works well:

PROGRAM my_program

integer samplmin_newscal
integer samplmax, sampl0, deltaT
REAL(stnd), DIMENSION(:), allocatable :: t0

samplmax = some_value
sampl0 = some_value
deltaT = some_value

...allocate and set t0

!$OMP PARALLEL DO PRIVATE (i, samplmin_newscal)
DO i

  samplmin_newscal = some_value
  call my_subroutine(samplmin_newscal)

END DO


contains

subroutine my_subroutine(samplmin_newscal_local)

integer samplmin_newscal_local
real t0_tmp

! the lower bound now arrives as a dummy argument
DO ij = samplmax, samplmin_newscal_local, -1

  t0_tmp = t0(ij + sampl0)+deltaT

END DO

end subroutine my_subroutine

END program my_program

 

 

Now, I know that if a variable is defined inside a module it must be declared THREADPRIVATE, but that is not the case here. Why does the PRIVATE declaration seem to fail?

 

Thanks in advance.


Your failing example looks problematic to me, as it seems to depend on the compiler in-lining the internal procedure correctly with specialization into each parallel or non-parallel region which may call it.  The ability which ifort always had and became mandated in f2003 to allow external calls to internal procedure also seems to complicate the situation.  I don't think the OpenMP standard gives sufficient attention to such Fortran 90 and later features.

If you want the situation examined by Intel's OpenMP experts, you may need to file a case on premier.intel.com.  I don't think OpenMP specialists watch this forum regularly.

I do not think the issue relates to private or not. To confirm this, configure the program such that it fails in the manner you describe. Run it to verify it fails as you describe. Then make one edit to the program:

call my_subroutine((samplmin_newscal)) ! add extra ()'s around argument

What the above does is make a stack local copy at the time of the call.

If the program still fails, then the issue is not PRIVATE versus non-PRIVATE; it may be something else entirely.

The subscript 0 is often indicative of:

a) the array in the calling program being 0-based, but the DUMMY being used in the subroutine being 1-based

b) the array in the calling program is 1-based, and you have a bug that produces a 0 index

Jim Dempsey

 

 

www.quickthreadprogramming.com

Hi , thanks for the reply. An update:

If I print samplmin_newscal before setting it, it prints "-858993460", an uninitialized value. Printing the same variable inside the subroutine, i.e. after it has been set to 1 (the first value), still prints "-858993460"! It seems that the assignment samplmin_newscal = 1 before the subroutine call has no effect.

Quote:

jimdempseyatthecove wrote:

I do not think the issue relates to private or not. To confirm this, configure the program such that it fails in the manner you describe. Run it to verify it fails as you describe. Then make one edit to the program:

call my_subroutine((samplmin_newscal)) ! add extra ()'s around argument

What the above does is make a stack local copy at the time of the call.

If the program still fails, then the issue is not PRIVATE versus non-PRIVATE; it may be something else entirely.

 

Hi Jim, I tried your suggestion and the program fails. But I don't understand why this would not be a problem with PRIVATE. The serial version works well. The hypothetical array-index bug you suggest should also appear in that version.

One more detail: I'm running with just 1 thread.

I notice you are not using IMPLICIT NONE. Perhaps you have a spelling error that results in the use of an undefined variable?

Jim Dempsey

www.quickthreadprogramming.com

The IMPLICIT NONE statement is used; I left it out because this is just a code snippet.

With an older compiler, ifort 11.1, samplmin_newscal is 1 before the routine and 0 inside the routine, instead of a negative value. The point is that it should be 1, and I don't understand why it is not. This is why the loop tries to use index 0 of the array. There is no apparent reason for this behaviour.

So the code you posted is not a snip of your actual code, rather it is a paraphrase of your actual code.

Normally one expects IMPLICIT NONE to catch typographical errors of variable names used in the subroutines and functions.

In the case of a CONTAINS subroutine or function, it will not catch a typographical error or programming oversight that produces a mistyped or misused variable name matching a name declared in the module (or module hierarchy) or in the scope of the containing procedure. One possible source of error is your passed argument, represented by "samplmin_newscal_local". Because this is a contained routine, omitting the "_local" on a statement inside it would give that statement access to the "samplmin_newscal" outside the scope of the subroutine.
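The point about contained routines can be illustrated with a minimal sketch. All names here (host_association_typo, show, n, n_local) are hypothetical and not taken from the original program; the sketch only assumes a standard Fortran compiler.

```fortran
program host_association_typo
  implicit none
  integer :: n              ! host variable, visible inside contained procedures

  n = 5
  call show(10)

contains

  subroutine show(n_local)
    integer, intent(in) :: n_local

    ! The intended name is n_local. Writing "n" by mistake still compiles
    ! under IMPLICIT NONE, because "n" resolves to the host's variable.
    print *, 'dummy argument n_local =', n_local
    print *, 'host variable  n       =', n

    ! Self-check: the two names really do refer to different objects.
    if (n_local /= 10 .or. n /= 5) error stop 'unexpected values'
  end subroutine show

end program host_association_typo
```

Pasting show into a separate source file, outside the CONTAINS, would turn the reference to n into a compile-time error, which is the check suggested further down in this post.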

I suggest that starting from the statement causing the error, start looking at the components that produced the 0, see why, then search outwards (and possibly up the call list) to find out why the components generated the 0.

Note, if you used the extra () trick you will have eliminated other threads from stomping on your passed argument (since it will be stack local to the calling thread).

Consider adding diagnostics to your code:

!$OMP PARALLEL DO PRIVATE (i, samplmin_newscal)
DO i

  samplmin_newscal = some_value
  if(samplmin_newscal .le. 0) then
   ! omp_get_thread_num needs USE omp_lib; LOC is a compiler extension;
   ! sleepqq (milliseconds) comes from USE ifport
   print *, omp_get_thread_num(), samplmin_newscal, LOC(samplmin_newscal)
   call sleepqq(10000)
   stop
  endif
  call my_subroutine(samplmin_newscal)

END DO


contains

subroutine my_subroutine(samplmin_newscal_local)

integer samplmin_newscal_local
real t0_tmp

  ! same check on the dummy argument, to compare value and address with the caller
  if(samplmin_newscal_local .le. 0) then
   print *, omp_get_thread_num(), samplmin_newscal_local, LOC(samplmin_newscal_local)
   call sleepqq(10000)
   stop
  endif
DO  ij  = samplmax, samplmin_newscal_local, -1

 t0_tmp = t0(ij + sampl0)+deltaT

END DO

If the locations match then you have an issue of sharing, brought on by possible typo or compiler bug.

Also, if the above does not shed light on the problem add a similar check for (ij+sampl0).

To help check for typos, copy the entire subroutine, and paste it into a new source, with new subroutine name

subroutine my_subroutineX(...

That technique will expose variables used from the outer scope of the contained procedure. The resulting list of errors, though not necessarily errors for the contained routine, can then be used to make sure you are not unintentionally using a variable from the wrong scope.

Jim Dempsey

www.quickthreadprogramming.com

I tried what you suggested. The locations do not match. However, in the version that does not pass samplmin_newscal to the subroutine and uses the PRIVATE variable directly, I noticed that the locations differ as well. I would expect them to be exactly the same in that case.

Address of samplmin_newscal from PARALLEL region: 140735202165760

Address in subroutine: 140735202169092

This explains why samplmin_newscal has an uninitialized value inside the subroutine. Is it a compiler bug? I tried two different compiler versions, 11 and 14, and the problem is present in both.

 

Does your CALL code inside the parallel region still have the extra ()'s?

If so, then they would be different.

If not, then does the interface for the subroutine use attribute VALUE on the dummy argument samplmin_newscal_local?

If so then the addresses will be different. Note the newer compilers now pass VALUE not as value but as REFERENCE to a temporary copy (same effect as adding the extra ()'s).

How about this.

In the debugger, place a break point on the CALL statement. At break point, Freeze all the other threads, observe what samplmin_newscal is (value and address). Use Step Into (note other threads must be frozen). When you get inside, to where body statements are being executed, what is (are) the value(s) and locations?

Jim Dempsey

www.quickthreadprogramming.com

Hi Jim,

Using TotalView with just one thread, the variable samplmin_newscal has value 1 and address 140736773265360 before the subroutine call.

Inside the subroutine it has value -858993460 and address 140736773292480, so the address is different. Remember that samplmin_newscal is PRIVATE.

For some reason that variable disappears inside the subroutine :(

-858993460 decimal is FFFFFFFFCCCCCCCC hexadecimal. In a Debug build, uninitialized (integer) variables get loaded with CCCC's.

The FFFFFFFFCCCCCCCC is somewhat odd, except that it is indicative of copying an uninitialized INTEGER(4) into an INTEGER(8).

Therefore, if the PRIVATE(samplmin_newscal) copy is (or becomes) INTEGER(8), while the samplmin_newscal outside the parallel region is INTEGER(4) and uninitialized, then you would see this behavior. I doubt this is the case, as a lot of code would break if the type were not maintained.

An alternative is passing an uninitialized INTEGER(4) argument to a subroutine dummy declared as INTEGER(8) (have you enabled argument checking?).

Jim Dempsey

www.quickthreadprogramming.com

samplmin_newscal is INTEGER(4). Type checking is enabled and reports no errors.

I also printed sizeof(samplmin_newscal), and the size is 4 both outside and inside the subroutine.

I'm surprised that no one has addressed the question I raised whether it's feasible to call an internal procedure defined outside a parallel region from inside a parallel region.  As this wasn't even "legal" until f2003, I'm not expecting to find it addressed in OpenMP standard.

ifort has practically no OpenMP syntax check warning by default.  Ability to get through compile and link means little.  You need Inspector to get any facilities of that kind.  This might be an interesting test for Inspector. 

Oracle Fortran has some checking which doesn't depend on tools options, but the first thing that will happen in questionable cases is that it will silently drop parallelization, so that such issues don't bite you.

>>I wrote also sizeof(samplemin_newscal) and the size is 4

Sorry, my error with the programmer calculator, your variable contains CCCCCCCC == uninitialized.
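The arithmetic behind this correction can be checked with a short sketch. The program and variable names are hypothetical; the sketch assumes a compiler that provides the int32 and int64 kinds and whose Z edit descriptor prints the two's-complement bit pattern of a negative integer.

```fortran
program cc_fill_pattern
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none
  ! The unsigned value of the bit pattern CCCCCCCC is 3435973836.
  integer(int64), parameter :: pattern = int(Z'CCCCCCCC', int64)
  integer(int32) :: v32
  integer(int64) :: v64

  v32 = -858993460_int32     ! the value reported in the thread
  v64 = int(v32, int64)      ! widening to 64 bits sign-extends the value

  ! Read as a signed 32-bit integer, the pattern is its unsigned value minus 2**32.
  if (pattern - 2_int64**32 /= v64) error stop 'pattern mismatch'

  ! Z editing shows the bit patterns in hexadecimal.
  print '(A, Z8.8)',   'as INTEGER(4): ', v32
  print '(A, Z16.16)', 'as INTEGER(8): ', v64
end program cc_fill_pattern
```

The leading F's therefore appear only when the 32-bit value is widened to 64 bits, for instance by a calculator set to 64-bit mode; the 4-byte variable itself holds just the CCCCCCCC fill.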

I, at least, assumed that your original sketch code was paraphrased. Your sample had:

!$OMP PARALLEL DO PRIVATE (i, samplmin_newscal)
DO i

My assumption was that this was

!$OMP PARALLEL DO PRIVATE (i, samplmin_newscal)
DO i = someBegin, someEnd

Which is which?

Jim Dempsey

 

www.quickthreadprogramming.com

I used the second one.

Quote:

Tim Prince wrote:

Your failing example looks problematic to me, as it seems to depend on the compiler in-lining the internal procedure correctly with specialization into each parallel or non-parallel region which may call it.  The ability which ifort always had and became mandated in f2003 to allow external calls to internal procedure also seems to complicate the situation.  I don't think the OpenMP standard gives sufficient attention to such Fortran 90 and later features.

If you want the situation examined by Intel's OpenMP experts, you may need to file a case on premier.intel.com.  I don't think OpenMP specialists watch this forum regularly.

Hi Tim, I suspect this is the problem. What do you suggest? Many people suggest passing everything as arguments, but I don't understand whether I have to pass only PRIVATE variables or SHARED ones as well. Can I solve this by moving the internal subroutine from the main program to a module?

And what about THREADPRIVATE variables? Must they be passed as arguments as well?
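For reference, here is a minimal sketch of the pass-it-as-an-argument variant discussed in this thread. It is not the original program: the names lower, total, count_down, and n are hypothetical, with lower standing in for samplmin_newscal. It assumes an OpenMP-capable Fortran compiler, and it also builds serially because the directives are then treated as comments. It does not settle how a host-associated PRIVATE variable behaves inside an internal procedure; it only shows the form that was reported to work.

```fortran
program private_as_argument
  implicit none
  integer, parameter :: n = 8
  integer :: i, lower          ! "lower" plays the role of samplmin_newscal
  integer :: total(n)          ! shared; each iteration writes its own element

  !$omp parallel do private(i, lower)
  do i = 1, n
    lower = i
    ! The private copy is handed over explicitly as an actual argument.
    call count_down(lower, total(i))
  end do
  !$omp end parallel do

  ! A loop from n down to i runs n - i + 1 times.
  do i = 1, n
    if (total(i) /= n - i + 1) error stop 'unexpected trip count'
  end do
  print *, 'trip counts:', total

contains

  subroutine count_down(lower_local, trips)
    integer, intent(in)  :: lower_local
    integer, intent(out) :: trips
    integer :: ij              ! declared locally rather than taken from the host

    trips = 0
    do ij = n, lower_local, -1
      trips = trips + 1
    end do
  end subroutine count_down

end program private_as_argument
```

Inside count_down, everything that varies per iteration arrives through the argument list or is a local variable; only the named constant n comes from the host scope.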
