Allocatable private array within openmp parallel do directive

Hi, I have code which I can scale down to look like this:

program main
  implicit none
  integer l
  integer, allocatable, dimension(:) :: array

  array = 0

  allocate(array(10))

  !$omp parallel do private(array)
  do l = 1, 10
    array(l) = l
    print *, array(l)
  enddo
  !$omp end parallel do

  print *, array

  deallocate(array)

end

The print statements are placeholders for the operations I actually perform; I just want to test whether I can access the array correctly. Depending on how I compile, I get different error messages:

ifort -openmp main.f90

           1
          10
           7
           5
*** glibc detected *** ./a.out: munmap_chunk(): invalid pointer: 0x           6
           2
*** glibc detected *** ./a.out: munmap_chunk(): invalid pointer: 0x00002b60c56c76b0 ***
           9
           8
*** glibc detected *** ./a.out: munmap_chunk(): invalid pointer: 0x00002b60d83ff6b0 ***
           3
           4
*** glibc detected *** ./a.out: munmap_chunk(): invalid pointer: 0x00002b60c52c66b0 ***
Aborted (core dumped)

or

ifort -openmp -check all main.f90

forrtl: severe (408): fort: (8): Attempt to fetch from allocatable variable ARRAY when it is not allocated

I searched a little and found these threads here in the forums [1, 2], which seem to deal with the same problem: an allocatable array declared private in an OpenMP directive. The conclusion there was that it is a compiler bug and that the fix is to update to at least 11.1.073 on Linux. I am using Linux with 11.1.073 and can't understand what causes the problem. Can anyone make a suggestion?

You are accessing the array before allocating it.  Move "array=0" after the allocate.

This is not to say there is no such problem (I've been fighting a superficially similar one for over a year).  But you must get rid of obvious bugs in your reproducer.

You're right, I'm sorry. I was changing the test code so much that I forgot about that part. Anyway, moving or deleting the 'initialization' doesn't help. If I compile this code:

program main
  implicit none
  integer l
  integer, allocatable, dimension(:) :: array

  allocate(array(10))
 
  array = 0

  !$omp parallel do private(array)
  do l = 1, 10
    array(l) = l
    print *, array(l)
  enddo
  !$omp end parallel do

!   print *, array

  deallocate(array)

end

I get this error log:

*** glibc detected *** ./a.out: munmap_chunk(): invalid pointer: 0x00007f4006ffa6f0 ***
======= Backtrace: =========
           1
*** glibc detected ***           10
           4
/opt/intel/cc/11.1/073/lib/intel64/libiomp5.so(__kmp_invoke_microtask           3
*** glibc detected *** ./a.out: munmap_chunk(): invalid pointer: 0x00007f40073fb6f0 ***
Aborted (core dumped)
springep@lara04 ~/Documents/openMPTest $ ifort -openmp main.f90
springep@lara04 ~/Documents/openMPTest $ ./a.out
*** glibc detected *** ./a.out: double free or corruption (out): 0x00007f0c5fbfd6f0 ***
           1
*** glibc detected *** ./a.out           2
           8
          10
           5
           9
/lib/x86_64-linux-gnu/libc.so.6(+0x7eb96)[0x7f0c66177b96]
*** glibc detected *** ./a.out: double free or corruption (out): 0x./a.out[0x4131ac]
*** glibc detected *** ./a.out: double free or corruption (out): 0xAborted (core dumped)

So I am wondering: is this still a compiler bug (which I thought was fixed in the version I use), or am I doing something wrong?

As a workaround, what happens if you add copyin(array) to the !$omp parallel do?
This may also require realloc_lhs to be in effect.

*** NOTE ***

You should be aware that your sample loop, as written, does no meaningful work.

Only the master thread's slice of the array would be updated in the copy of array that is visible after the parallel region. The work done by the other threads would be lost.

Jim Dempsey

www.quickthreadprogramming.com
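
To illustrate the note above, here is a minimal sketch (not taken from the thread) of the difference between a private scratch array and a shared result array. It assumes a compiler that implements OpenMP 3.0 or later, and it swaps in firstprivate(scratch) in place of the copyin(array) workaround; the names scratch and results are hypothetical.

```fortran
program private_vs_shared
  implicit none
  integer :: l
  integer, allocatable :: scratch(:), results(:)

  allocate(scratch(10), results(10))
  scratch = 0
  results = 0

  ! Assumption: firstprivate gives every thread its own allocated copy of
  ! scratch, initialised from the original; results is shared by all threads.
  !$omp parallel do firstprivate(scratch) shared(results)
  do l = 1, 10
    scratch(l) = l           ! lands in the executing thread's private copy only
    results(l) = scratch(l)  ! iteration l owns element l, so there is no race
  end do
  !$omp end parallel do

  print *, 'results:', results   ! shared array keeps every thread's work
  print *, 'scratch:', scratch   ! original copy, not updated by the threads
end program private_vs_shared
```

Built with OpenMP enabled (for example with ifort -openmp, as used above), results should contain 1 through 10, while the threads' writes to their private copies of scratch are not merged back into the original array. Anything that has to survive the parallel region therefore belongs in a shared array indexed by the loop variable.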

Using !$omp parallel do copyin(array) seems to work, even without realloc_lhs.

And yes, I know that the code I've provided is useless. It is a heavily stripped-down version of what I want to do and only includes the critical part, the private allocatables. What I really want to do is the following: calculate the solution of a differential equation with the Runge-Kutta method (m steps), do this for n initial values, and save the results to an array of size (m,n) that stores every solution. That last array would be shared, so that I can do whatever I want with it after the parallel region. In short, I want to parallelize an outer loop while the inner loop is processed by a single thread (because Runge-Kutta can't be parallelized). Simplified, my problem should look like this:

program main
  use omp_lib
  implicit none
 
  integer l, j, lMax, jMax
  integer, allocatable, dimension(:) :: array
  integer, allocatable, dimension(:, :) :: meanArray
 
  lMax = 2
  jMax = 5
 
  allocate(array(jMax), meanArray(jMax, lMax))
 
  array = 0
  meanArray = 0
 
  !$omp parallel copyin(array) shared(meanArray) private(l, j)
 
  !$omp do
  do l = 1, lMax
    !$omp critical
    do j = 1, jMax    ! This would be Runge Kutta
      array(l) = l * j    ! Placeholder for RungeKutta step
    end do
    meanArray(:, l) = array
    !$omp end critical
  enddo
  !$omp end do
 
  !$omp end parallel
 
  print *, meanArray
 
  deallocate(array, meanArray)

end

But that doesn't give me the correct result. meanArray should look like this: ((1,2,3,4,5),(2,4,6,8,10)), but it doesn't.

In the case I have been concerned about, the private copies of the dynamically allocated arrays are used for scratch computation within the parallel region. They are intended to be discarded at the end of the parallel region. In the traditional extended F77 form, an automatic array is used rather than the presumably safer allocatable array with an error check.

It is easily possible to exhaust the stack when several threads make private copies of arrays. Error checking in the allocate doesn't protect against the more likely problem upon entering the parallel region.

If there is a compiler-related problem, it may be with respect to correct implicit allocation of the private copies of the array.

On a current installation, I don't see the quoted munmap failures, even when leaving stack at default.  Still, that is not to say there is no problem.
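
The scratch-array pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the thread: each thread allocates its own scratch array inside the parallel region and checks the allocation status, and only a reduced value is kept in a shared array. The names work, total and the size n are hypothetical.

```fortran
program checked_private_scratch
  implicit none
  integer, parameter :: n = 1000   ! hypothetical scratch size
  integer :: l, ierr
  real, allocatable :: work(:)
  real :: total(8)

  total = 0.0

  !$omp parallel private(work, ierr)
  ! work is unallocated on entry; each thread allocates its own copy
  ! and can test the status instead of failing silently.
  allocate(work(n), stat=ierr)
  if (ierr /= 0) then
    print *, 'scratch allocation failed, stat =', ierr
    stop 1
  end if

  !$omp do
  do l = 1, 8
    work = real(l)         ! scratch computation, discarded afterwards
    total(l) = sum(work)   ! only the reduced value is kept (shared array)
  end do
  !$omp end do

  deallocate(work)
  !$omp end parallel

  print *, total
end program checked_private_scratch
```

Allocating inside the region keeps the explicit allocate, and with it the stat= check, under the program's control, rather than relying on the implicit allocation of private copies at region entry.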

TimP,

>>It is easily possible to exhaust stack when several threads make private copies of arrays

Only when the arrays are non-allocatable (F77 style, as you say). A private copy of an F90 allocatable array would only consume the stack space required for the array descriptor, probably not much more than 100 bytes (depending on the rank).

Jim Dempsey

Phillip,

In your latest post, I think the following would be better:

program main
  use omp_lib
  implicit none
  
  integer l, j, lMax, jMax
  integer, allocatable, dimension(:) :: array
  integer, allocatable, dimension(:, :) :: meanArray
  
  lMax = 2
  jMax = 5
  
  allocate(meanArray(jMax, lMax))
  
  meanArray = 0
  
  !$omp parallel copyin(array) shared(meanArray) private(array, i, j)
  
  allocate(array(jMax))
  array = 0
 
  !$omp do
  do l = 1, lMax
    do j = 1, jMax    ! This would be Runge Kutta
      array(l) = l * j    ! Placeholder for RungeKutta step
    end do
    meanArray(:, l) = array
  enddo
  !$omp end do
  deallocate(array)
  
  !$omp end parallel
  
  print *, meanArray
  
  deallocate(meanArray)
end
 

Jim Dempsey

Jim, if I take your code and compile it, I run into 'forrtl: severe (151): allocatable array is already allocated' errors.

Phil

Add private(array,i,j)

You tested before my last edit.

Jim

Jim, now it's working but the output is still not correct (in the sense that this is not what I want to do).

What indications do you have for incorrect output?

Note: typo on my part. I used private(array, i, j), that is, eye and jay.

I notice now that you are using l and j (ell and jay).

This font cannot show the difference between upper case eye (I) and lower case ell (l).

Correct the code so that the names are used consistently (both eye or both ell).
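
A consistent version might look like the sketch below. It is offered as an assumption, not as code posted in the thread: it uses ell throughout, drops copyin(array) because array is private and unallocated on entry to the region, and indexes array with j in the inner loop, on the assumption that this is what the expected output ((1,2,3,4,5),(2,4,6,8,10)) requires.

```fortran
program main
  implicit none

  integer :: l, j, lMax, jMax
  integer, allocatable, dimension(:) :: array
  integer, allocatable, dimension(:, :) :: meanArray

  lMax = 2
  jMax = 5

  allocate(meanArray(jMax, lMax))
  meanArray = 0

  ! array is private and unallocated on entry, so no copyin is used here.
  !$omp parallel shared(meanArray, lMax, jMax) private(array, l, j)

  allocate(array(jMax))     ! one scratch array per thread
  array = 0

  !$omp do
  do l = 1, lMax
    do j = 1, jMax          ! sequential inner loop (Runge-Kutta placeholder)
      array(j) = l * j      ! assumption: index j, not l, was intended
    end do
    meanArray(:, l) = array ! column l is written by exactly one thread
  end do
  !$omp end do

  deallocate(array)
  !$omp end parallel

  print *, meanArray
  deallocate(meanArray)
end program main
```

With these changes, print *, meanArray should list 1 2 3 4 5 2 4 6 8 10, since Fortran prints arrays in column-major order. No critical section is needed, because each iteration of the outer loop writes to its own column of meanArray.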

Jim Dempsey
