The following code illustrates a questionable(?) data race reported by Intel Inspector XE 2011 (build 206270) running under Ubuntu Linux 10.04 x86_64.
program test
  use omp_lib
  implicit none

  real, dimension(:), allocatable :: Da,DaLocal
  real :: di
  integer, parameter :: n=100, nset=16
  integer :: i,iset,imin,imax

  allocate(Da(n))

  ! ***
  ! Standard OpenMP Do-loop
  ! ***
  !$omp parallel do default(none) private(di) shared(i,Da)
  do i=1,size(Da)
    di = real(i)
    Da(i) = di
  end do
  !$omp end parallel do

  ! ***
  ! Blocked OpenMP Do-loop
  ! ***
  !$omp parallel default(none) private(DaLocal,di,i,imax) shared(Da)
  allocate(DaLocal(nset))
  !$omp do schedule(static,1)
  do iset = 1, size(Da), nset
    ! Compute upper index of current block
    imax = min(size(Da), iset-1+nset)
    ! Loop over all 1,2,3... items in current set
    do i = 1, imax-iset+1
      ! Perform computation
      di = real(i+iset-1)
      DaLocal(i) = di
    end do
    ! Copy local data into global array
    do i = 1, imax-iset+1
      Da(i+iset-1) = DaLocal(i)
    end do
  end do
  !$omp end do
  deallocate(DaLocal)
  !$omp end parallel

  deallocate(Da)
end program test
Running the Intel Inspector XE analysis "Locate Deadlocks and Data Races" with the option "Detect data races on stack accesses" enabled and "Scope" set to "Extremely thorough" reports:
1. Cross-thread stack access in line 34
2. Data race in line 34
First of all, I believe the code is correct. Is it? I understand that the standard Do-loop is preferable in this case, but the code is extracted from a larger application in which each thread needs to allocate a set of private working arrays and private derived types to perform more complicated computations before it can copy the local results into the global array.
If the code is correct, is the data race reported by Intel Inspector XE a real "problem" that can be solved by a better implementation, or is it an "internal problem" of the Inspector tool?
Any help will be greatly appreciated.