While working on my code, I found that using !DIR$ SIMD to vectorize an outer loop, together with the -O3 flag, produces incorrect results.
To track down the cause, I wrote a small example that reproduces the problem. The one difference is that the example produces incorrect results with -O2 as well as with -O3, which suggests that the problem is related to the vectorization process.
The loop part is shown below, and the full source file is attached.
subroutine loop_A
  use mod_A
  implicit none
  integer :: j,i, i_f,i_l, idx_1,idx_2

  do j = 1, nj
    i_f = NI(j-1) + 1
    i_l = NI(j)
!DIR$ SIMD
    do i = i_f, i_l
      idx_1 = i_stuff%i2idx(1,i)
      idx_2 = i_stuff%i2idx(2,i)
      a0(i) =                                   &
        ( 1.d0 - d0(i) )*A(idx_1) +             &
                 d0(i)  *A(idx_2) +             &
        dot_product(e0(1:3,i), AA(1:3,idx_1))
    enddo ! i loop
  enddo ! j loop

  ! check if the result is correct
  write(*,*) 'sum of all vars :',sum(a0)
end subroutine loop_A
What I have is a nested loop, and the one I want to vectorize is the second loop (the i loop), which starts on the 10th line above.
Interestingly, whenever the code gives incorrect results (that is, with -O2 or -O3 for the example above, and with -O3 for my original code), the vec-report shows "Preprocess Loopnests: Moving Out Store" at the outermost loop. For instance, the vec-report for the example above says:
LOOP BEGIN at test_ver2.f90(96,15) inlined into test_ver2.f90(107,9)
remark #25084: Preprocess Loopnests: Moving Out Store [ test_ver2.f90(85,10) ]
which refers to the 7th line above.
To sum up: in my original code, with -O2 the message "Preprocess Loopnests: Moving Out Store" does not show up and the code works fine, but with -O3 the message shows up and the code produces wrong results. In the example above, both -O2 and -O3 lead to the message "Preprocess Loopnests: Moving Out Store" in the vec-report and to incorrect results.
Any help with properly vectorizing the i loop (the second loop, starting on the 10th line) would be deeply appreciated.
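In case it helps anyone try variants of the loop, here is a minimal self-contained sketch of how the result could be checked against a scalar reference. The module name mod_A_sketch, the array sizes, and all the data values are hypothetical stand-ins for the attached mod_A, so the numbers it prints are not the ones from my real case. It only illustrates the comparison and is not a fix.

```fortran
! Sketch only: module name, sizes, and data are hypothetical stand-ins
! for the attached mod_A, chosen so that this file compiles on its own.
module mod_A_sketch
  implicit none
  integer, parameter :: dp = kind(1.d0)
  integer, parameter :: nj = 4, n_per_j = 25
  integer, parameter :: ni_tot = nj*n_per_j, na = 50
  type :: i_stuff_t
    integer :: i2idx(2, ni_tot)
  end type i_stuff_t
  type(i_stuff_t) :: i_stuff
  integer  :: NI(0:nj)
  real(dp) :: A(na), AA(3, na), d0(ni_tot), e0(3, ni_tot)
  real(dp) :: a0(ni_tot), a0_ref(ni_tot)
contains
  subroutine init()
    integer :: i, j, k
    do j = 0, nj
      NI(j) = j*n_per_j            ! cumulative count of i entries per j
    enddo
    do k = 1, na
      A(k) = real(k, dp)
      AA(1:3, k) = (/ 0.1d0*k, 0.2d0*k, 0.3d0*k /)
    enddo
    do i = 1, ni_tot
      i_stuff%i2idx(1, i) = mod(i,   na) + 1   ! always in 1..na
      i_stuff%i2idx(2, i) = mod(3*i, na) + 1   ! always in 1..na
      d0(i) = 0.25d0
      e0(1:3, i) = (/ 1.d0, 0.5d0, 0.25d0 /)
    enddo
  end subroutine init
end module mod_A_sketch

program check_loop_A
  use mod_A_sketch
  implicit none
  integer :: j, i, i_f, i_l, idx_1, idx_2

  call init()

  ! Scalar reference: same arithmetic, with vectorization disabled.
  ! (!DIR$ NOVECTOR is an Intel directive; other compilers see a comment.)
!DIR$ NOVECTOR
  do i = 1, ni_tot
    idx_1 = i_stuff%i2idx(1, i)
    idx_2 = i_stuff%i2idx(2, i)
    a0_ref(i) = (1.d0 - d0(i))*A(idx_1) + d0(i)*A(idx_2) + &
                dot_product(e0(1:3, i), AA(1:3, idx_1))
  enddo

  ! Loop nest under test, with the same directive as in the post.
  do j = 1, nj
    i_f = NI(j-1) + 1
    i_l = NI(j)
!DIR$ SIMD
    do i = i_f, i_l
      idx_1 = i_stuff%i2idx(1, i)
      idx_2 = i_stuff%i2idx(2, i)
      a0(i) = (1.d0 - d0(i))*A(idx_1) + d0(i)*A(idx_2) + &
              dot_product(e0(1:3, i), AA(1:3, idx_1))
    enddo ! i loop
  enddo ! j loop

  write(*,*) 'sum of all vars :', sum(a0)
  write(*,*) 'max abs diff    :', maxval(abs(a0 - a0_ref))
end program check_loop_A
```

Building this at -O2 and -O3 and looking at the "max abs diff" line should show whether a given variant of the second loop nest still disagrees with the scalar loop. Any change to the directive or to the loop body can be swapped into the second nest and checked the same way.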