I have spent some time debugging a code that is threaded with OpenMP and compiled with ifort version 12.0.4. I tried a couple of debugging tools (valgrind, Intel Inspector, ...) but could not spot the problem.
I then tried another version of ifort (12.1.3) and the problem disappeared, which was quite strange.
After looking at the vectorization reports, I realized that there was a loop which was vectorized by the older version 12.0.4 but not by the newer version 12.1.3. I then added a NOVECTOR directive (pragma) to that loop and recompiled the source code with ifort 12.0.4, and the problem went away. Hence, preventing the compiler from vectorizing this loop seems to fix my bug.
I also tried to get the newer version 12.1.3 to vectorize the loop by adding an IVDEP directive, and the compiler reported the loop as vectorized. The resulting code also works fine.
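For illustration, the directive placement described above might look like the following minimal sketch. The program name, arrays, loop body, and problem size are hypothetical and are not the actual code. The sketch also assumes the OpenMP parallelism is on an outer loop while the directive targets an inner loop. The `!DIR$` prefix is Intel-specific (older sources often use `!DEC$`), and other compilers treat these lines as plain comments.

```fortran
program directive_sketch
  implicit none
  integer, parameter :: n = 512      ! hypothetical problem size
  real :: a(n, n), b(n, n)           ! hypothetical arrays
  integer :: i, j

  b = 1.0

  ! Outer loop is shared among OpenMP threads (requires OpenMP to be enabled
  ! at compile time; otherwise the sentinel lines are ignored as comments).
  !$omp parallel do private(i)
  do j = 1, n
     ! Intel-specific directive: ask the compiler not to vectorize the
     ! loop that immediately follows.
     !DIR$ NOVECTOR
     do i = 1, n
        a(i, j) = 2.0 * b(i, j)
     end do
  end do
  !$omp end parallel do

  print *, sum(a)
end program directive_sketch
```

Replacing `!DIR$ NOVECTOR` with `!DIR$ IVDEP` gives the second variant. Note that IVDEP does not strictly force vectorization: it tells the compiler to ignore assumed (unproven) dependences in the loop, which is often enough for the vectorizer to proceed.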
So my question is: do you see any relationship between the OpenMP parallelization and the vectorization that could lead to the bug I have experienced?