I have a finite element code in which I am trying to parallelize a subroutine. The results of the parallel and sequential computations differ by around 1E-7. I read in the Stack Overflow post linked below that floating-point addition is not associative (the result depends on the order in which terms are grouped), so one should not expect bit-identical results from multithreaded code. How large can this type of error become? After several thousand time steps, would a difference of 1E-7 be expected?
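To make the effect concrete, here is a minimal sketch in Python (not the finite element code itself). It shows that regrouping a floating-point sum changes the rounded result, and it compares a plain sequential sum with a chunked sum that mimics how a parallel reduction combines per-thread partial sums. The function names `sequential_sum` and `chunked_sum`, the chunk count of 4, and the random test data are all assumptions made for illustration.

```python
import math
import random

# Floating-point addition is commutative (a + b == b + a) but not
# associative: regrouping the same three terms changes the rounding.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)  # the two groupings differ in IEEE 754 double precision

def sequential_sum(values):
    """Add the values one by one, in order, as a serial loop would."""
    total = 0.0
    for v in values:
        total += v
    return total

def chunked_sum(values, n_chunks):
    """Mimic a parallel reduction: one partial sum per chunk, then combine."""
    size = math.ceil(len(values) / n_chunks)
    partials = [sequential_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return sequential_sum(partials)

# Hypothetical test data: 100,000 values of mixed sign.
random.seed(0)
values = [random.uniform(-1.0, 1.0) for _ in range(100_000)]

seq = sequential_sum(values)
par = chunked_sum(values, 4)      # 4 chunks stands in for 4 threads
exact = math.fsum(values)         # correctly rounded reference sum

print("sequential - chunked:", seq - par)
print("sequential - exact:  ", seq - exact)
print("chunked    - exact:  ", par - exact)
```

Neither ordering is the "right" one: both differ from the correctly rounded sum returned by `math.fsum`. For a single reduction the discrepancy sits near machine precision, so whether 1E-7 after several thousand time steps is plausible depends on how strongly the time-stepping scheme amplifies these small differences.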
Another issue we have considered is whether different threads compute with the same precision. Are all threads on a given machine guaranteed to use the same floating-point precision? Or could differing precision be contributing to the difference in the results?
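One way to test the precision question empirically is to have several threads perform the same operations in the same order and compare the results bit for bit. The sketch below does this in Python, where a `float` is an IEEE 754 double on common platforms. The helper names `ordered_sum` and `bits` and the harmonic-series test data are assumptions for illustration. It only demonstrates the idea; it says nothing about what a particular compiler, set of optimization flags, or per-thread floating-point settings would do in a compiled code.

```python
import struct
from concurrent.futures import ThreadPoolExecutor

def ordered_sum(values):
    """Add the values in a fixed order, so every caller does identical work."""
    total = 0.0
    for v in values:
        total += v
    return total

def bits(x):
    """Return the raw 8-byte representation of a double for exact comparison."""
    return struct.pack("<d", x)

# Hypothetical test data: the first 10,000 terms of the harmonic series.
values = [1.0 / (k + 1) for k in range(10_000)]

# Run the identical computation on four worker threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(ordered_sum, [values] * 4))

# If the threads used different precision, the bit patterns would differ.
same_bits = all(bits(r) == bits(results[0]) for r in results)
print("all threads bit-identical:", same_bits)
```

If a check like this passes in the real code when each thread repeats the same ordered computation, then the remaining difference between the parallel and sequential runs comes from the changed order of operations rather than from differing precision.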
Thank you for any information you can provide.