It is well known that FP arithmetic can be tricky in a multi-threaded environment. However, I am getting strange results depending on whether or not I use -O2 optimization, even though the code itself is serial. In the enclosed sample code, four complex arrays, z1, z2, z3, and z4, are initialized with random numbers so that their value ranges are pairwise at different scales. Then the operation z1*z2 + z3*z4 is performed elementwise and the result is stored in an array z. Finally, sum(z) is computed. If I compile with the -O2 flag, the result depends on whether the operands are in the above order or swapped, as in z2*z1 + z4*z3 (the order is selected at runtime by a command line argument). If I compile with -g, or with -fp-model precise, the two results are identical. Can anyone explain why this could be happening?
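
For readers without the attachment, the test described above can be sketched roughly as follows. This is a minimal C++ sketch and not the enclosed code: the helper names random_array and sum_of_products, the scales, and the seeds are all assumptions made for illustration.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <random>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical helper: n pseudo-random complex numbers whose real and
// imaginary parts are drawn uniformly from [-scale, scale).
std::vector<cplx> random_array(std::size_t n, double scale, unsigned seed) {
    std::mt19937_64 gen(seed);
    std::uniform_real_distribution<double> dist(-scale, scale);
    std::vector<cplx> v(n);
    for (auto& x : v) {
        const double re = dist(gen);
        const double im = dist(gen);
        x = cplx(re, im);
    }
    return v;
}

// Hypothetical helper: builds z elementwise and returns sum(z).
// swapped == false: z = z1*z2 + z3*z4
// swapped == true:  z = z2*z1 + z4*z3
cplx sum_of_products(const std::vector<cplx>& z1, const std::vector<cplx>& z2,
                     const std::vector<cplx>& z3, const std::vector<cplx>& z4,
                     bool swapped) {
    const std::size_t n = z1.size();
    assert(z2.size() == n && z3.size() == n && z4.size() == n);

    std::vector<cplx> z(n);
    for (std::size_t i = 0; i < n; ++i) {
        z[i] = swapped ? z2[i] * z1[i] + z4[i] * z3[i]
                       : z1[i] * z2[i] + z3[i] * z4[i];
    }

    // Plain left-to-right accumulation, standing in for sum(z).
    cplx total(0.0, 0.0);
    for (const auto& x : z) {
        total += x;
    }
    return total;
}
```

A driver would fill the four arrays with different scales (for example 1.0, 1e-3, 1e3, and 1e6, which are assumed values), read the operand order from a command line argument, and print the sum with full precision. Building that driver once with -O2 and once with -fp-model precise and comparing the two orderings reproduces the comparison I describe.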