Setting /Qopenmp changes the way the compiler allocates storage for unsaved local variables. If you haven't defined those variables before you first reference them, you will see changes in program behaviour. The change in storage allocation (static to stack-based storage) could quite conceivably introduce "random" behaviour into applications. Consequently, when faced with your symptoms, I'd be looking really hard (including /check:all runs, etc.) for such a programming error. If I didn't find any, I'd look again.

# OpenMP and floating point operations

Thank you for your answer. All storage is done in a large one-dimensional COMMON block. I solve the sparse matrix with a direct method (Cholesky). The problem (CFD) is non-linear, so after solving the matrix the coefficients have to be updated and the matrix has to be solved again and again. When this is done about 100 times with not-too-large matrices, everything is OK. But with large matrices and several hundred iterations, the result drifts away.

I really think it is a problem of numerical accuracy, but I have no idea what to do.

You have to realise that OpenMP makes your program "non-deterministic", that is: the iterations are assigned to the various threads in a manner that you cannot predict, and two runs of the program will probably result in different assignments. A second aspect is that floating-point operations are not associative, which, in combination with the non-determinism of OpenMP, can cause seemingly random differences from a non-OpenMP version of your program.
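As a minimal illustration of the non-associativity point (the values here are chosen for demonstration, not taken from your program), summing the same numbers in a different grouping can change the result in IEEE double precision:

```fortran
program assoc
  implicit none
  real(kind=8) :: a, b, c
  a = 1.0d0
  b = 1.0d-16
  c = 1.0d-16
  ! b and c are each too small to change a after rounding,
  ! but their sum (2.0d-16) is large enough to do so.
  print *, (a + b) + c   ! rounds to 1.0
  print *, a + (b + c)   ! rounds to 1.0000000000000002
end program assoc
```

This is exactly the kind of difference a reordered parallel reduction produces.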

However, in ordinary situations these differences should not be too large. So if they are, you may have a numerically unstable problem at hand or you have some programming error or the problem requires higher (double?) precision.
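If higher precision does turn out to be necessary, one portable way to request it (a sketch only; whether a quad-precision kind exists depends on the compiler and platform) is to select the real kind by the number of significant digits you need:

```fortran
program kinds
  implicit none
  ! Ask for at least 15 and at least 30 significant decimal digits.
  integer, parameter :: dp = selected_real_kind(15)
  integer, parameter :: qp = selected_real_kind(30)
  real(dp) :: x
  real(qp) :: y
  ! The _dp / _qp suffixes make the literals themselves full precision.
  x = 0.7071063969767235_dp
  y = 0.7071063969767235057872767_qp
  print *, precision(x), precision(y)
end program kinds
```

If `selected_real_kind(30)` returns -1, the compiler offers no such kind and the declaration will be rejected.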

Regards,

Arjen

How can non-deterministic behaviour occur when I use only one thread?

I use REAL*8 variables. The mathematical problem is stable and the code has run perfectly on scalar machines for 20 years.

I wonder why the difficulties are present even when running one thread only. What could that be?

Can you be sure - even with one thread - that the iterations are run in the _same_ order?

Can you be sure that all variables within the OpenMP loops have the correct private or shared attribute?

Without looking at the code, these are the sorts of things I can come up with.

Oh, have you verified that the loops are running in one thread? I had to solve a problem with one of our programs which turned out to be due to an OpenMP-enabled loop running with the maximum number of threads instead of a single one, despite our having used OMP_SET_NUM_THREADS(1). See my message on comp.lang.fortran from yesterday. A bizarre problem, and I simply do not understand how it could occur in the first place.

Regards,

Arjen

Among the usual checks in such situations of numerical inconsistency is to make smaller changes in compile options. Declaring your procedures RECURSIVE, or setting an option such as /Qauto (stack storage of local arrays), would help check whether you have an error where the result accidentally depends on uninitialized data, but that should not be your only check.

If you have a variable in a COMMON block and also pass it in via subroutine arguments, that's an error which could surface in such situations. COMMON usage can pose difficulties with OpenMP; where private variables are required, you may need THREADPRIVATE copies of the COMMON block, often together with DEFAULT(NONE) to make the compiler flag unintended sharing defaults.
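A sketch of that pattern (the block and variable names here are made up for illustration): a COMMON block holding per-iteration scratch data can be made THREADPRIVATE so each thread works on its own copy, and DEFAULT(NONE) forces every variable's sharing attribute to be stated explicitly:

```fortran
      SUBROUTINE UPDATE(N, X)
      IMPLICIT NONE
      INTEGER N, I
      REAL*8 X(N), SCRATCH(100)
      COMMON /WORK/ SCRATCH
!$OMP THREADPRIVATE(/WORK/)
C     Each thread gets its own copy of /WORK/; without THREADPRIVATE
C     all threads would share one copy and overwrite each other.
!$OMP PARALLEL DO DEFAULT(NONE) SHARED(N, X) PRIVATE(I)
      DO I = 1, N
         SCRATCH(1) = X(I)
         X(I) = 2.0D0 * SCRATCH(1)
      END DO
!$OMP END PARALLEL DO
      END
```

With DEFAULT(NONE), any variable left out of the SHARED/PRIVATE lists becomes a compile-time error instead of silently getting a default.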

Thank you all for your advice. I will investigate ......

Thanks and regards,

Michael

Hello,

I found out what the problem could be.

Here is an example piece of code:

```fortran
Real (KIND=8) :: A, VEX,VEY

A=0.002
VEX=0.7071063969767235057872767 ! result of an other computation
VEY=0.707107165396162584691808

R1=A * VEX * VEX
R2=A * VEY * VEX
R3=A * VEX * VEY
R4=A * VEY * VEY

write(6,'(6F30.25)') VEX,VEY,R1,R2,R3,R4
```

This results in:

```
R1=0.0009999989607882072296247
R2=0.0010000000474968604152054
R3=0.0010000000474968606320458
R4=0.0010000011342066949474733
```

R2 and R3 should be identical (commutative law: VEX * VEY = VEY * VEX). They are identical when compiling without OpenMP, but they are not when compiling with OpenMP, even without any OMP directive in the code. The differences disturb the solution of the sparse matrix: the solutions with and without OpenMP differ considerably.

But this is not always the case. When using the following numbers, for example, everything is OK:

```fortran
A=0.002
VEX=0.7071159333124231727296660
VEY=0.7070976289422138405527107
```

results in:

```
R1=0.0010000259337872793603125
R2=0.0010000000471624013922284
R3=0.0010000000471624013922284
R4=0.0009999741612076232504663
```

And the effect is not there when setting A=1.0.

So there must be a difference in how the floating-point operations are handled. I know that KIND=8 variables are only precise to about 15 digits, but the digits far behind the decimal point have an effect on the solution.

Do you have an idea what to do?

Thanks and best regards,

Michael

Hello,

I checked the code and I didn't find a bug, an uninitialized variable, or anything else. No reductions are present. The problem really seems to be caused by the way the floating-point operations are performed.

I changed the numerical formulation of my differential equation so that fewer floating-point operations are done. This slows the convergence, but it was the only way to get the program to produce correct results when using OpenMP.

This seems to be a peril. Will I always have to be aware of instability when using OpenMP? I don't believe I'm the only one who has noticed such problems.

Regards

Michael

```fortran
A=0.002
VEX=0.7071063969767235057872767 ! result of an other computation
VEY=0.707107165396162584691808
```

If that is literally from your test code, note that you are assigning single-precision values to double-precision variables: literals without a D exponent or a KIND suffix are default (single) real, so most of those digits are lost before the assignment ever happens.

I don't think your problem is necessarily OpenMP-related; you are simply asking for greater precision from REAL(8) than it can offer. Moving to even higher precision or reformulating your problem are really the only robust ways forward. I can't reproduce your results for your specific examples (you'd need to give the exact program and all the compile options you used), but use of OpenMP probably results in the compiler choosing to order calculations in a slightly different manner that exposes the precision limitations: (A*VEX)*VEY versus A*(VEX*VEY), for example. Mathematically they are equivalent; computationally they are not. Without parentheses to guide the compiler (and the relevant compile option to force the compiler to honour them), it is free to reorder the evaluation of such expressions.
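The reordering effect is easy to probe directly. A small sketch (using values rounded from the earlier post, with d0 suffixes so that they are genuinely double precision) that compares the two groupings bit for bit:

```fortran
program grouping
  implicit none
  real(kind=8) :: a, vex, vey, left, right
  a   = 0.002d0
  vex = 0.7071063969767235d0
  vey = 0.7071071653961626d0
  left  = (a * vex) * vey   ! one possible evaluation order
  right = a * (vex * vey)   ! a reordered evaluation
  ! Each grouping rounds once per multiply, so the two can
  ! disagree in the last bit even though algebra says they are equal.
  print '(2F30.25)', left, right
  if (left /= right) then
     print *, 'groupings differ'
  else
     print *, 'groupings agree'
  end if
end program grouping
```

Whether a given triple of operands shows a difference depends on how the intermediate products round, which is why the effect appears for some of your values and not for others.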

No no, I don't assign single-precision values to double-precision ones.

I think you are right: the compiler orders the operations differently when compiling with or without OpenMP. In my computations the effect is significant, but I have no way to predict it.

I guess the compiler works very differently when using OpenMP (even without any directive). There is no option to force the compiler to build code in exactly the same manner as without OpenMP, right?

What do you recommend for developing code? First writing and testing the code without OpenMP (the compilers, Intel and Lahey, seem to be rock solid), and after that switching to OpenMP and looking for differences in the results?

Thank you

Michael

## OpenMP and floating point operations

Hello,

I have a question about the precision of floating-point operations when using OpenMP. I solve a large sparse matrix.

Doing this without OpenMP, the code works: running it several times, the results are identical each time. But when I compile it with OpenMP, the results are not identical. This is even the case when I remove all !$OMP directives, so I surely have no race conditions.

And the results differ between a computation without OpenMP and a computation with OpenMP using one thread only.

I use the following OpenMP options:

```
/Qopenmp
/Qopenmp-report:2
/threads
```

I guess the numerical precision is set differently when using OpenMP and rounding errors occur, or something else. The documentation lists several compiler options for setting the numerical behaviour, but they didn't help me. Setting /fp:precise has no effect.

Do you have an idea or some advice for me?

Thank you and best regards,

Michael