OpenMP and floating point operations

Heinz Michael L. wrote:

Hello,

I have a question about the precision of floating-point operations when using OpenMP. I solve a large sparse matrix.

Doing this without OpenMP, the code works. Running it several times, the results are identical each time. But when I compile it with OpenMP, the results are not identical. This is even the case when I remove all !$OMP directives, so I surely have no race conditions.

And the results differ between a computation without OpenMP and a computation with OpenMP using only one thread.

I use the following OpenMP options:
/Qopenmp
/Qopenmp-report:2
/threads

I guess the numerical precision is set differently when using OpenMP and rounding errors occur, or something else. The documentation lists several compiler options for controlling the numerical behaviour, but they didn't help me. Setting /fp:precise has no effect.

Do you have an idea or some advice for me?

Thank you and best regards,
Michael

IanH wrote:

Setting /Qopenmp changes the way that the compiler allocates storage for unsaved local variables.  If you haven't defined those variables before you first reference them, then you will see changes in program behaviour.  The change in storage allocation (static to stack based storage) could quite conceivably introduce "random" behaviour into applications.  Consequently when faced with your symptoms I'd be looking really hard (including /check:all runs, etc) for such a programming error.  If I didn't find any, I'd look again.

Heinz Michael L. wrote:

Thank you for your answer. All storage is in one large one-dimensional COMMON block. I solve the sparse matrix with a direct method (Cholesky). The problem (CFD) is nonlinear, so after solving the matrix the coefficients have to be updated and the matrix has to be solved again and again. When this is done with not-too-large matrices, about 100 times, everything is ok. But with large matrices and several hundred iterations, the result drifts away.

I really think it is a problem of numerical accuracy. But I have no idea what to do.


arjenmarkus wrote:

You have to realise that OpenMP makes your program "non-deterministic", that is: the iterations are assigned to the various threads in a manner that you cannot predict. Two runs of the program will probably result in different assignments. A second aspect is that floating-point operations are not associative, which in combination with the non-determinism of OpenMP can cause seemingly random differences with a non-OpenMP version of your program.

However, in ordinary situations these differences should not be too large. So if they are, you may have a numerically unstable problem at hand, or a programming error, or the problem requires higher (double?) precision.

Regards,

Arjen

Heinz Michael L. wrote:

How can non-deterministic behaviour occur when I use only one thread?

I use REAL*8 variables. The mathematical problem is stable and the code has run perfectly on scalar machines for 20 years.

I wonder why the difficulties are present even when running only one thread. What could the cause be?

arjenmarkus wrote:

Can you be sure - even with one thread - that the iterations are run in the _same_ order?

Can you be sure that all variables within the OpenMP loops have the correct private or shared attribute?

Without looking at the code, these are the sorts of things I can come up with.

Oh, have you verified that the loops are actually running in one thread? I once had to solve a problem with one of our programs which turned out to be due to an OpenMP-enabled loop running in the maximum number of threads instead of a single one, despite having used OMP_SET_NUM_THREADS(1). See my message on comp.lang.fortran from yesterday. A bizarre problem, and I simply do not understand how it could occur in the first place.

Regards,

Arjen

Tim Prince wrote:

Among the usual checks for such situations of numerical inconsistency is to make smaller changes in compile options.  Declaring your procedures RECURSIVE or setting an option such as /Qauto (stack storage of local arrays) would help check whether you have an error where the result depends accidentally on uninitialized data, but that should not be your only check.

If you have a variable in a COMMON and also pass it in via subroutine arguments, that's an error which could surface in such situations.  COMMON usage can pose difficulties with OpenMP; in the case where private variables are required, it may require threadprivate copies of COMMON, often requiring default(none) to cause the compiler to flag unintended defaults.

Heinz Michael L. wrote:

Thank you all for your advice. I will investigate...

Thanks and regards,

Michael


Heinz Michael L. wrote:

Hello,

I found out what the problem could be.

Here is an example piece of code:

Real (KIND=8) :: A, VEX,VEY

A=0.002
VEX=0.7071063969767235057872767     ! result of an other computation
VEY=0.707107165396162584691808

R1=A * VEX * VEX
R2=A * VEY * VEX
R3=A * VEX * VEY
R4=A * VEY * VEY

write(6,'(6F30.25)') VEX,VEY,R1,R2,R3,R4

This results in:
R1=0.0009999989607882072296247
R2=0.0010000000474968604152054
R3=0.0010000000474968606320458
R4=0.0010000011342066949474733

R2 and R3 should be identical (commutative law: VEX * VEY = VEY * VEX).
They are identical when compiling without OpenMP,
but they are not when compiling with OpenMP, even without any OMP directive.

The differences disturb the solution of the sparse matrix.
The solutions with and without OpenMP differ considerably.

But this is not always the case. With the following numbers, for example,
everything is ok:

A=0.002
VEX=0.7071159333124231727296660
VEY=0.7070976289422138405527107

results in:

R1=0.0010000259337872793603125
R2=0.0010000000471624013922284
R3=0.0010000000471624013922284
R4=0.0009999741612076232504663

And the effect is not there when setting A=1.0.

So there must be a difference in how the floating-point operations are handled.

I know that KIND=8 variables are only precise to about 15 digits.
But the digits far behind the decimal point have an effect on the solution.

Do you have an idea what to do?

Thanks and best regards,

Michael

Heinz Michael L. wrote:

Hello,

I checked the code and I didn't find a bug, an uninitialized variable, or anything else. No reductions are present. The problem really seems to be caused by the way the floating-point operations are performed.

I changed the numerical formulation of my differential equation so that fewer floating-point operations are done. This slows the convergence, but it was the only way to get the program to produce correct results when using OpenMP.

This seems to be a peril. Will I always have to be aware of instability when using OpenMP? I don't believe that I'm the only one who has noticed such problems.

Regards

Michael

IanH wrote:

A=0.002
VEX=0.7071063969767235057872767      ! result of an other computation
VEY=0.707107165396162584691808

If that is literally from your test code - note that you are assigning single precision values to double precision variables.

I don't think your problem is necessarily OpenMP related - you are simply asking for greater precision from REAL(8) than it can offer. Moving to even higher precision or reformulating your problem are really the only robust ways forward. I can't reproduce your results for your specific examples (you'd need to give the exact program and specify all the compile options that you used), but the use of OpenMP probably results in the compiler choosing to order calculations in a slightly different manner that exposes the precision limitations - (A*VEX)*VEY versus A*(VEX*VEY), for example. Mathematically they are equivalent; computationally they are not. Without parentheses to guide the compiler (and the relevant compile option to force the compiler to follow the standard) it's free to reorder the evaluation of such expressions.

Heinz Michael L. wrote:

No no, I don't assign single precision values to double precision variables.

I think you are right; the compiler uses a different order of operations or something else when compiling with or without OpenMP. In my computations the effect is significant, but I have no way to predict it.

I guess the compiler works very differently when using OpenMP (even without any directive). There is no option to force the compiler to build the code in exactly the same manner as without OpenMP, right?

What do you recommend for developing code? First writing and testing the code without OpenMP? The compilers seem to be rock-solid, Intel and Lahey. After that, switching to OpenMP and looking for differences in the results?

Thank you

Michael
