GCC and Intel OpenMP Runtime Library

Why is the result still non-deterministic when I compile an OpenMP program (a simple reduction: a sum of floating-point values with static scheduling) with GCC, link it against the Intel OpenMP runtime, and set KMP_FORCE_REDUCTION=tree? Is it because GCC does not support the three reduction methods (atomic, critical, and tree, the only deterministic one) and supports only a non-deterministic method?

If so, what is the difference between using the Intel OpenMP runtime and GOMP in this case?
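
For reference, a minimal sketch of the kind of program described above. The function name omp_sum is a hypothetical stand-in, not the actual code in question; it assumes a build along the lines of gcc -fopenmp.

```c
#include <stddef.h>

/* Hypothetical stand-in for the program in the question: a plain
   floating-point sum with static scheduling and a reduction clause. */
double omp_sum(const double *x, size_t n)
{
    double sum = 0.0;
    /* The order in which the per-thread partial sums are combined is
       chosen by the implementation, not by this source code. */
    #pragma omp parallel for schedule(static) reduction(+:sum)
    for (long i = 0; i < (long)n; ++i)
        sum += x[i];
    return sum;
}
```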






Reductions in gcc/gfortran OpenMP are handled entirely by the code generator. There are no libgomp entry points associated with them, libiomp5 reduction entry points are not called by gcc-compiled code, and KMP_FORCE_REDUCTION has no effect.

If your program is non-deterministic, then it is a gcc bug.

The OpenMP 3.1 specification says that an implementation does not have to perform reductions deterministically, even with a fixed number of threads.

From the July 2011 Version 3.1 specification, section describing the "reduction clause", I found the words:

The location in the OpenMP program at which the values are combined and the order in which the values are combined are unspecified. Therefore, when comparing sequential and parallel runs, or when comparing one parallel run to another (even if the number of threads used is the same), there is no guarantee that bit-identical results will be obtained or that side effects (such as floating point exceptions) will be identical or take place at the same location in the OpenMP program.
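
The reason the combination order matters is that floating-point addition is not associative. A small sketch, assuming IEEE 754 double precision and no value-changing flags such as -ffast-math; the function names are made up for illustration:

```c
/* (a + b) + c, with the intermediate rounded to double. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;   /* volatile forces rounding to double */
    return t + c;
}

/* a + (b + c), with the intermediate rounded to double. */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}
```

With a = 1e16, b = -1e16 and c = 1.0, sum_left returns 1.0 while sum_right returns 0.0, because 1.0 is lost when it is added to -1e16 first. A reduction that combines partial sums in a different order from run to run can hit the same effect.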


John D. McCalpin, PhD "Dr. Bandwidth"

Right, Brian is being overly condemnatory in saying that this is a gcc bug; what gcc does is clearly permitted.  His main point remains valid, though. Changing the settings that affect reductions in the Intel OpenMP runtime won't have any effect on code compiled by gcc, because it never enters the runtime to perform the reduction.

In this case it is less of a "bug" in gcc than the "absence of a desirable feature that is not required by the standard but which is available with the Intel compilers." :-)
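
If bit-identical results are needed with gcc, one option is to avoid the reduction clause and fix the combination order by hand. The following is only a sketch of that idea, not what either compiler does internally; deterministic_sum and NCHUNKS are hypothetical names, and the chunk count is an arbitrary choice.

```c
#include <stddef.h>

#define NCHUNKS 64  /* hypothetical fixed chunk count, independent of thread count */

double deterministic_sum(const double *x, size_t n)
{
    double partial[NCHUNKS];
    size_t chunk = (n + NCHUNKS - 1) / NCHUNKS;   /* ceil(n / NCHUNKS) */

    /* Each chunk is summed sequentially by exactly one thread, so its
       partial sum does not depend on which thread happened to run it. */
    #pragma omp parallel for schedule(static)
    for (int c = 0; c < NCHUNKS; ++c) {
        size_t lo = (size_t)c * chunk;
        size_t hi = lo + chunk;
        if (lo > n) lo = n;
        if (hi > n) hi = n;
        double s = 0.0;
        for (size_t i = lo; i < hi; ++i)
            s += x[i];
        partial[c] = s;
    }

    /* Combine the partial sums serially, always in chunk order. */
    double sum = 0.0;
    for (int c = 0; c < NCHUNKS; ++c)
        sum += partial[c];
    return sum;
}
```

Because the chunk boundaries and the final combination order are fixed, the result should be the same from run to run and for any number of threads, although it will generally differ in the last bits from a plain serial sum.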

John D. McCalpin, PhD "Dr. Bandwidth"
