Consistency of Floating-Point Results using the Intel® Compiler
Why doesn’t my application always give the same answer?
Dr. Martyn J. Corden
Software Solutions Group
Binary floating-point [FP] representations of most real numbers are inexact, and there is an inherent uncertainty in the result of most calculations involving floating-point numbers. Programmers of floating-point applications typically have the following objectives:
o Produce results that are “close” to the result of the exact calculation
- Usually measured in fractional error, or sometimes “units in the last place” (ulp).
o Produce consistent results:
- From one run to the next;
- From one set of build options to another;
- From one compiler to another;
- From one processor or operating system to another.
o Produce an application that runs as fast as possible
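The two accuracy measures mentioned above, fractional error and ulp, can be illustrated with a minimal sketch. Python is used here only for brevity; its floats are IEEE 754 doubles, the same format as a C or Fortran double. The helper names `relative_error` and `ulp_error` are hypothetical and do not come from the article or from the compiler, and `math.ulp` requires Python 3.9 or later.

```python
import math

def relative_error(computed, reference):
    # Fractional error: the absolute difference scaled by the reference value.
    return abs(computed - reference) / abs(reference)

def ulp_error(computed, reference):
    # Error in "units in the last place": the absolute difference divided by
    # the spacing between adjacent doubles at the reference value.
    return abs(computed - reference) / math.ulp(reference)

computed = 0.1 + 0.2   # rounded sum of two rounded inputs
reference = 0.3        # nearest double to 0.3, not the exact real number

print(relative_error(computed, reference))  # roughly 1.85e-16
print(ulp_error(computed, reference))       # 1.0
```

Here the computed sum is the next representable double above the reference, so the two values differ by exactly one ulp even though both look like "0.3" when printed to a few digits.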
These objectives usually conflict! However, good programming practices and judicious use of compiler options allow you to control the tradeoffs.
For example, it is sometimes useful to have a degree of reproducibility that goes beyond the inherent accuracy of a computation. Some software quality assurance tests may require close, or even bit-for-bit, agreement between results before and after software changes, even though the mathematical uncertainty in the result of the computation may be considerably larger. The right compiler options can deliver consistent, closely reproducible results while preserving good (though not optimal) performance.
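One well-known reason results can change between build options is that floating-point addition is not associative, so an optimization that reorders a sum (vectorization, for example) may change the rounded result. The sketch below only imitates that effect in Python; `sequential_sum` and `lane_sum` are hypothetical helpers, and the two-lane split is an assumption chosen for illustration, not a description of what the Intel compiler actually generates.

```python
def sequential_sum(values):
    # Add the values strictly in source order, as written in a simple loop.
    total = 0.0
    for v in values:
        total += v
    return total

def lane_sum(values, lanes=2):
    # Imitate a vectorized reduction: each "lane" accumulates every
    # lanes-th element, and the partial sums are combined at the end.
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    return sequential_sum(partial)

values = [1e16, 1.0, -1e16, 1.0]   # the exact sum is 2.0

print(sequential_sum(values))  # 1.0 (the first 1.0 is lost when added to 1e16)
print(lane_sum(values))        # 2.0 (1e16 and -1e16 cancel within one lane)
```

Both answers come from the same inputs and the same arithmetic; only the order of the additions differs. Options such as -fp-model precise are aimed at preventing this kind of reordering.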
Compiler options let you control the tradeoffs between accuracy, reproducibility and performance. Use
/fp:precise /fp:source (Windows*) or
-fp-model precise -fp-model source (Linux* or macOS*)
to improve the consistency and reproducibility of floating-point results while limiting the impact on performance.
If reproducibility between different processor types of the same architecture is important, also use
/Qimf-arch-consistency:true (Windows) or
-fimf-arch-consistency=true (Linux or macOS)
For best reproducibility between processors that support FMA (fused multiply-add) instructions and processors that do not, also use /Qfma- (Windows) or -no-fma (Linux or macOS). With the version 17 compiler or later, best reproducibility may be obtained with the single switch /fp:consistent (Windows) or -fp-model consistent (Linux or macOS), which sets all of the above options.
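FMA matters for reproducibility because a fused multiply-add rounds once, after the exact product and sum, whereas a separate multiply and add round twice. The sketch below emulates that difference in Python using exact rational arithmetic. The helpers `unfused_multiply_add` and `fused_multiply_add` are hypothetical names, and the emulation is an assumption made for illustration; it is not how the hardware instruction is implemented.

```python
from fractions import Fraction

def unfused_multiply_add(a, b, c):
    # Two roundings: the product is rounded to a double, then the sum is.
    product = a * b
    return product + c

def fused_multiply_add(a, b, c):
    # Emulated FMA: compute a*b + c exactly as a rational number, then
    # round once when converting back to a double.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

# The exact product is 1 - 2**-54, which rounds to 1.0 as a double.
print(unfused_multiply_add(a, b, c))  # 0.0
print(fused_multiply_add(a, b, c))    # -5.551115123125783e-17, i.e. -2**-54
```

Neither result is wrong in the sense of violating IEEE arithmetic, but they differ, so the same source can give different answers depending on whether FMA instructions are generated.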
Starting with the version 18 compiler, for applications with vectorizable loops containing math functions, it may be possible to improve performance while maintaining best reproducibility by adding /Qimf-use-svml (Windows) or -fimf-use-svml (Linux or macOS).
For the complete article, updated for version 19 update 1 of the Intel® Compiler, please open the attached PDF file.
See here for a comparison to Intel® MIC Architecture.