Consistency of Floating-Point Results using the Intel® Compiler
Why doesn’t my application always give the same answer?
Dr. Martyn J. Corden
Software Solutions Group
Binary floating-point [FP] representations of most real numbers are inexact, and there is an inherent uncertainty in the result of most calculations involving floating-point numbers. Programmers of floating-point applications typically have the following objectives:
o Produce results that are “close” to the result of the exact calculation
- Usually measured in fractional error, or sometimes “units in the last place” (ulp).
o Produce consistent results:
- From one run to the next;
- From one set of build options to another;
- From one compiler to another;
- From one processor or operating system to another.
o Produce an application that runs as fast as possible
These objectives usually conflict! However, good programming practices and judicious use of compiler options allow you to control the tradeoffs.
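As a small illustration of how "close" is measured, the sketch below computes the error of a single double-precision addition, both as a fractional error and in ulp. It is written in Python only because Python floats are IEEE 754 double precision on common platforms. It does not involve the Intel compiler, and all variable names are illustrative.

```python
from fractions import Fraction
import math

# 0.1 and 0.2 have no exact binary representation, so the sum is inexact.
computed = 0.1 + 0.2

# Reference value of the exact calculation, held as a rational number.
exact = Fraction(1, 10) + Fraction(2, 10)

# Fraction(computed) converts the double to its exact rational value,
# so the error below is computed without further rounding.
error = abs(Fraction(computed) - exact)

# Fractional (relative) error with respect to the exact result.
fractional_error = float(error / exact)

# Error in units in the last place of the computed result
# (math.ulp requires Python 3.9 or later).
ulp_error = float(error / Fraction(math.ulp(computed)))

print("computed         =", repr(computed))
print("fractional error =", fractional_error)
print("error in ulp     =", ulp_error)
```

The error here is below one ulp, which is the kind of inherent uncertainty the objectives above refer to.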
For example, it is sometimes useful to have a degree of reproducibility that goes beyond the inherent accuracy of a computation. Some software quality assurance tests may require close, or even bit-for-bit, agreement between results before and after software changes, even though the mathematical uncertainty in the result of the computation may be considerably larger. The right compiler options can deliver consistent, closely reproducible results while preserving good (though not optimal) performance.
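The difference between "close" and "bit-for-bit" agreement can be sketched as two comparison helpers of the kind a quality assurance test might use. This is a minimal Python sketch under stated assumptions: the helper names and the tolerance are hypothetical, and the two values merely stand in for results produced before and after a software change.

```python
import math
import struct

def bits(x: float) -> int:
    """Return the 64-bit IEEE 754 pattern of a double as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def bitwise_equal(a: float, b: float) -> bool:
    """Bit-for-bit agreement; note that 0.0 and -0.0 compare as different."""
    return bits(a) == bits(b)

def close_enough(a: float, b: float, rel_tol: float = 1e-12) -> bool:
    """Agreement within a relative tolerance (hypothetical default)."""
    return math.isclose(a, b, rel_tol=rel_tol)

# Stand-ins for "before" and "after" results: the same three terms
# added in a different order.
baseline = (0.1 + 0.2) + 0.3
candidate = 0.1 + (0.2 + 0.3)

print("bit-for-bit agreement:", bitwise_equal(baseline, candidate))
print("close agreement      :", close_enough(baseline, candidate))
```

A test based on `bitwise_equal` needs a build that reproduces results exactly, whereas one based on `close_enough` tolerates differences in the last bits.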
Compiler options let you control the tradeoffs between accuracy, reproducibility and performance. Use
/fp:precise /fp:source (Windows*) or
-fp-model precise -fp-model source (Linux* or macOS*)
to improve the consistency and reproducibility of floating-point results while limiting the impact on performance.
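One reason results can change with build options is that an optimizer may reassociate a sum, for example when it vectorizes a reduction loop. The Python sketch below imitates that effect by hand. It is an illustration of the arithmetic only, not of the compiler's actual code generation, and the function names and data are hypothetical.

```python
def sum_sequential(values):
    """Add the terms strictly in source order."""
    total = 0.0
    for x in values:
        total += x
    return total

def sum_two_lanes(values):
    """Imitate a 2-wide vectorized reduction: even-indexed terms go to
    one partial sum, odd-indexed terms to another, combined at the end."""
    lane0 = 0.0
    lane1 = 0.0
    for i, x in enumerate(values):
        if i % 2 == 0:
            lane0 += x
        else:
            lane1 += x
    return lane0 + lane1

# Data chosen so that the order of the additions matters:
# adding 1.0 to 1e16 is lost to rounding in double precision.
data = [1e16, 1.0, -1e16, 1.0]

print("source order:", sum_sequential(data))
print("two lanes   :", sum_two_lanes(data))
```

Both orderings are legitimate evaluations of the same mathematical sum, yet they round differently. Value-safe floating-point models such as precise are meant to keep the compiler from making this kind of reordering on its own.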
If reproducibility between different processor types of the same architecture is important, also use
/Qimf-arch-consistency:true (Windows) or
-fimf-arch-consistency=true (Linux or macOS)
For best reproducibility between processors that support FMA (fused multiply-add) instructions and processors that do not, also use /Qfma- (Windows) or -no-fma (Linux or macOS). In the version 17 compiler or later, best reproducibility may be obtained with the single switch /fp:consistent (Windows) or -fp-model consistent (Linux or macOS), which sets all of the above options.
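To see why FMA affects reproducibility, recall that a fused multiply-add rounds a*b+c once, while separate multiply and add instructions round twice. The Python sketch below emulates the single rounding with exact rational arithmetic rather than using a hardware FMA instruction. The helper names and the operands are assumptions chosen for illustration.

```python
from fractions import Fraction

def multiply_then_add(a: float, b: float, c: float) -> float:
    """Two roundings: the product is rounded, then the sum is rounded."""
    product = a * b
    return product + c

def emulated_fma(a: float, b: float, c: float) -> float:
    """One rounding: a*b+c is formed exactly as a rational number and
    rounded to double only once, as a fused multiply-add would do."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

# Operands chosen so that the exact product, 1 - 2**-54, is not
# representable and rounds to 1.0 when the multiply is done separately.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

print("multiply, then add:", multiply_then_add(a, b, c))
print("emulated FMA      :", emulated_fma(a, b, c))
```

The same source expression therefore gives different results depending on whether FMA instructions are generated, which is the difference that disabling FMA removes.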
Starting with the version 18 compiler, for applications with vectorizable loops containing math functions, it may be possible to improve performance whilst maintaining best reproducibility by adding /Qimf-use-svml (Windows) or -fimf-use-svml (Linux or macOS).
For the complete article, updated for version 19 update 1 of the Intel® Compiler, please open the attached PDF file.
See here for a comparison to Intel® MIC Architecture.
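Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.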