How to Avoid Performance Penalties for Gradual-Underflow Behavior


Avoid the performance penalty associated with supporting floating-point gradual-underflow behavior in Fortran. IEEE 754 and its successor floating-point standards mandate "gradual underflow": a set of representable numbers smaller than the smallest number that is represented with full precision. For example, in standard single precision, numbers as small as 1.18e-38 are represented with nearly seven significant decimal digits. Numbers between 1.4e-45 and 1.18e-38 are represented with precision increasing from one significant bit at the low end to nearly seven digits at the high end. This convention preserves the accuracy of calculations involving differences between numbers of magnitude less than about 1e-31. Numbers in the range of partial underflow are termed sub-normal (formerly de-normal).
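The single-precision limits can be checked from C with the constants in <float.h>. The sketch below is illustrative only: the function names are hypothetical, and it assumes an IEEE 754 single-precision float, where FLT_MIN is 2^-126 (about 1.18e-38) and FLT_EPSILON is 2^-23.

```c
#include <float.h>

/* Smallest normalized single-precision value, 2^-126 (about 1.18e-38). */
float smallest_normal(void)
{
    return FLT_MIN;
}

/* Smallest positive sub-normal, 2^-149 (about 1.4e-45): FLT_MIN scaled
   down by the 23 fraction bits, which leaves one significant bit. */
float smallest_subnormal(void)
{
    return FLT_MIN * FLT_EPSILON;
}

/* Returns 1 if x is non-zero and below the normalized range, else 0. */
int is_subnormal(float x)
{
    float mag = (x < 0.0f) ? -x : x;
    return mag > 0.0f && mag < FLT_MIN;
}
```

Every value for which is_subnormal() returns 1 lies in the range where precision falls off as the magnitude shrinks.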

On most processors that support pipelined computation, support of gradual underflow comes at a large cost.


Implement 'flush-to-zero' mode (FTZ). The Pentium® 4 processor handles calculations involving sub-normals with special code stored in 'Read Only Memory', which is not pipelined the way calculations in the normalized range are. To avoid the performance penalty, such processors often support an abrupt-underflow mode, in which sub-normal results are "flushed" immediately to zero. Where an Intel processor supports FTZ, the mode can be set at run-time and changed to suit a program that requires full protection of accuracy in some code sequences and fast execution in others.
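Changing the mode at run-time for one code sequence can be sketched in C as follows. This is a minimal illustration, not a recommended library routine: sum_squares_ftz is a hypothetical helper, and the sketch assumes a compiler that provides <xmmintrin.h> and generates SSE code for float arithmetic (the default on 64-bit x86). The volatile qualifiers are there only to keep the compiler from evaluating or moving the arithmetic outside the region where the mode is changed.

```c
#include <xmmintrin.h>

/* Hypothetical helper: sums the squares of x[0..n-1] with flush-to-zero
   enabled, then restores whatever mode the caller was using. */
float sum_squares_ftz(const float *x, int n)
{
    unsigned int saved = _MM_GET_FLUSH_ZERO_MODE();
    volatile float sum = 0.0f;
    int i;

    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);   /* fast region starts */
    for (i = 0; i < n; ++i) {
        volatile float xi = x[i];
        sum += xi * xi;          /* a sub-normal product becomes 0 here */
    }
    _MM_SET_FLUSH_ZERO_MODE(saved);               /* caller's mode restored */
    return sum;
}
```

Saving and restoring the mode, rather than forcing it off afterward, lets accuracy-sensitive callers keep gradual underflow while the loop itself runs in abrupt-underflow mode.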

As FTZ is not present in high-level languages, special programming constructs are required to implement it. On Intel® architecture, FTZ can be implemented with inline assembler instructions, but there are at least four different varieties of asm() among the Microsoft Windows*- and Linux*-oriented compilers for Intel architecture. In order to provide portability among these compilers, the "IA Intrinsic" instructions are provided for the Intel, Microsoft, and Free Software Foundation* C compilers. Version requirements are as follows:

  • Intel® C++ Compiler version 6.0 or later
  • Microsoft Visual Studio* version 6, SP4 or later
  • gcc-3.1 or later


The following functions provide a Fortran-callable interface that works with all Intel architectures that support FTZ modes:

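#include "xmmintrin.h"

#if defined(__GNUC__) || defined(linux)
#define SETGRADUN setgradun_
#define SETABRPUN setabrpun_
#endif

/* The function bodies are incomplete in this listing. Those below are an
   assumption, inferred from the description that follows. */

/* Gradual underflow (IEEE behavior): flush-to-zero off */
void SETGRADUN(void)
{
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_OFF);
}

/* Abrupt underflow: flush-to-zero on */
void SETABRPUN(void)
{
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
}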

The two functions above set gradual-underflow and abrupt-underflow modes, respectively. The #defines accommodate the default Fortran-to-C calling conventions of the Intel® Visual Fortran Compiler 8.0, Standard Edition for Windows and of g77 or the Linux Fortran compilers. Because calling a gcc-compiled function from an Intel Windows compiler is rather complex and therefore unlikely, this scheme works as long as calling-convention options are not used to change the defaults.

The following Fortran program demonstrates some effects of FTZ:

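program testftz
! Lines marked "assumed" are missing from this listing and are inferred
! from the output labels and the description that follows.
character(len=24) emr                         ! assumed (length is a guess)
eps = epsilon(2.)                             ! assumed
! MINEXPONENT(2.) is -125, so the smallest normalized value is
! 2.**(MINEXPONENT(2.)-1), which equals TINY(2.)
realmin = 2.**(minexponent(2.)-1)             ! assumed
write(*,*)'First, test with mode set by compiler'
write(*,*)'EPSILON(2.) =',eps
write(*,*)'minreal = 2.**(MINEXPONENT(2.)-1) =',realmin
write(*,*)'minreal*epsilon =',eps*realmin
write(*,*)'Now set gradual underflow'
call setgradun()
epsminreal = eps*realmin                      ! assumed
write(emr,*) epsminreal                       ! assumed
write(*,*)'minreal*epsilon =',epsminreal
write(*,*)'character formatted value =',emr
read(emr,*) x                                 ! assumed
write(*,*)'after formatted read:',x
write(*,*)'Now set abrupt underflow aka flush-to-zero'
call setabrpun()
write(*,*)'minreal*epsilon =',eps*realmin
write(*,*)'previously stored value =',epsminreal
read(emr,*) x                                 ! assumed
write(*,*)'after formatted read:',x
contains   !would you believe this changes results above?
function conmul(a,b)
conmul = a*b                                  ! assumed
end function conmul
end program testftz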

This code employs Fortran 90 intrinsic functions to calculate realmin, the smallest normalized number, and epsminreal, the smallest non-zero sub-normal number in gradual-underflow mode. In abrupt-underflow mode, realmin is the smallest non-zero number that floating-point arithmetic can produce. The same calculation that produces epsminreal in gradual-underflow mode produces 0. in abrupt-underflow mode, if it is performed by Streaming SIMD Extensions (SSE), Streaming SIMD Extensions 2 (SSE2), or 64-bit Intel architecture floating-point instructions.
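The same comparison can be sketched directly in C. This is an illustration under stated assumptions, not part of the test program: eps_times_min and run_time_product are hypothetical names, FLT_EPSILON and FLT_MIN stand in for the Fortran EPSILON and smallest-normalized values, and the code assumes SSE code generation for float arithmetic. The volatile variables force the multiplication to happen at run-time, while the selected mode is in effect.

```c
#include <float.h>
#include <xmmintrin.h>

/* Multiplies at run-time; volatile prevents compile-time evaluation. */
static float run_time_product(float a, float b)
{
    volatile float va = a, vb = b;
    volatile float p = va * vb;
    return p;
}

/* Returns FLT_EPSILON * FLT_MIN computed under the given mode, which is
   _MM_FLUSH_ZERO_OFF or _MM_FLUSH_ZERO_ON. The caller's mode is restored. */
float eps_times_min(unsigned int ftz_mode)
{
    unsigned int saved = _MM_GET_FLUSH_ZERO_MODE();
    float p;

    _MM_SET_FLUSH_ZERO_MODE(ftz_mode);
    p = run_time_product(FLT_EPSILON, FLT_MIN);
    _MM_SET_FLUSH_ZERO_MODE(saved);
    return p;
}
```

With _MM_FLUSH_ZERO_OFF the product is 2^-149, the smallest sub-normal; with _MM_FLUSH_ZERO_ON the same multiplication returns zero.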

For the Intel IFL/ifc compilers, epsminreal is the smallest number that can be processed by formatted READ without raising an error condition. The smaller numbers, which raise an error in the run-time library of these compilers, are treated as 0. by the CVF and g77 compiler run-time libraries. The g77 compiler employs the C atof() function to make the READ conversion from ASCII to binary. Providing an atof() that is subject to the FTZ mode would make the READ behavior conform to that mode. As atof() does not raise an error for conversion of small inputs with a zero result, it is fairly certain that g77 programs will not raise an error condition for these numbers on any platform.

If this program is compiled in default floating-point mode by an IA-32 compiler, the FTZ modes have no effect, because the x87 floating-point instructions are not affected by the FTZ bit in the SSE control register (MXCSR). To speed up calculations by use of abrupt underflow, the compiler must be instructed to generate SSE code, by options such as /QxK, /QxW, -xK and -xW (for the Intel compilers), or -march=pentium3 -mfpmath=sse (for g77). As the Intel options are only a suggestion to the compiler, x87 code may still be generated, in which case the FTZ setting has no effect. With the compilers tested, omitting the dummy internal function conmul produces x87 rather than SSE code, as may the /Op and -mp options.
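For the C side of a mixed build, the compiler's own predefined macros give a rough compile-time indication of whether float arithmetic is generated as SSE code. The sketch below is an assumption-laden illustration: float_math_uses_sse is a hypothetical name, __SSE_MATH__ is predefined by GCC-compatible compilers when SSE is used for floating-point math, and _M_X64 and _M_IX86_FP are the Microsoft compiler's macros. It reports only what the compiler was asked to do for this translation unit, not what a Fortran compiler generated elsewhere.

```c
/* Returns 1 if this translation unit was compiled with SSE used for
   float arithmetic, according to the compiler's predefined macros. */
int float_math_uses_sse(void)
{
#if defined(__SSE_MATH__)      /* GCC-compatible: SSE used for FP math */
    return 1;
#elif defined(_M_X64)          /* Microsoft x64: SSE2 is always used */
    return 1;
#elif defined(_M_IX86_FP)      /* Microsoft IA-32: 0 = x87, 1 = SSE, 2 = SSE2 */
    return _M_IX86_FP >= 1;
#else
    return 0;                  /* unknown compiler or x87 code */
#endif
}
```

A result of 0 suggests that setting FTZ will not change the arithmetic in that file.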

The operation of this program shows that the Intel® Fortran Compiler 7.0 defaults to abrupt-underflow mode. The 64-bit Intel architecture EFL/efc compilers have a switch /Qftz- or -ftz- that changes the default, and this switch might be expected to appear in the IA-32 compilers. The g77 compiler might be expected to set FTZ mode with the option -ffast-math, but this has not been seen to happen with any of the IA-32 implementations. Examination of the code shows that the Intel Fortran compiler generates an implied _MM_SET_FLUSH_ZERO_MODE() at the top of the main program. Thus, the initial run-time mode setting will depend on which compiler is used to build the main program; if a C main() is used, it may be expected to initialize to gradual underflow (on IA-32 implementations), regardless of the Fortran compiler.
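Because the initial mode depends on which compiler built the main program, code that relies on a particular mode can query it instead of assuming it. A minimal sketch, with the hypothetical name ftz_enabled and assuming <xmmintrin.h> is available:

```c
#include <xmmintrin.h>

/* Returns 1 if flush-to-zero is currently enabled for SSE arithmetic,
   0 if gradual underflow is in effect. */
int ftz_enabled(void)
{
    return _MM_GET_FLUSH_ZERO_MODE() == _MM_FLUSH_ZERO_ON;
}
```

Calling ftz_enabled() at the start of a C main(), or from a Fortran main program through a wrapper like the ones shown earlier, reports the mode that the start-up code selected.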

The example program includes tests to see whether the FTZ mode affects formatted conversion between ASCII and binary. While such an effect is possible in principle, the tested IA-32 compilers and libraries do not show one. A READ of a number in sub-normal range produces the same sub-normal, regardless of FTZ mode. Similarly, a WRITE of a sub-normal number produces the same output, regardless of FTZ mode. These results are not surprising, as the run-time library is unlikely to use SSE/SSE2 instructions for these conversions. Fortran programs are not expected to spend a significant portion of their time formatting these small numbers, so favoring accuracy over speed in these conversions is usually a good choice.


Flush to Zero Mode in Fortran on Intel® Architecture

