Developer Guide and Reference

Overview: Intrinsics to Convert Half Float Types

The half-float, or 16-bit float, is a popular type in some application domains. It is regarded as a storage type: data is often stored as half-floats, but computation is never done on values of this type. Values are usually converted to regular 32-bit floats before any computation.
Support for the half-float type is restricted to conversions to and from 32-bit floats. The main benefits of using the half-float type are:
  • reduced storage requirements
  • lower consumption of memory bandwidth and cache
  • accuracy and precision adequate for many applications
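
The storage-then-convert pattern described above can be illustrated with a minimal sketch in portable C. It assumes the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits); the helper names half_to_float and sum_halves are hypothetical and are not the compiler intrinsics. The data stays in 16-bit form in memory, and each value is widened to a 32-bit float only at the point of computation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a compiler intrinsic): decode one value in the
   assumed IEEE 754 binary16 layout into a 32-bit float. */
static float half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t frac = h & 0x3FFu;
    uint32_t bits;
    float f;

    if (exp == 0x1Fu) {                  /* infinity or NaN */
        bits = sign | 0x7F800000u | (frac << 13);
    } else if (exp != 0) {               /* normal: rebias exponent 15 -> 127 */
        bits = sign | ((exp + 112u) << 23) | (frac << 13);
    } else if (frac == 0) {              /* signed zero */
        bits = sign;
    } else {                             /* subnormal: normalize the fraction */
        uint32_t shift = 0;
        while ((frac & 0x400u) == 0) {
            frac <<= 1;
            shift++;
        }
        frac &= 0x3FFu;
        bits = sign | ((113u - shift) << 23) | (frac << 13);
    }
    memcpy(&f, &bits, sizeof f);         /* reinterpret the bit pattern */
    return f;
}

/* The array is stored as 16-bit values (half the memory of float);
   each element is widened to 32 bits just before it is added. */
static float sum_halves(const uint16_t *data, size_t n)
{
    float total = 0.0f;
    for (size_t i = 0; i < n; i++)
        total += half_to_float(data[i]);
    return total;
}
```

An array of n such values occupies 2n bytes rather than the 4n bytes needed for 32-bit floats, which is where the storage and bandwidth savings come from.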

Half Float Intrinsics

The half-float intrinsics convert half-float values to 32-bit floats for computation and, conversely, 32-bit float values to half-float values for storage.
The intrinsics are translated into library calls that do the actual conversions.
The half-float intrinsics are available on IA-32 and Intel® 64 architectures running supported operating systems. The minimum processor requirement is an Intel® Pentium 4 processor and an operating system supporting Intel® Streaming SIMD Extensions 2 (Intel® SSE2) instructions.
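
To show what the float-to-half direction involves, here is a simplified software sketch in portable C. It is not the library routine the intrinsics call. The helper name float_to_half is hypothetical, the IEEE 754 binary16 layout is assumed, the rounding mode is fixed at round-to-nearest-even, and results too small for a normal half-float are flushed to zero as a simplification.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a compiler intrinsic): narrow a 32-bit float to
   the assumed IEEE 754 binary16 layout. Simplifications: the rounding mode is
   always round-to-nearest-even, and values below the smallest normal
   half-float are flushed to a signed zero. */
static uint16_t float_to_half(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);      /* reinterpret the bit pattern */

    uint16_t sign    = (uint16_t)((bits >> 16) & 0x8000u);
    uint32_t raw_exp = (bits >> 23) & 0xFFu;
    uint32_t frac    = bits & 0x7FFFFFu;
    int32_t  exp     = (int32_t)raw_exp - 127 + 15;   /* rebias 127 -> 15 */

    if (raw_exp == 0xFFu)                /* infinity or NaN */
        return (uint16_t)(sign | 0x7C00u | (frac ? 0x200u : 0u));
    if (exp >= 31)                       /* too large: becomes infinity */
        return (uint16_t)(sign | 0x7C00u);
    if (exp <= 0)                        /* too small: flushed to zero */
        return sign;

    uint16_t half = (uint16_t)(sign | ((uint32_t)exp << 10) | (frac >> 13));
    uint32_t rest = frac & 0x1FFFu;      /* the 13 fraction bits being dropped */

    /* Round to nearest, ties to even. A carry out of the fraction moves into
       the exponent field, which gives the correctly rounded result. */
    if (rest > 0x1000u || (rest == 0x1000u && (half & 1u)))
        half++;
    return half;
}
```

For example, under these assumptions 1.0f encodes as 0x3C00, and any finite value too large for a half-float becomes infinity (0x7C00).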

Role of Immediate Byte in Half Float Intrinsic Operations

For all half-float intrinsics, an immediate byte controls rounding mode, flush to zero, and other non-volatile set values. The format of the imm8 byte is as shown in the diagram below. The imm8 value is used for special MXCSR overrides.
In the diagram:
  • MBZ = Most significant Bit is Zero; used for error checking
  • MS1 = 1 : use MXCSR RC, else use imm8.RC
  • SAE = 1 : all exceptions are suppressed
  • MS2 = 1 : use MXCSR FTZ/DAZ control, else use imm8.FTZ/DAZ.
The compiler passes the bits to the library function, with error checking: the most significant bit must be zero.
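
The MBZ check and the MS1 selection rule can be sketched in C as follows. The bit positions are assumptions: the rounding control (RC) is taken to occupy bits 1:0 and MS1 bit 2, following the immediate encoding of the hardware VCVTPS2PH instruction, and may not match the diagram for these library-based intrinsics. The macro and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed bit positions, labeled as such; check them against the diagram. */
#define IMM8_RC_MASK 0x03u   /* assumption: rounding control in bits 1:0 */
#define IMM8_MS1     0x04u   /* assumption: MS1 in bit 2 */
#define IMM8_MBZ     0x80u   /* most significant bit of the byte; must be zero */

/* Error check described in the text: returns 1 if the most significant
   bit is zero, 0 otherwise. */
static int imm8_is_valid(uint8_t imm8)
{
    return (imm8 & IMM8_MBZ) == 0;
}

/* MS1 = 1: use the rounding control from MXCSR; otherwise use imm8.RC.
   mxcsr_rc is the caller-supplied 2-bit MXCSR.RC value. */
static unsigned effective_rc(uint8_t imm8, unsigned mxcsr_rc)
{
    return (imm8 & IMM8_MS1) ? (mxcsr_rc & 3u) : (imm8 & IMM8_RC_MASK);
}
```

The MS2 bit would select between MXCSR and imm8 for FTZ/DAZ in the same way. It is left out here because its position is not stated in the text.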