• 2019 Update 4
  • 03/20/2019

Using Floating Point for Calculations

Intel® Graphics devices are much faster at floating-point add, sub, mul, and similar operations than at the same operations on the int type.
For example, consider the following code, which performs its calculations in type uint4:
__kernel void amp (__constant uchar4* src, __global uchar4* dst)
…
    uint4 tempSrc = convert_uint4(src[offset]); // Load one RGBA8 pixel
    // some processing
    uint4 value = (tempSrc.z + tempSrc.y + tempSrc.x);
    uint4 tempDst = value + (tempSrc - value) * nSaturation;
    // store
    dst[offset] = convert_uchar4(tempDst);
}
Below is its float4 equivalent:
__kernel void amp (__constant uchar4* src, __global uchar4* dst)
…
    float4 tempSrc = convert_float4(src[offset]); // Load one RGBA8 pixel as float
    // some processing
    float4 value = (tempSrc.z + tempSrc.y + tempSrc.x);
    // mad(a, b, c) computes a * b + c
    float4 tempDst = mad(tempSrc - value, fSaturation, value);
    // store
    dst[offset] = convert_uchar4(tempDst);
}
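When validating the float4 kernel, it can help to have a scalar host-side reference for the same per-channel math. The following is a minimal sketch in plain C, not part of the original sample: the function names amp_channel_ref and amp_pixel_ref are hypothetical, the saturation factor is assumed to be a single float, and the results are left as float because the kernel's final conversion back to uchar4 is not reproduced here.

```c
/* One channel of the float kernel, written in the a * b + c shape
   that the OpenCL mad() built-in takes:
   (channel - value) * saturation + value. */
static float amp_channel_ref(float channel, float value, float saturation)
{
    return (channel - value) * saturation + value;
}

/* Hypothetical host-side reference for one RGBA8 pixel.
   As in the kernel, value is the sum of the z, y and x channels,
   and the same expression is applied to all four channels.
   Results stay in float; no conversion back to 8 bits is attempted. */
void amp_pixel_ref(const unsigned char src[4], float saturation, float dst[4])
{
    float value = (float)src[2] + (float)src[1] + (float)src[0];
    for (int i = 0; i < 4; ++i) {
        dst[i] = amp_channel_ref((float)src[i], value, saturation);
    }
}
```

With a saturation factor of 1.0f the function returns the input channels unchanged, and with 0.0f every channel collapses to value, which gives two easy sanity checks to compare against device output.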
Intel® Advanced Vector Extensions (Intel® AVX) support, where available, accelerates floating-point calculations on modern CPUs, so floating-point data types are preferable for the CPU OpenCL device as well.
Note
The compiler can automatically fuse multiplies and additions. Use the compiler flag -cl-mad-enable to enable this optimization when compiling for both Intel® Graphics and CPU devices. However, explicit use of the mad built-in ensures that the operation is mapped directly to the efficient instruction.
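On the host side, build flags such as -cl-mad-enable are passed as a single space-separated string in the options argument of clBuildProgram. The sketch below only shows how such a string might be assembled in plain C; the helper name with_mad_enabled is hypothetical, and the actual clBuildProgram call is left out so that the fragment does not depend on the OpenCL headers.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes base_options followed by "-cl-mad-enable"
   into out. Returns 0 on success, -1 if the result would not fit
   (in which case out may hold a truncated string). */
int with_mad_enabled(const char *base_options, char *out, size_t out_size)
{
    const char *flag = "-cl-mad-enable";
    int n;

    if (base_options == NULL || base_options[0] == '\0') {
        /* No other options: the flag is the whole string. */
        n = snprintf(out, out_size, "%s", flag);
    } else {
        /* Options are separated by a single space. */
        n = snprintf(out, out_size, "%s %s", base_options, flag);
    }
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

The resulting string would then be supplied as the options parameter when building the program for the Intel® Graphics or CPU device.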
