• 10/30/2018

Use Floating Point for Calculations

Intel® Xeon® processors significantly accelerate floating-point calculations on the device.
Consider the following code snippet, which performs its calculations in integer (uint) arithmetic:
__kernel void scale(__constant uchar* srcA, __constant uchar* srcB, uchar nSaturation, __global uchar* dst)
{
    int offset = get_global_id(0);
    uint tempSrcA = convert_uint(srcA[offset]); // Load one 8-bit value
    uint tempSrcB = convert_uint(srcB[offset]); // Load one 8-bit value
    // Some processing
    uint tempDst = (tempSrcA - tempSrcB) * nSaturation;
    // Store
    dst[offset] = convert_uchar(tempDst);
}
The following example uses the float equivalent:
__kernel void scale(__constant uchar* srcA, __constant uchar* srcB, uchar nSaturation, __global uchar* dst)
{
    int offset = get_global_id(0);
    float tempSrcA = convert_float(srcA[offset]); // Load one 8-bit value
    float tempSrcB = convert_float(srcB[offset]); // Load one 8-bit value
    // Some processing
    float tempDst = (tempSrcA - tempSrcB) * nSaturation;
    // Store
    dst[offset] = convert_uchar(tempDst);
}
Using built-in functions improves performance. See the Use Built-In Functions section for more information.
NOTE: The compiler can automatically fuse multiplies and adds. Use the -cl-mad-enable compiler flag to enable this optimization when compiling. Still, calling the mad built-in explicitly ensures that the operation is mapped directly to the efficient instruction.
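As a rough illustration of the explicit form, the sketch below applies the mad built-in in a small kernel. The kernel name scale_offset and the parameters gain and bias are hypothetical and do not come from this guide. The host-only section at the top is likewise an assumption added purely so the same source can be compiled and checked as plain C. It stands in for get_global_id and mad, and an OpenCL C compiler skips it.

```c
/* Hypothetical kernel, not from the guide: dst = src * gain + bias. */
#ifndef __OPENCL_VERSION__
/* Host-only shims (assumptions for testing). An OpenCL C compiler defines
   __OPENCL_VERSION__ and therefore never sees this section. */
#include <assert.h>   /* for host-side sanity checks */
#include <stddef.h>
#define __kernel
#define __global
static size_t host_work_item = 0;  /* emulated work-item index */
static size_t get_global_id(unsigned int dim) { (void)dim; return host_work_item; }
static float mad(float a, float b, float c) { return a * b + c; }
#endif

__kernel void scale_offset(__global const float* src, float gain, float bias,
                           __global float* dst)
{
    size_t offset = get_global_id(0);
    /* One explicit multiply-add instead of a separate multiply and add. */
    dst[offset] = mad(src[offset], gain, bias);
}
```

In OpenCL C, mad is allowed to trade accuracy for speed, so results may differ slightly from a separately rounded multiply and add. Where correctly rounded results matter, the fma built-in is the usual alternative. On the host, -cl-mad-enable is passed in the options string of clBuildProgram.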
