• 10/30/2018

Use Lower Math Precision

OpenCL™ offers two basic ways to trade precision for speed:
  • native_* and half_* math built-ins, which have lower precision but are faster than their un-prefixed variants
  • Compiler optimization options that enable floating-point arithmetic optimizations for the whole OpenCL program, for example, the -cl-fast-relaxed-math flag
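As a rough illustration of the second approach, the host-side C sketch below prepares the build options string for one experiment. The helper name make_build_options is hypothetical and not part of the OpenCL API; only the -cl-fast-relaxed-math flag and the clBuildProgram call mentioned in the comment come from OpenCL itself.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of the OpenCL API: fills buf with the
 * build options for one experiment. Returns 0 on success, or -1 if buf
 * is too small to hold the options string. */
int make_build_options(char *buf, size_t size, int relaxed_math)
{
    assert(buf != NULL);
    /* An empty options string keeps the default math behavior. */
    const char *opts = relaxed_math ? "-cl-fast-relaxed-math" : "";
    int n = snprintf(buf, size, "%s", opts);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Intended use, assuming `program` is an already created cl_program:
 *     clBuildProgram(program, 0, NULL, buf, NULL, NULL);
 */
```

Building the same program once with and once without the flag makes it possible to compare both speed and results before committing to relaxed math for the whole program.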
In general, while the -cl-fast-relaxed-math flag is a quick way to get performance gains for kernels with many math operations, it does not permit fine-grained control over numeric accuracy. Consider experimenting with the native_* equivalents separately for each specific case, keeping track of the resulting accuracy.
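One way to keep track of the resulting accuracy is to compare the output of the faster variant against a reference run on the host. The C sketch below assumes both result buffers have already been read back into host arrays; the function name max_relative_error and the choice of relative error as the metric are illustrative assumptions, not something the guide prescribes.

```c
#include <assert.h>
#include <stddef.h>

/* Largest relative error of `fast` against `ref` over n elements.
 * Where the reference value is zero, the absolute error is used instead. */
float max_relative_error(const float *ref, const float *fast, size_t n)
{
    assert(ref != NULL && fast != NULL);
    float worst = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float err = fast[i] - ref[i];
        if (err < 0.0f)
            err = -err;                       /* absolute difference */
        float mag = ref[i] < 0.0f ? -ref[i] : ref[i];
        if (mag != 0.0f)
            err /= mag;                       /* scale to relative error */
        if (err > worst)
            worst = err;
    }
    return worst;
}
```

Recording this value for each kernel that switches to a native_* built-in shows which replacements stay within the tolerance the application can accept.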
The native_* versions of math built-ins are supported in hardware and run substantially faster, while offering lower accuracy. Use the native versions of trigonometric and transcendental functions, such as native_sin, native_cos, native_exp, and native_log, when performance is more important than precision.
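The difference is easiest to see with two otherwise identical kernels. The sketch below embeds OpenCL C source in a host-side C string, as it might be passed to clCreateProgramWithSource; the kernel names wave_precise and wave_native and the formula they compute are made up for illustration.

```c
#include <string.h>

/* OpenCL C source for two hypothetical kernels that both compute
 * y = exp(-x) * sin(x), differing only in the built-ins they call. */
const char *kernel_source =
    /* Baseline: un-prefixed built-ins. */
    "__kernel void wave_precise(__global const float *x,\n"
    "                           __global float *y)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    y[i] = exp(-x[i]) * sin(x[i]);\n"
    "}\n"
    "\n"
    /* Faster variant: native_* built-ins with lower accuracy. */
    "__kernel void wave_native(__global const float *x,\n"
    "                          __global float *y)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    y[i] = native_exp(-x[i]) * native_sin(x[i]);\n"
    "}\n";

/* Source length in bytes, usable for the `lengths` argument of
 * clCreateProgramWithSource(). */
size_t kernel_source_length(void)
{
    return strlen(kernel_source);
}
```

Running both kernels on the same input and comparing their outputs gives a per-kernel view of what the native_* versions cost in accuracy.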
For a full list of OpenCL build options and option descriptions, refer to the OpenCL specification. For instructions on how to use these options with the Intel® SDK for OpenCL™ Applications, refer to the following pages in the Developer Guide for Intel® SDK for OpenCL™ Applications:
  • Build with OpenCL Offline Compiler Command Line Interface (for the standalone version of the Intel® SDK for OpenCL™ Applications)
  • Configuring OpenCL™ Build Options (for the Intel® Code Builder for OpenCL™ API plugin for Microsoft Visual Studio*)
  • Configuring Build Options (for the Intel® Code Builder for OpenCL™ API plugin for Eclipse*)
