OpenCL™ provides a library of built-in functions, including vector variants. Using a built-in function is typically more efficient than implementing the same operation manually in OpenCL code. For example, consider the following code:

int tid = get_global_id(0);
c[tid] = 1/sqrt(a[tid] + b[tid]);

The following code uses the built-in rsqrt function to implement the same example more efficiently:

int tid = get_global_id(0);
c[tid] = rsqrt(a[tid] + b[tid]);

Other examples of simple expressions and their built-in equivalents:

dx * fCos + dy * fSin == dot((float2)(dx, dy), (float2)(fCos, fSin))
x * a - b == mad(x, a, -b)
sqrt(dot(x - y, x - y)) == distance(x, y)

The only exception is mul24, which involves redundant overflow-handling logic:

int iSize = x * y; // prefer general multiplication over mul24(x, y)
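To picture the kind of extra work a 24-bit multiply can require, the following host-side C sketch emulates mul24 on 32-bit integers. It is an assumption-laden illustration, not the actual implementation: fits_in_24_bits, low24_signed, mul24_sketch, and mul24_demo are hypothetical names, and the handling of out-of-range operands shown here is only one possible behavior, since the OpenCL C specification leaves that result implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Signed operand range in which mul24 is specified to give the exact product. */
int fits_in_24_bits(int32_t v) {
    return v >= -(1 << 23) && v <= (1 << 23) - 1;
}

/* Keep only the low 24 bits of v and sign-extend them (assumed behavior). */
int32_t low24_signed(int32_t v) {
    int32_t lo = v & 0xFFFFFF;
    if (lo & 0x800000) {
        lo -= 0x1000000;
    }
    return lo;
}

/* Hypothetical emulation: multiply the 24-bit operands, keep the low 32 bits. */
int32_t mul24_sketch(int32_t x, int32_t y) {
    int64_t wide = (int64_t)low24_signed(x) * (int64_t)low24_signed(y);
    /* Narrowing to int32_t wraps on common compilers. */
    return (int32_t)(uint32_t)wide;
}

/* In range, the sketch matches x * y; out of range, the high bits are lost. */
void mul24_demo(void) {
    assert(fits_in_24_bits(1000) && fits_in_24_bits(2000));
    assert(mul24_sketch(1000, 2000) == 1000 * 2000);

    int32_t big = (1 << 24) + 3;           /* does not fit in 24 bits */
    assert(!fits_in_24_bits(big));
    assert(mul24_sketch(big, 5) == 15);    /* only the low 24 bits (3) are used */
}
```

The masking and sign-extension steps have no counterpart in the plain x * y form, which is consistent with the advice to prefer general multiplication.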

Also use specialized built-in versions where possible. For example, when the base x in x^y is known to be ≥ 0, use powr instead of pow.

See Also

The OpenCL 2.0 C Specification at https://www.khronos.org/registry/cl/specs/opencl-2.0-openclc.pdf

For more complete information about compiler optimizations, see our Optimization Notice.