Intel® OpenCL™ Graphics Extensions

By Jeffrey McAllister, published on September 7, 2016

OpenCL Extensions available in Intel® SDK for OpenCL™ Applications

The following tables contain information about extensions to the Khronos Group OpenCL™ standard available for Intel processors.   

Notice: Not all extensions are available in all versions of the OpenCL drivers for each OS. Some features are only available on certain hardware platforms or in certain driver baselines. 
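Because availability varies by driver and hardware, applications typically check for an extension at run time before using it. The sketch below shows one way to do that check in plain C. It assumes the caller has already obtained the space-separated extension string that clGetDeviceInfo returns for CL_DEVICE_EXTENSIONS; the helper name has_extension is hypothetical and is not part of any OpenCL API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns true if `name` appears as a whole, space-delimited token in
 * `ext_list`. `ext_list` stands in for the string an application would
 * read via clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, ...).
 * has_extension is a hypothetical helper for illustration only.
 */
bool has_extension(const char *ext_list, const char *name)
{
    assert(ext_list != NULL && name != NULL);

    size_t name_len = strlen(name);
    if (name_len == 0)
        return false;

    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        /* Accept the hit only if it is bounded by spaces or the string
         * ends, so "cl_intel_subgroups" does not falsely match inside
         * "cl_intel_subgroups_short". */
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        bool ends_token = (p[name_len] == ' ') || (p[name_len] == '\0');
        if (starts_token && ends_token)
            return true;
        p++; /* keep scanning after a partial match */
    }
    return false;
}
```

A plain substring search is not enough here, because several extension names in the tables below are prefixes of others (for example cl_intel_subgroups and cl_intel_subgroups_short), which is why the helper matches whole tokens only.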

 

Preview Extensions

General info on preview extensions: /content/www/us/en/develop/articles/intel-opencl-experimental-features.html

 

Media Extensions

These extensions enable video processing applications to access hardware features in Intel processors.

Extension Name Description Supported HW Notes
cl_intel_device_side_avc_motion_estimation

See below.  Provides programmers with a macroblock-level interface to the motion estimation functionality of the media sampler in Intel graphics processors. The extension specifies low-level built-in functions, callable from OpenCL kernels, that evaluate AVC motion estimation operations. Its functionality is a superset of what the host-side motion estimation extensions provide.

Sample: Intro to Device Side AVC Motion Estimation

Gen9

(new in Linux SRB4, not yet available for Windows.)

cl_intel_advanced_motion_estimation

cl_intel_motion_estimation

 

Provides a frame-level interface implemented as built-in kernels to accelerate motion estimation operations. Supports AVC block sizes, inter/intra estimation, skip checks, and motion vector costing.

Notes:

  • Version 2 of this spec was introduced in early 2016.
  • The advanced extension (cl_intel_advanced_motion_estimation) gives access to a superset of the features of the original cl_intel_motion_estimation extension.

For more info: https://software.intel.com/en-us/articles/intro-to-advanced-motion-estimation-extension-for-opencl

Motion estimation samples available in Media Server Studio samples

More info: Spec

 
cl_intel_packed_yuv

YUV data is often stored in planar form. This extension adds support for a few specific packed YUV image formats, in which luma and chroma samples are interleaved in a single plane.

More info: Spec

 
cl_intel_planar_yuv

Provides support for the Planar YUV (YCbCr) image formats.

More info: Spec

Gen9

(new in Linux SRB4, not yet available for Windows.)

cl_intel_media_block_io

Adds built-in functions for reading and writing flexible 2D regions of images. Augments the Intel vendor extensions cl_intel_subgroups and cl_intel_subgroups_short.

More info: Spec

Gen9

(new in Linux SRB4, not yet available for Windows.)

VEBox preview extensions:

  • cl_intelx_video_enhancement
  • cl_intelx_video_enhancement_color_pipeline
  • cl_intelx_video_enhancement_camera_pipeline

More info on preview features

Built-in functions to work with VEBox. 

Samples: Minimal VEBox Samples

More info: OpenCL Preview Extensions for VEBox 

Gen9

(new in Linux SRB4, not yet available for Windows.)
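To make the packed versus planar distinction in the cl_intel_packed_yuv and cl_intel_planar_yuv entries above concrete, here is a minimal host-side sketch in plain C of how samples are addressed in one layout of each kind. It assumes the common YUYV (packed) and NV12 (planar) byte layouts and, for NV12, that both planes share the same row pitch in bytes; the extension specs define which formats are actually supported. The type yuv_sample and the functions read_yuyv and read_nv12 are hypothetical names for illustration, not OpenCL API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pixel's luma and chroma values (hypothetical helper type). */
typedef struct { uint8_t y, u, v; } yuv_sample;

/*
 * Packed YUYV row: bytes repeat as Y0 U0 Y1 V0, so each pair of
 * horizontally adjacent pixels shares one U and one V sample.
 */
yuv_sample read_yuyv(const uint8_t *row, size_t x)
{
    assert(row != NULL);
    const uint8_t *pair = row + (x / 2) * 4;      /* 4 bytes per 2 pixels */
    yuv_sample s = { pair[(x % 2) * 2], pair[1], pair[3] };
    return s;
}

/*
 * Planar NV12: a full-resolution Y plane plus a separate plane of
 * interleaved U,V bytes at half resolution in both dimensions.
 * Assumes both planes use the same row pitch (in bytes).
 */
yuv_sample read_nv12(const uint8_t *y_plane, const uint8_t *uv_plane,
                     size_t pitch, size_t x, size_t y)
{
    assert(y_plane != NULL && uv_plane != NULL);
    const uint8_t *uv = uv_plane + (y / 2) * pitch + (x / 2) * 2;
    yuv_sample s = { y_plane[y * pitch + x], uv[0], uv[1] };
    return s;
}
```

In the packed case a single pointer and pitch describe the whole image, while the planar case needs one base address per plane. That difference is why the two layouts are handled as distinct image formats.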

 

Sharing Extensions

This group of extensions enables interoperability between OpenCL and other APIs using Intel GPUs.

Extension Name Description Supported HW Notes
cl_intel_simultaneous_sharing

The OpenCL 1.2 Extension Spec forbids requesting interoperability with multiple graphics APIs when a context is created with clCreateContext or clCreateContextFromType, and specifies that CL_INVALID_OPERATION be returned in such cases.

The goal of this extension is to relax the restrictions and allow simultaneous use of API combinations as supported by a given OpenCL device.

More info: Spec 

 
cl_intel_va_api_media_sharing

Linux/Android Media Sharing

More info: Spec

See /content/www/us/en/develop/articles/tutorial-opencl-interoperability-with-video-acceleration-api-on-linux-os.html

Used in Media Server Studio samples

 

cl_intel_d3d11_nv12_media_sharing

cl_intel_dx9_media_sharing

Windows sharing APIs (created before the Khronos extensions below).

See /content/www/us/en/develop/articles/d3d9-media-surface-sharing-between-intel-quick-sync-video-and-opencl-on-intel-hd-graphics.html

Used in Media Server Studio samples

More info:

d3d11 Spec

dx9 Spec

 

cl_khr_dx9_media_sharing

cl_khr_d3d10_sharing

cl_khr_d3d11_sharing

Sharing with DirectX 9, 10, and 11

/content/www/us/en/develop/articles/opencl-and-intel-media-sdk.html

More info:

dx9 Spec

d3d10 Spec

d3d11 Spec

 

 

cl_khr_gl_sharing

cl_khr_gl_msaa_sharing

cl_khr_gl_depth_images

cl_khr_gl_event

Sample: https://software.intel.com/sites/default/files/managed/2c/79/intel_ocl_ogl_interop_win.zip 
Related Pages:
/content/www/us/en/develop/articles/opencl-and-opengl-interoperability-tutorial.html

More info:

gl_sharing Spec

gl_msaa_sharing Spec

gl_depth_images Spec

gl_event Spec

 

 

 

 

Subgroups Extensions

Work-items in a subgroup can share data without going through shared local memory or using barriers. Subgroups extend the work-group concept to allow more efficient data sharing.

Extension Name Description Supported HW Notes
cl_intel_subgroups

Enables work-items in a subgroup to work together and share data without using local memory or work-group barriers. Similar to the subgroups functionality of OpenCL 2.0 (cl_khr_subgroups).

/content/www/us/en/develop/articles/sgemm-ocl-opt.html

/content/www/us/en/develop/articles/sgemm-for-intel-processor-graphics.html

/content/www/us/en/develop/articles/box-blur-filter-using-intel-subgroup-extensions-in-opencl.html

More info: Spec

 
cl_intel_required_subgroup_size

The goal of this extension is to allow programmers to optionally specify the required subgroup size for a kernel function. This information is important for the correctness of many subgroup algorithms, and in some cases the compiler may use it to generate better-optimized code.

More info: Spec

 
cl_khr_subgroups

Implementation-controlled division of a work-group into subgroups, allowing independent forward progress within the work-group. This feature was promoted to core in OpenCL 2.1.

More info: Spec

 
cl_intel_subgroups_short

Improves the performance of applications operating on 16-bit data by extending the subgroup functions described in the cl_intel_subgroups extension to support 16-bit integer data types (short and ushort).

More info: Spec

 

 

 

Other Extensions

Extension Name Description Supported HW Notes
cl_intel_accelerator

Basic accelerator support


The accelerator extension consists of a unified set of OpenCL runtime APIs to create, query, and manage the lifetime of objects which represent acceleration processors, engines, or algorithms.

 

More info: 

 

cl_intel_driver_diagnostics

This extension allows the driver to pass additional strings containing diagnostic information. The diagnostic messages can help to understand how the driver works and can provide guidance to modify an application to improve performance.

More info: 

 

 

cl_khr_3d_image_writes

Enables writes to 3D image objects

More info: Spec

 
cl_khr_byte_addressable_store

Removes the restriction on writes to built-in types smaller than 32 bits. Needed to write through a pointer, or to an array or struct element, of type char, uchar, char2, uchar2, short, ushort, or half.

More info: Spec

 
cl_khr_spir

Standard Portable Intermediate Representation (SPIR): a non-source representation of OpenCL programs.

/content/www/us/en/develop/articles/using-spir-for-fun-and-profit-with-intel-opencl-code-builder.html

More info: Spec

 
cl_khr_fp16

Half-precision floating-point

More info: Spec

 
cl_khr_fp64

IEEE-754 double-precision floating-point support

More info: Spec

 

cl_khr_global_int32_base_atomics

32-bit integer base atomic operations in global memory

More info: Spec

 
cl_khr_global_int32_extended_atomics

32-bit integer extended atomic operations in global memory

More info: Spec

 
cl_khr_icd

Access Khronos OpenCL installable client driver loader (ICD Loader)

More info: Spec

 
cl_khr_image2d_from_buffer

2D image from buffer creation support

More info: 

 

cl_khr_mipmap_image

cl_khr_mipmap_image_writes

cl_khr_mipmap_image: adds the ability to create and read mipmapped images.

cl_khr_mipmap_image_writes: adds the ability to write to mipmapped images; requires cl_khr_mipmap_image.

More info: Spec

 

 

cl_khr_depth_images  

Depth Images

More info: Spec

 
cl_khr_throttle_hints

Extension to OpenCL 2.1 API which allows the driver to implement throttling behavior. Throttling behavior is implementation specific.

More info: Spec

 
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics

base_atomics: Spec

extended_atomics: Spec

 

 

Deprecated Extensions

Extension Name | Description
cl_intel_ctz | Built-in count trailing zeroes

 


Product and Performance Information


Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804