Building and Running OpenCL™ Applications on Platforms from Intel
- Intel® SDK for OpenCL™ Applications
- Intel® Graphics drivers and OpenCL™ runtime (download)
The drivers and runtime are packaged separately from the SDK to minimize the size of the installation package. They contain what is needed to run applications. The SDK package contains the development components. There are several variations and driver versions, depending on the hardware in your configuration.
Intel® Code Builder for OpenCL™ API
Available in Three Versions:
- Microsoft Visual Studio* plugin
- Eclipse* plugin
- Standalone version
All three versions provide the ability to edit and build OpenCL code, debug OpenCL kernels running on the accelerator and collect runtime performance data. Which version you use depends on the development environment you prefer.
- New: Workflow support allowing build, execution and analysis of applications with several kernels in the project wizard
- Kernel development framework
- Syntax highlighting
- Code auto-completion
- SPIR* and SPIR-V* generation and consumption
- API debugging and call tracing
- Image and memory viewer
- Step-through kernel debugging
- API call and memory command analysis
- Kernel occupancy and latency analysis
Features and Benefits of the OpenCL™ Platform
SPIR and SPIR-V
SPIR (Standard Portable Intermediate Representation) and its successor, SPIR-V, are intermediate languages for parallel compute developed by the Khronos Group*.
To provide device independence at run time, the OpenCL platform relies on just-in-time (JIT) compilation of kernel source code for the target device. The obvious disadvantage is that shipping kernel source offers no protection for your valuable IP. The traditional alternative is to ship a precompiled kernel binary for every device that runs your application.
SPIR and SPIR-V provide you with device flexibility while protecting your IP by allowing you to ship device-independent intermediate kernel binaries, which are JIT compiled as needed.
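As a small illustration of what shipping an intermediate binary involves, the sketch below checks that a buffer begins with a plausible SPIR-V module header before it is handed to the runtime (for example through `clCreateProgramWithIL` in OpenCL 2.1 and later). The helper name `spirv_check_header` is hypothetical and not part of any SDK. The sketch assumes the module is stored in host byte order and validates only the header, not the module body.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First word of every SPIR-V module, as defined by the SPIR-V specification. */
#define SPIRV_MAGIC 0x07230203u
/* The header is five 32-bit words: magic, version, generator, bound, schema. */
#define SPIRV_HEADER_WORDS 5u

/*
 * Hypothetical helper (not an SDK function).
 * Returns 1 if 'words' starts with a SPIR-V header in host byte order,
 * 0 otherwise. On success, stores the version if the out-pointers are set.
 */
int spirv_check_header(const uint32_t *words, size_t word_count,
                       unsigned *major, unsigned *minor)
{
    assert(words != NULL);

    if (word_count < SPIRV_HEADER_WORDS || words[0] != SPIRV_MAGIC)
        return 0;

    /* Version word layout, high byte to low byte: 0, major, minor, 0. */
    if (major != NULL)
        *major = (words[1] >> 16) & 0xFFu;
    if (minor != NULL)
        *minor = (words[1] >> 8) & 0xFFu;
    return 1;
}
```

A loader could call this on the file contents and fall back to another code path (or report an error) when the check fails, rather than passing arbitrary bytes to the driver.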
Shared Virtual Memory (SVM)
SVM, a feature of the OpenCL™ 2.0 standard and later, provides a coherent and consistent shared memory space between the host and the device. In addition to guaranteeing that shared memory has the same address on the host and the device, SVM provides coherency and atomic operations (at the level the device supports), allowing safe simultaneous access to SVM allocations.
The advantage is that you can write code that uses pointer-based data structures, such as linked lists or trees, shared between the host and the device side of an OpenCL application. You’ll spend less engineering time restructuring your data, and you can create more sophisticated cooperative kernels that improve overall application performance.
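To make the "restructuring" cost concrete, here is a minimal sketch of the kind of conversion that is typically needed without SVM: a host linked list is flattened into an index-based array, because raw host pointers are meaningless inside an ordinary device buffer. The types and the helper `flatten_list` are hypothetical and chosen only for illustration. With SVM, nodes allocated through `clSVMAlloc` could in principle be traversed by the kernel through their pointers directly, so a step like this would not be required.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side list node that links through raw pointers. */
struct node {
    int value;
    struct node *next;
};

/* Pointer-free node: 'next' is an index into the same array, -1 ends the list. */
struct flat_node {
    int value;
    int next;
};

/*
 * Hypothetical helper: copies at most 'capacity' nodes from the list at
 * 'head' into 'out', replacing pointers with array indices.
 * Returns the number of nodes written.
 */
size_t flatten_list(const struct node *head, struct flat_node *out,
                    size_t capacity)
{
    assert(out != NULL || capacity == 0);

    size_t n = 0;
    for (const struct node *p = head; p != NULL && n < capacity; p = p->next) {
        out[n].value = p->value;
        /* Link to the following slot unless this is the last node copied. */
        out[n].next = (p->next != NULL && n + 1 < capacity) ? (int)(n + 1) : -1;
        n++;
    }
    return n;
}
```

The flattened array can be copied into a regular buffer, but every structural change on the host means repeating the conversion, which is the engineering overhead that shared pointers are meant to remove.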
The ability to create shared physical memory buffers, otherwise known as zero copy buffers, is a feature available in the OpenCL™ 1.2 standard and later.
Extra memory copies can hurt application performance when offloading compute to accelerators such as Intel® Graphics Technology. Zero copy buffers reduce application memory usage and improve performance by letting the host and device access the same physical memory instead of copying duplicate data between memory spaces.
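One common way to set up such a buffer is to allocate suitably aligned host memory and pass it to `clCreateBuffer` with the `CL_MEM_USE_HOST_PTR` flag. The sketch below covers only the host allocation step. The 4096-byte alignment and 64-byte size multiple are assumptions taken from commonly cited guidance for Intel® Graphics, not from this page, and the helper names are hypothetical. Whether the runtime actually avoids a copy depends on the device and driver, so check their documentation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed requirements (verify against your driver's documentation). */
#define ZC_ALIGNMENT     4096u  /* start address alignment in bytes */
#define ZC_SIZE_MULTIPLE 64u    /* buffer size granularity in bytes */

/* Rounds 'bytes' up to a multiple of ZC_ALIGNMENT. Any multiple of 4096
 * is also a multiple of 64, and it satisfies aligned_alloc's size rule. */
size_t zc_padded_size(size_t bytes)
{
    assert(bytes > 0);
    return (bytes + ZC_ALIGNMENT - 1) / ZC_ALIGNMENT * ZC_ALIGNMENT;
}

/* Hypothetical helper: allocates host memory intended for use with
 * CL_MEM_USE_HOST_PTR. Stores the padded size in '*padded' if non-NULL.
 * Release the memory with free(). Requires C11 aligned_alloc. */
void *zc_alloc(size_t bytes, size_t *padded)
{
    size_t size = zc_padded_size(bytes);
    if (padded != NULL)
        *padded = size;
    return aligned_alloc(ZC_ALIGNMENT, size);
}

/* Returns nonzero if 'ptr' and 'size' meet the assumed requirements. */
int zc_is_eligible(const void *ptr, size_t size)
{
    return ptr != NULL
        && ((uintptr_t)ptr % ZC_ALIGNMENT) == 0
        && (size % ZC_SIZE_MULTIPLE) == 0;
}
```

The pointer and padded size returned by `zc_alloc` would then be supplied to `clCreateBuffer`, and the host would typically access the data through `clEnqueueMapBuffer` rather than explicit read and write commands.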
Related Tools and Libraries
A toolkit for developing and deploying computer vision solutions on platforms from Intel, including smart cameras, robotics, office automation, and more. This SDK supports heterogeneous execution across CPU and SoC accelerators and includes tools that unleash inference performance on deep learning deployment.
Develop enterprise-grade media applications and solutions for data center, cloud and network usages. The Intel SDK for OpenCL™ Applications is bundled in this studio to help you create your own media filters running on Intel® Graphics Technology.
A cross-platform API for developing media applications. Versions are available for Embedded Linux*, Windows*, and Open Source. Get complete control over the media pipeline and programmable access to the Execution Units (EUs) and other graphics processor blocks.
Tune Intel CPU and GPU performance, multi-core scalability, bandwidth, and more. Trace inside OpenCL kernels and get details on OpenCL program activity on the GPU.
Capture and view the distribution of OpenCL program commands (kernels and memory operations) across Intel CPUs and GPUs.