Robert Ioffe describes a consistent series of optimizations that improve OpenCL™ kernel performance on Intel® Iris™ Graphics or Intel® Iris™ Pro Graphics using Intel® SDK for OpenCL™ Applications 2013. The optimizations are general in nature, so developers can apply them to a broad set of OpenCL™ kernels. After studying the optimizations presented here, developers will understand the fundamentals of using Intel® Iris™ Graphics for compute purposes. The discussion starts with a simple Modulate kernel.
Robert Ioffe is a Technical Consulting Engineer at Intel’s Software and Solutions Group. He is an expert in OpenCL programming and OpenCL workload optimization on Intel Iris and Intel Iris Pro Graphics, with deep knowledge of Intel Graphics hardware. He was heavily involved in Khronos standards work, focusing on prototyping the latest features and making sure they run well on Intel architecture. Most recently he has been prototyping the Nested Parallelism (enqueue_kernel functions) feature of OpenCL 2.0 and has written a number of samples that demonstrate Nested Parallelism functionality, including GPU-Quicksort for OpenCL 2.0. He also recorded and released two Optimizing Simple OpenCL Kernels videos and a third video on Nested Parallelism.
You might also be interested in the following:
A laptop or a workstation with the 4th Generation Intel® Core™ Processor
OpenCL™ Drivers and Runtimes for Intel® Architecture
Intel® SDK for OpenCL™ Applications
Intel® VTune™ Amplifier XE 2013
For more info on Intel Processor Graphics