When: Saturday, June 13th, 2015 (9am to 12:30pm)
Where: At PLDI in Portland, OR, USA
Since the 2nd Generation Intel® Core Processors, most processors come with Intel® HD Graphics on-chip, providing high-performance graphics without the added cost and power of a discrete graphics add-in card. When an application does not rely on or fully utilize display and graphics, as in an embedded application, the GPU can be used to offload parallel computations, taking advantage of both thread parallelism and vector instructions for data parallelism. A workload can either run completely on the GPU or be partitioned between the CPU and the GPU. One advantage of on-chip processor graphics is that physical memory is shared between the CPU and the GPU, so the two can share data without copy overhead.
The C/C++ Cilk™ Plus parallel programming model, with small extensions for offload, is used to take advantage of the computational capabilities of the GPU. The C++ compiler handles all aspects of both host-side and target-side compilation, including setting up the data to be shared and the kernel parameters. This provides an easy heterogeneous programming model that is similar between host and target and lets the programmer focus on the algorithm and performance.
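As a rough illustration of the model described above, the following sketch shows what an offloaded data-parallel loop might look like. It is not taken from the tutorial material. The `USE_GFX_OFFLOAD` macro, the `PARALLEL_FOR` macro, and the `vector_add` function are hypothetical names introduced for this example. The `#pragma offload target(gfx)` line with its `pin` clause and the `_Cilk_for` keyword follow the Intel compiler's offload syntax as understood here, and should be checked against the compiler's User and Reference Guide. Without the hypothetical flag, the code falls back to an ordinary serial loop on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// USE_GFX_OFFLOAD is a hypothetical build flag for this sketch, not a
// compiler-defined macro. Define it only when building with a compiler
// that supports Cilk Plus offload to processor graphics.
#ifdef USE_GFX_OFFLOAD
#define PARALLEL_FOR _Cilk_for
#else
#define PARALLEL_FOR for   // plain serial loop on the host
#endif

// Element-wise vector addition: out[i] = a[i] + b[i].
std::vector<float> vector_add(const std::vector<float>& a,
                              const std::vector<float>& b) {
    assert(a.size() == b.size());
    const int n = static_cast<int>(a.size());
    std::vector<float> out(a.size());

    // Raw pointers, since the offload clause below names arrays by pointer.
    const float* pa = a.data();
    const float* pb = b.data();
    float* pout = out.data();

#ifdef USE_GFX_OFFLOAD
    // Assumed syntax: pin the arrays so that CPU and GPU share the same
    // physical memory instead of copying the data.
#pragma offload target(gfx) pin(pa, pb, pout : length(n))
#endif
    PARALLEL_FOR (int i = 0; i < n; ++i) {
        pout[i] = pa[i] + pb[i];
    }
    return out;
}
```

Calling `vector_add` on two equally sized vectors returns their element-wise sum. The loop body is the same for the host and the target, which is the similarity between the two sides that the programming model aims for.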
The tutorial will consist of a presentation on the programming model and demonstrations of selected examples. It will give insight into parallel programming using Cilk™ Plus, offload to processor graphics, and tuning and debugging. Specific examples will show how to port an application to take advantage of offload.
Knud J. Kirkegaard is a Principal Engineer in Intel’s Mobile Computing and Compilers group. He currently works as an architect on the C/C++ compiler with Cilk™ Plus, supporting heterogeneous computing on Intel® Graphics Technology. Since he joined Intel, he has worked on scalar optimizations, interprocedural optimizations, profile-guided optimizations, and Cilk™ Plus. His current interests are in parallel computing, heterogeneous computing, optimized C++ code, and compiler architecture. He has an M.S. degree in Information and Control Systems Engineering from Aalborg University, Denmark. His e-mail is email@example.com.
Anoop Madhusoodhanan Prabha is a Software Engineer in Intel's Software and Services Group. He currently works as a Technical Consulting Engineer on the C/C++ compiler support team. He joined Intel on August 1st, 2009, and has since worked on optimizing various customer applications by enabling multi-threading and vectorization. He has experience working with OpenMP, Cilk™ Plus, TBB, and CUDA, among others. His current interests are in processor and GPU architecture, heterogeneous computing, and high-performance computing. He has an M.S. degree in Electrical Engineering from the State University of New York at Buffalo, USA. His e-mail is firstname.lastname@example.org.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804