As I continue to explore different Ultrabook capabilities, in this blog I decided to look into a powerful threading and performance optimization tool for C/C++, .NET, and FORTRAN developers who nee
The General Matrix Multiply (GEMM) sample demonstrates how to use an OpenCL™ device efficiently to perform a general matrix multiply operation on two dense square matrices. The sample primarily targets devices with cache memory: Intel® Xeon Phi™ and Intel® Architecture CPU devices.
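To illustrate why cache memory matters for this kind of workload, here is a minimal CPU-side sketch of a tiled matrix multiply in plain Python. It is not the sample's OpenCL kernel: the function name `gemm_tiled`, the row-major flat-list layout, and the default tile size are all assumptions made for illustration. The idea is only that working on small blocks keeps the touched parts of both input matrices close together in memory.

```python
def gemm_tiled(a, b, n, tile=4):
    """Return C = A * B for two n x n matrices stored as row-major flat lists.

    The loops walk the matrices tile by tile so that each block of A and B
    is reused while it is still likely to be in cache (hypothetical sketch,
    not the code shipped with the sample).
    """
    c = [0.0] * (n * n)
    for i0 in range(0, n, tile):
        for k0 in range(0, n, tile):
            for j0 in range(0, n, tile):
                # Multiply one tile of A by one tile of B.
                for i in range(i0, min(i0 + tile, n)):
                    for k in range(k0, min(k0 + tile, n)):
                        a_ik = a[i * n + k]
                        for j in range(j0, min(j0 + tile, n)):
                            c[i * n + j] += a_ik * b[k * n + j]
    return c
```

For example, `gemm_tiled([1, 2, 3, 4], [5, 6, 7, 8], 2)` multiplies two 2 x 2 matrices. A real OpenCL version would map the tile loops onto work-groups and pick the tile size to fit the target device.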
This sample illustrates the basic principles of how to work simultaneously with OpenCL™ devices on both CPU and Intel® Processor Graphics.
Features / Description
The sample demonstrates how to implement an efficient median filter with the OpenCL™ standard. This implementation relies on auto-vectorization performed by the Intel® SDK for OpenCL Applications compiler.
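As a reference for what a median filter computes, the following is a minimal scalar sketch in plain Python. The 3 x 3 window, the clamped borders, the single-channel flat-list image layout, and the name `median3x3` are assumptions for illustration; the teaser does not state the sample's window size or border handling. Each output pixel is independent of the others, which is what makes this kind of filter a natural fit for a vectorizing compiler.

```python
def median3x3(img, width, height):
    """Apply a 3x3 median filter to a single-channel image.

    `img` is a row-major flat list of pixel values. Coordinates outside the
    image are clamped to the nearest edge pixel (an assumed border policy).
    """
    out = [0] * (width * height)
    for y in range(height):
        for x in range(width):
            window = []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    window.append(img[yy * width + xx])
            window.sort()
            out[y * width + x] = window[4]  # middle of the 9 sorted values
    return out
```

A single bright outlier in an otherwise flat image is removed entirely, since it can never be the middle value of its neighborhood.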
Demonstrates how to implement an efficient sorting routine with the OpenCL™ technology that operates on an arbitrary input array of integer values. The sample uses the properties of bitonic sequences and the principles of sorting networks, and enables efficient SIMD-style parallelism through OpenCL vector data types. The code is designed to work well on modern CPUs.
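The sorting-network idea can be sketched in a few lines of plain Python. This is a sequential illustration of a bitonic network, not the sample's vectorized OpenCL kernel; the name `bitonic_sort` and the power-of-two length restriction are assumptions of this sketch. The key property is that the compare-and-swap pattern depends only on the indices, never on the data, so every comparison within one inner pass could run in parallel.

```python
def bitonic_sort(values):
    """Return a sorted copy of `values` using a bitonic sorting network.

    Assumes len(values) is a power of two (a restriction of this sketch).
    """
    n = len(values)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    data = list(values)
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # distance between compared elements
            for i in range(n):  # each i is independent within this pass
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (data[i] > data[partner]) == ascending:
                        data[i], data[partner] = data[partner], data[i]
            j //= 2
        k *= 2
    return data
```

In an OpenCL version, the `for i` loop would typically become the set of work-items, with vector data types handling several compare-and-swap pairs at once.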
The Simple Optimizations sample demonstrates simple ways of measuring the performance of OpenCL™ kernels in an application. It describes the basics of profiling and important caveats such as the need for a dedicated “warming” run. It also demonstrates several simple optimizations; some of these are rather CPU-specific (like mapping buffers), while others are more general (like using relaxed-math). The...
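The "warming" run caveat can be shown with a small, generic timing helper in plain Python. This is a sketch of the measurement pattern only: the name `time_kernel` and its `warmup` and `runs` parameters are hypothetical, and it times an ordinary Python callable rather than an OpenCL kernel. The first invocations are executed but not measured, so one-time costs such as compilation or cache population do not distort the reported numbers.

```python
import time


def time_kernel(fn, *args, warmup=1, runs=5):
    """Time `fn(*args)` and return (best, mean) wall-clock seconds.

    The first `warmup` calls are discarded so that one-time setup costs
    are excluded from the measurement.
    """
    for _ in range(warmup):
        fn(*args)  # untimed warming run
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    return min(samples), sum(samples) / len(samples)
```

When timing real OpenCL kernels, the timed region also needs to wait for the device to finish the work, because enqueuing a kernel normally returns before it has executed.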
The HDR Tone Mapping for Post Processing sample features multi-device support, specifically the simultaneous use of CPU and Intel® Processor Graphics OpenCL™ devices.
For more complete information about compiler optimizations, see our Optimization Notice.