Article

OpenCL* Device Fission for CPU Performance

Download OpenCL* Device Fission for CPU Performance [PDF 762KB]  

Last updated on 05/31/2019 - 14:10
Article

OpenCL* Design and Programming Guide for the Intel® Xeon Phi™ Coprocessor

About this document
Authored by admin Last updated on 07/06/2019 - 16:30
Article

Workshop: Optimizing OpenCL applications for Intel® Xeon Phi™ Coprocessor

The Intel® Xeon Phi™ Coprocessor is designed for highly parallel applications that demand high performance.

Authored by Arik Narkis (Intel) Last updated on 07/06/2019 - 16:30
Article

OpenCL™ Platform/Device Capabilities Viewer Sample

Download for Windows*

Last updated on 05/31/2019 - 14:10
Article

General Matrix Multiply Sample

The General Matrix Multiply (GEMM) sample demonstrates how to use an OpenCL™ device efficiently to perform a general matrix multiply operation on two dense square matrices. The primary target devices for this sample are devices with cache memory: Intel® Xeon Phi™ and Intel® Architecture CPU devices.
Last updated on 05/31/2019 - 14:40
Article

Intel® SDK for OpenCL* Applications XE 2013 Release Notes

Intel® SDK for OpenCL* Applications XE 2013 Release Notes Content
Authored by Jeffrey Mott (Intel) Last updated on 05/31/2019 - 14:10
Article

Median Filter

The sample demonstrates how to implement an efficient median filter with the OpenCL™ standard. This implementation relies on auto-vectorization performed by the Intel® SDK for OpenCL* Applications compiler.
Last updated on 05/31/2019 - 14:40
Article

HDR Rendering with God Rays Using OpenCL™ Technology

This sample demonstrates a CPU-optimized implementation of the God Rays effect, showing how to implement calculation kernels in OpenCL™ C99, parallelize the kernels by running several work-groups concurrently, and organize data exchange between the host and the OpenCL device.
Last updated on 05/31/2019 - 14:10
Article

Bitonic Sorting

This sample demonstrates how to implement an efficient sorting routine with the OpenCL™ technology that operates on an arbitrary input array of integer values. The sample uses the properties of bitonic sequences and the principles of sorting networks, and enables efficient SIMD-style parallelism through OpenCL vector data types. The code is designed to work well on modern CPUs.
Last updated on 05/31/2019 - 14:40
Article

Simple Optimizations of OpenCL™ Code

The Simple Optimizations sample demonstrates simple ways of measuring the performance of OpenCL™ kernels in an application. It describes the basics of profiling and important caveats, such as having a dedicated “warming” run. It also demonstrates several simple optimizations; some are rather CPU-specific (like mapping buffers), while others are more general (like using relaxed math). The...
Last updated on 05/31/2019 - 14:10