Code Sample

Introduction to the Heterogeneous Streams Library

To use all available resources efficiently in task-concurrent applications on heterogeneous platforms, designers need to understand the memory architecture, the thread utilization on each platform, and the pipeline for offloading the workload to different platforms. To relieve designers of the burden of implementing the necessary infrastructure, the Heterogeneous Streaming (hStreams) library provides a set of well-defined APIs to support a task-based parallelism model on heterogeneous platforms.
  • Professional
  • Professors
  • Students
  • Linux*
  • Microsoft Windows* 10
  • Modern Code
  • Server
  • Intel® Many Integrated Core Architecture
  • Intel® Software Guard Extensions Part 4: Design an Enclave

    In part 4 of this tutorial series, you'll create the project infrastructure necessary to integrate the enclave into your application. Source code is included.
  • Microsoft Windows* 10
  • Microsoft Windows* 8.x
  • Business Client
  • Windows*
  • Intel® Software Guard Extensions (Intel® SGX)
  • Security
  • Fine-Tuning Vectorization and Memory Traffic on Intel® Xeon Phi™ Coprocessors: LU Decomposition of Small Matrices

    This article discusses common techniques for fine-tuning the performance of automatically vectorized loops in applications for Intel® Xeon Phi™ coprocessors. These techniques include strength reduction, regularizing the vectorization pattern, data alignment and aligned-data hints, and pointer disambiguation.
  • Professional
  • Professors
  • Students
  • Linux*
  • Modern Code
  • C/C++
  • Advanced
  • Intermediate
  • Intel® Streaming SIMD Extensions
  • Education
  • Intel® Many Integrated Core Architecture
  • Parallel Computing
  • Vectorization
  • Multithreaded Transposition of Square Matrices with Common Code for Intel® Xeon® Processors and Intel® Xeon Phi™ Coprocessors

    In-place matrix transposition, a standard operation in linear algebra, is a memory bandwidth-bound operation. The theoretical maximum performance of transposition is the memory copy bandwidth. However, due to non-contiguous memory access in the transposition operation, practical performance is usually lower. The ratio of the transposition rate to the memory copy bandwidth is a measure of the transposition algorithm efficiency.
  • Professional
  • Professors
  • Students
  • Linux*
  • Modern Code
  • C/C++
  • Intermediate
  • OpenMP*
  • Intel® Many Integrated Core Architecture
  • OpenGL* Performance Tips: Textures Have Better Rendering Performance than Images

    This article discusses why using a texture rather than an image can improve OpenGL rendering performance. It is accompanied by a simple C++ application that alternates between using a texture and using an image. The purpose of this application is to show the effect on rendering performance (milliseconds per frame) when using the two techniques.
  • Game Development
  • OpenGL*
  • API without Secrets: Introduction to Vulkan* Part 0: Preface

    Follow Pawel L. to learn about Intel's graphics driver support for the emerging Vulkan* graphics API. He'll be providing several tutorials along with GitHub source code.
  • Linux*
  • Microsoft Windows* 10
  • Microsoft Windows* 8.x
  • Android*
  • Game Development
  • Windows*
  • C/C++
  • Beginner
  • Intermediate
  • Vulkan API
  • api
  • OpenGL*
  • Graphics