Article

How to Manipulate Data Structure to Optimize Memory Use on 32-Bit Intel® Architecture

Demonstrates how a Structure of Arrays organization of data makes it easier to get a performance benefit from SIMD.
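As a rough illustration of the idea in this summary, the sketch below contrasts an Array of Structures layout with a Structure of Arrays layout. The type and function names (`PointAoS`, `PointsSoA`, `scale_x_aos`, `scale_x_soa`) are hypothetical and are not taken from the article. Whether a given compiler actually vectorizes either loop depends on the compiler and its flags.

```cpp
#include <cstddef>
#include <vector>

// Array of Structures (AoS): the x, y, z of one point sit next to each other,
// so consecutive x values are sizeof(PointAoS) bytes apart in memory.
struct PointAoS {
    float x, y, z;
};

// Structure of Arrays (SoA): all x values are contiguous, then all y, then all z.
// (Hypothetical layout chosen for illustration.)
struct PointsSoA {
    std::vector<float> x, y, z;
    explicit PointsSoA(std::size_t n) : x(n), y(n), z(n) {}
};

// Scales only the x coordinate. With AoS the accesses to x are strided,
// which usually makes SIMD loads and stores less efficient.
void scale_x_aos(std::vector<PointAoS>& pts, float s) {
    for (auto& p : pts) {
        p.x *= s;
    }
}

// Same operation on SoA data. The loop walks one contiguous float array
// with unit stride, the access pattern SIMD units handle best.
void scale_x_soa(PointsSoA& pts, float s) {
    float* x = pts.x.data();
    const std::size_t n = pts.x.size();
    for (std::size_t i = 0; i < n; ++i) {
        x[i] *= s;
    }
}
```

Both functions compute the same result; the difference is only in how the data is laid out and therefore how it is fetched.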
Authored by admin Last updated on 02/05/2019 - 10:23
Article

Optimizing Game Architectures with Intel® Threading Building Blocks

This article describes techniques for optimizing game architectures that already have some threading and shows how Intel TBB can enhance the performance of these architectures with a relatively small amount of coding effort.
Authored by Last updated on 08/01/2019 - 09:30
Article

Using Intel® MKL in your Python* program

Some instructions and a simple example showing how to call Intel® MKL from Python*.
Authored by TODD R. (Intel) Last updated on 12/10/2018 - 13:29
Article

Distributed Memory Coarray Programs with Process Pinning

This article describes a method to compile and run a distributed memory coarray program using Intel® Parallel Studio XE Cluster Edition for Linux*. An example using Linux* is presented.
Authored by Kenneth Craft (Intel) Last updated on 07/08/2019 - 14:58
Article

Performance Benefits of Half Precision Floats

Half precision floats are 16-bit floating-point numbers, which are half the size of traditional 32-bit single precision floats, and have lower precision and smaller range.

Authored by Patrick Konsor (Intel) Last updated on 07/10/2019 - 17:05
Article

Threading Intel® Integrated Performance Primitives Image Resize with Intel® Threading Building Blocks

Threading Intel® IPP Image Resize with Intel® TBB.pdf (157.18 KB)
Authored by Jeffrey M. (Intel) Last updated on 07/31/2019 - 15:05
Blog post

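The Fastest Inter-Thread Data Exchange Algorithm That Effectively Avoids Lock Contention -- TwoQueues

Key points to keep in mind when sharing data between threads:

1. Lock contention: minimize how long and how often threads contend for a lock.

2. Memory: reuse already-allocated memory where possible to reduce the number of allocations and frees. Prefer contiguous memory, and reduce the amount of memory held in shared use.

Simple multithreaded data exchange, Scheme A:

Define a list, and lock and unlock around every operation on the list.

Simple simulation code: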
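
The author's code is not included in this excerpt. As a minimal sketch of Scheme A (a list with a lock taken around every operation), assuming C++11 and a hypothetical class name `LockedList` that does not come from the original post, it might look like this:

```cpp
#include <list>
#include <mutex>
#include <utility>

// Hypothetical illustration of Scheme A: every access to the shared list
// takes the same mutex, so producers and consumers contend on each call.
template <typename T>
class LockedList {
public:
    // Appends a value to the back of the list under the lock.
    void push(T value) {
        std::lock_guard<std::mutex> guard(mutex_);
        items_.push_back(std::move(value));
    }

    // Removes the front value into `out` under the lock.
    // Returns false (leaving `out` untouched) if the list is empty.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> guard(mutex_);
        if (items_.empty()) {
            return false;
        }
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;     // guards items_
    std::list<T> items_;   // node-based list: each push allocates a node
};
```

This sketch shows both costs listed in the points above: each `push` and `try_pop` acquires the lock, and `std::list` allocates and frees a node per element.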

Authored by Last updated on 07/04/2019 - 21:30
Article

Measuring performance in HPC

This is the first article in a series about High Performance Computing with the Intel® Xeon Phi™ coprocessor.

Authored by Last updated on 07/06/2019 - 16:10
Article

Vectorizing Loops with Calls to User-Defined External Functions

Introduction

Authored by Anoop M. (Intel) Last updated on 12/12/2018 - 18:00
Article

Improve Intel® MKL Performance for Small Problems: The Use of MKL_DIRECT_CALL

One of the big new features introduced in the Intel® Math Kernel Library (Intel® MKL) 11.2 is the greatly improved performance for small problem sizes.

Authored by Zhang, Zhang (Intel) Last updated on 07/07/2019 - 10:35