Article

OpenMP-Related Tips

Compiler Methodology for Intel® MIC Architecture

Authored by AmandaS (Intel) Last updated on 03/21/2019 - 12:00
Article

Applying the Ant Colony Optimization (ACO) Algorithm to Transportation Network Scaling

In this article, an OpenMP*-based implementation of the Ant Colony Optimization algorithm is analyzed for bottlenecks with Intel® VTune™ Amplifier XE 2016. Improvements using hybrid MPI-OpenMP and Intel® Threading Building Blocks are then introduced to achieve efficient scaling across a four-socket system based on the Intel® Xeon® processor E7-8890 v4.
Authored by Sunny G. (Intel) Last updated on 07/05/2019 - 19:13
Article

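Efficient Parallelization

Efficient Parallelization documentation

Compiler Methodology for Intel® Many Integrated Core Architecture

Efficient Parallelization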

Authored by Ronald W Green (Blackbelt) Last updated on 03/21/2019 - 12:00
Article

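Using the Compiler Option -opt-threads-per-core to Schedule for 1-4 Threads per Core

Compiler Methodology for Intel® MIC Architecture

Using a compiler option to schedule for 1-4 threads per core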

Authored by AmandaS (Intel) Last updated on 03/21/2019 - 12:00
Forum topic

Disable kmp_affinity warnings

Hi,

Authored by aurora Last updated on 06/10/2014 - 09:44
Article

Use Tasks (Instead of Threads)

Tasks are a lightweight alternative to threads that provide faster startup and shutdown times, better load balancing, an efficient use of available resources, and a higher level of abstraction.
Authored by admin Last updated on 07/05/2019 - 09:51
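
As a rough illustration of the task-based style described in the entry above, the sketch below expresses a recursive computation as OpenMP tasks rather than as explicitly managed threads. It is not taken from the linked article: the names `fib_tasks` and `fib` and the serial cutoff of 20 are assumptions chosen for illustration, and an OpenMP-capable compiler is assumed (for example GCC with `-fopenmp`). Without OpenMP enabled, the pragmas are ignored and the code runs serially with the same result.

```cpp
#include <cstdint>

// Hypothetical helper: each recursive call above the cutoff becomes a task
// that the OpenMP runtime schedules onto whichever thread is free.
std::uint64_t fib_tasks(int n) {
    if (n < 2) return static_cast<std::uint64_t>(n);
    // Assumed cutoff: below it, recurse serially so tasks do not get too small.
    if (n < 20) return fib_tasks(n - 1) + fib_tasks(n - 2);

    std::uint64_t a = 0, b = 0;
    #pragma omp task shared(a)
    a = fib_tasks(n - 1);
    #pragma omp task shared(b)
    b = fib_tasks(n - 2);
    #pragma omp taskwait  // wait for both child tasks before combining
    return a + b;
}

// Hypothetical entry point: one thread creates the root of the task tree,
// the other threads in the team pick up the generated tasks.
std::uint64_t fib(int n) {
    std::uint64_t result = 0;
    #pragma omp parallel
    {
        #pragma omp single
        result = fib_tasks(n);
    }
    return result;
}
```

Calling `fib(30)` returns 832040 whether or not OpenMP is enabled. The runtime, not the caller, decides how the tasks map to threads.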
Article

Loop Modifications to Enhance Data-Parallel Performance

When confronted with nested loops, the granularity of the computations that are assigned to threads will directly affect performance. Loop transformations such as splitting and merging nested loops can make parallelization easier and more productive.
Authored by admin Last updated on 07/05/2019 - 14:48
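
One way to picture the loop merging mentioned in the entry above is the sketch below, which is not taken from the linked article. It compares a nested loop, where only the outer iterations can be distributed across threads, with a merged loop that exposes every element as one iteration. The function names `scale_nested` and `scale_merged` and the row-major layout are assumptions for illustration. An OpenMP-capable compiler is assumed; without it the pragmas are ignored and both versions run serially.

```cpp
#include <cstddef>
#include <vector>

// Nested form: with a small `rows`, only `rows` iterations are available
// to distribute, so some threads may receive no work at all.
void scale_nested(std::vector<double>& m, std::size_t rows, std::size_t cols, double f) {
    const long r = static_cast<long>(rows);
    #pragma omp parallel for
    for (long i = 0; i < r; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            m[static_cast<std::size_t>(i) * cols + j] *= f;
        }
    }
}

// Merged form: the two loops are fused into one loop over rows * cols
// elements, giving the runtime many more iterations to balance.
void scale_merged(std::vector<double>& m, std::size_t rows, std::size_t cols, double f) {
    const long total = static_cast<long>(rows * cols);
    #pragma omp parallel for
    for (long k = 0; k < total; ++k) {
        m[static_cast<std::size_t>(k)] *= f;
    }
}
```

Both functions produce the same matrix. Which one performs better depends on the loop bounds and the thread count, so the choice should be measured rather than assumed.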
Article

Designing Artificial Intelligence for Games (Part 4)

The gaming industry has seen great strides in game complexity recently. Game developers are challenged to create increasingly compelling games. This series explores important Artificial Intelligence (AI) concepts and how to optimize them for multi-core.
Authored by admin Last updated on 01/24/2018 - 15:35
Article

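Intel® Xeon Phi™ Coprocessor (Code-Named "Knights Landing"): Application Readiness

To make applications ready for future Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors (code-named Knights Landing), developers mainly look to improve their workloads in two areas:

Vectorization/code generation
Thread parallelism

This article focuses on vectorization/code generation and also introduces some useful tools and resources for thread parallelism.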

Authored by Last updated on 07/06/2019 - 16:40
Article

Granularity and Parallel Performance

One key to attaining good parallel performance is choosing the right granularity for the application. Granularity is the amount of real work in the parallel task. If granularity is too fine, then performance can suffer from communication overhead.
Authored by admin Last updated on 07/05/2019 - 19:53
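
The effect of granularity described in the entry above can be explored with a sketch like the following, which is not taken from the linked article. It assumes an OpenMP-capable compiler and uses the chunk size of a dynamically scheduled loop as the granularity knob. The function name `chunked_sum` and its `chunk` parameter are hypothetical. A chunk of 1 hands out single iterations, so scheduling overhead is at its highest. A larger chunk gives each thread more real work per scheduling step, at the cost of coarser load balancing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sums `data`, handing out `chunk` iterations at a time.
long long chunked_sum(const std::vector<int>& data, int chunk) {
    assert(chunk > 0);  // the chunk size of a schedule clause must be positive
    long long sum = 0;
    const long n = static_cast<long>(data.size());
    // Each thread repeatedly grabs `chunk` iterations; partial sums are
    // combined by the reduction clause at the end of the loop.
    #pragma omp parallel for schedule(dynamic, chunk) reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        sum += data[static_cast<std::size_t>(i)];
    }
    return sum;
}
```

The result is the same for every chunk size. Only the run time changes, so timing the call with several values of `chunk` is a simple way to see where the overhead starts to dominate.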