Blog post

Mixing MPI and OpenMP*, Hugging Hardware and Dealing With It

This morning, I took a rare break and attended a tutorial at Supercomputing. I'm glad I did.

Authored by James R. (Blackbelt) Last updated on 10/15/2019 - 19:21
Article

Finding Non-trivial Opportunities for Parallelism in Existing Serial Code using OpenMP*

By Erik Niemeyer (Intel Corporation) and Ken Strandberg (Catlow Communications*)

Authored by Erik Niemeyer (Intel) Last updated on 10/15/2019 - 16:40
Article

Process and Thread Affinity for Intel® Xeon Phi™ Processors

The Intel® MPI Library and OpenMP* runtime libraries can create affinities between processes or threads and hardware resources. This affinity keeps an MPI process or OpenMP thread from migrating to a different hardware resource, which can have a dramatic effect on the execution speed of a program.
Authored by Gregg S. (Intel) Last updated on 10/15/2019 - 15:30
Article

Scale-Up Implementation of a Transportation Network Using Ant Colony Optimization (ACO)

In this article, an OpenMP*-based implementation of the Ant Colony Optimization algorithm is analyzed for bottlenecks with Intel® VTune™ Amplifier XE 2016. Improvements using hybrid MPI-OpenMP and Intel® Threading Building Blocks are then introduced to achieve efficient scaling across a four-socket system based on the Intel® Xeon® processor E7-8890 v4.
Authored by Sunny G. (Intel) Last updated on 10/15/2019 - 16:40
Article

Scale-Up Implementation of a Transportation Network Using Ant Colony Optimization (ACO)

In this article, an OpenMP*-based implementation of the Ant Colony Optimization algorithm is analyzed for bottlenecks with Intel® VTune™ Amplifier XE 2016. Improvements using hybrid MPI-OpenMP and Intel® Threading Building Blocks are then introduced to achieve efficient scaling across a four-socket system based on the Intel® Xeon® processor E7-8890 v4.
Authored by Sunny G. (Intel) Last updated on 10/15/2019 - 16:40
File Wrapper

Parallel Universe Magazine - Issue 16, November 2013

Authored by admin Last updated on 12/12/2018 - 18:08
Video

Part 9: Distributed-Memory Parallelism and MPI

In the previous episodes of this chapter, we learned how to use vectorization to parallelize calculations across vector lanes in each core.

Authored by admin Last updated on 10/15/2019 - 15:50
Video

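Part 9: Distributed-Memory Parallelism and MPI

In the previous episodes of this chapter, we learned how to use vectorization to parallelize calculations across vector lanes in each core. We then discussed how to use OpenMP to scale applications across the cores of each processor or coprocessor. Next, in episode 4.9, the final episode of this chapter, we will look at the next level of parallelism: scaling across multiple compute devices and across multiple compute nodes in a cluster environment.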

Authored by Last updated on 10/15/2019 - 15:50
File Wrapper

Parallel Universe Magazine - Issue 18, June 2014

Authored by admin Last updated on 05/16/2019 - 11:39
File Wrapper

Parallel Universe Magazine - Issue 24, March 2016

Authored by admin Last updated on 12/12/2018 - 18:08