Blog post

Hybrid MPI and OpenMP* Model

In high-performance computing (HPC), parallel programming techniques such as MPI, OpenMP*, one-sided communication, SHMEM, and Fortran coarrays are widely used. This blog is part of a series that introduces these techniques, with a focus on how to use them on the Intel® Xeon Phi™ coprocessor. This first blog discusses the main usage of the hybrid MPI/OpenMP model.
Authored by Nguyen, Loc Q (Intel) Last updated on 07/06/2019 - 17:10
Article

Finding Non-trivial Opportunities for Parallelism in Existing Serial Code using OpenMP*

By Erik Niemeyer (Intel Corporation) and Ken Strandberg (Catlow Communications*)

Authored by Erik Niemeyer (Intel) Last updated on 07/06/2019 - 16:49
Article

Books - Message Passing Interface (MPI)

This article looks at several books that introduce developers to the topics of Message Passing Interface (MPI), parallel programming, and OpenMP*.
Authored by Mike P. (Intel) Last updated on 12/12/2018 - 18:00
Article

Code Sample: Exploring MPI for Python* on Intel® Xeon Phi™ Processor

Learn how to write an MPI program in Python* and take advantage of Intel® multicore architectures using OpenMP threads and Intel® AVX-512 instructions.
Authored by Nguyen, Loc Q (Intel) Last updated on 07/06/2019 - 16:30
Article

Process and Thread Affinity for Intel® Xeon Phi™ Processors

The Intel® MPI Library and OpenMP* runtime libraries can create affinities between processes or threads and hardware resources. This affinity keeps an MPI process or OpenMP thread from migrating to a different hardware resource, which can have a dramatic effect on the execution speed of a program.
Authored by Gregg S. (Intel) Last updated on 07/29/2019 - 08:05
Article

Scale-Up Implementation of a Transportation Network Using Ant Colony Optimization (ACO)

This article analyzes an OpenMP*-based implementation of the Ant Colony Optimization algorithm for bottlenecks using Intel® VTune™ Amplifier XE 2016, then introduces improvements based on hybrid MPI-OpenMP and Intel® Threading Building Blocks to achieve efficient scaling across a four-socket system based on the Intel® Xeon® processor E7-8890 v4.
Authored by Sunny G. (Intel) Last updated on 07/05/2019 - 19:13
Article

Scale-Up Implementation of a Transportation Network Using Ant Colony Optimization (ACO)

This article analyzes an OpenMP*-based implementation of the Ant Colony Optimization algorithm for bottlenecks using Intel® VTune™ Amplifier XE 2016, then introduces improvements based on hybrid MPI-OpenMP and Intel® Threading Building Blocks to achieve efficient scaling across a four-socket system based on the Intel® Xeon® processor E7-8890 v4.
Authored by Sunny G. (Intel) Last updated on 07/05/2019 - 19:10
File Wrapper

Parallel Universe Magazine - Issue 16, November 2013

Authored by admin Last updated on 12/12/2018 - 18:08
Video

Part 9: Distributed-Memory Parallelism and MPI

In the previous episodes of this chapter, we learned how to use vectorization to parallelize calculations across vector lanes in each core.

Authored by admin Last updated on 03/21/2019 - 12:00
Video

Part 9: Distributed-Memory Parallelism and MPI

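In the previous episodes of this chapter, we learned how to use vectorization to parallelize calculations across vector lanes in each core. We then discussed how to use OpenMP to scale applications across the cores of each processor or coprocessor. Next, in episode 4.9, the final episode of this chapter, we look at the next level of parallelism: scaling across multiple compute devices and across multiple compute nodes in a cluster environment.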

Authored by Last updated on 04/26/2019 - 04:06