Blog post

Hybrid MPI and OpenMP* Model

In the High Performance Computing (HPC) area, parallel computing techniques such as MPI, OpenMP*, one-sided communications, shmem, and Fortran coarray are widely utilized. This blog is part of a series that will introduce the use of these techniques, especially how to use them on the Intel® Xeon Phi™ coprocessor. This first blog discusses the main usage of the hybrid MPI/OpenMP model.
Authored by Nguyen, Loc Q (Intel) Last updated on 07/06/2019 - 17:10
Article

Finding Non-trivial Opportunities for Parallelism in Existing Serial Code using OpenMP*

By Erik Niemeyer (Intel Corporation) and Ken Strandberg (Catlow Communications*)

Authored by Erik Niemeyer (Intel) Last updated on 07/06/2019 - 16:49
Article

Books - Message Passing Interface (MPI)

This article looks at several books that introduce developers to the topics of Message Passing Interface (MPI), parallel programming, and OpenMP*.
Authored by Mike P. (Intel) Last updated on 12/12/2018 - 18:00
Article

Code Sample: Exploring MPI for Python* on Intel® Xeon Phi™ Processor

Learn how to write an MPI program in Python*, and take advantage of Intel® multicore architectures using OpenMP threads and Intel® AVX512 instructions.
Authored by Nguyen, Loc Q (Intel) Last updated on 07/06/2019 - 16:30
Article

Process and Thread Affinity for Intel® Xeon Phi™ Processors

The Intel® MPI Library and OpenMP* runtime libraries can create affinities between processes or threads, and hardware resources. This affinity keeps an MPI process or OpenMP thread from migrating to a different hardware resource, which can have a dramatic effect on the execution speed of a program.
Authored by Gregg S. (Intel) Last updated on 07/29/2019 - 08:05
Article

Scale-Up Implementation of a Transportation Network Using Ant Colony Optimization (ACO)

In this article an OpenMP* based implementation of the Ant Colony Optimization algorithm was analyzed for bottlenecks with Intel® VTune™ Amplifier XE 2016 together with improvements using hybrid MPI-OpenMP and Intel® Threading Building Blocks were introduced to achieve efficient scaling across a four-socket Intel® Xeon® processor E7-8890 v4 processor-based system.
Authored by Sunny G. (Intel) Last updated on 07/05/2019 - 19:10
File Wrapper

Parallel Universe Magazine - Issue 16, November 2013

Authored by admin Last updated on 12/12/2018 - 18:08
Video

Part 9: Distributed-Memory Parallelism and MPI

In the previous episodes of this chapter, we learned how to use vectorization to parallelize calculations across vector lanes in each core.

Authored by admin Last updated on 03/21/2019 - 12:00
File Wrapper

Parallel Universe Magazine - Issue 18, June 2014

Authored by admin Last updated on 05/16/2019 - 11:39
File Wrapper

Parallel Universe Magazine - Issue 24, March 2016

Authored by admin Last updated on 12/12/2018 - 18:08