In the High Performance Computing (HPC) area, parallel programming techniques such as MPI, OpenMP*, one-sided communication, SHMEM, and Fortran coarrays are widely used. This blog is part of a series introducing these techniques, in particular how to use them on the Intel® Xeon Phi™ coprocessor. This first blog discusses the main usage of the hybrid MPI/OpenMP model.
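In the hybrid model, each MPI rank typically hosts an OpenMP thread team that handles the compute-heavy work inside the rank. The sketch below is a minimal skeleton of that structure, not the blog's own example; it assumes an MPI implementation that provides MPI_THREAD_FUNNELED support and is built with an MPI compiler wrapper with OpenMP enabled (for example, mpiicc -qopenmp or mpicc -fopenmp).

```c
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank, nranks;

    /* Request FUNNELED support: only the main thread makes MPI calls. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* Each rank spawns an OpenMP team for its local computation. */
    #pragma omp parallel
    {
        #pragma omp single
        printf("Rank %d of %d uses %d OpenMP threads\n",
               rank, nranks, omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}
```

A typical launch sets the team size per rank via the environment, for example OMP_NUM_THREADS=4 mpirun -n 2 ./a.out, so the number of ranks times the threads per rank matches the cores available.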
By Erik Niemeyer (Intel Corporation) and Ken Strandberg (Catlow Communications*)
This article looks at several books that introduce developers to the topics of Message Passing Interface (MPI), parallel programming, and OpenMP*.
Learn how to write an MPI program in Python* and take advantage of Intel® multicore architectures using OpenMP* threads and Intel® AVX-512 instructions.
The Intel® MPI Library and OpenMP* runtime libraries can create affinities between MPI processes or OpenMP threads and hardware resources. This affinity keeps a process or thread from migrating to a different hardware resource, which can have a dramatic effect on the execution speed of a program.
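As a small illustration, the OpenMP program below only reports which logical CPU each thread is running on, so the effect of pinning can be observed. It is a sketch under a few assumptions: sched_getcpu() is Linux-specific, and the pinning itself is controlled from the environment, for example OMP_PLACES=cores and OMP_PROC_BIND=close for the OpenMP runtime, or I_MPI_PIN_DOMAIN=omp to give each Intel MPI rank its own domain of cores for its threads.

```c
#define _GNU_SOURCE
#include <omp.h>
#include <sched.h>
#include <stdio.h>

/* Print where each OpenMP thread lands.  Run it twice, with and
 * without OMP_PLACES=cores OMP_PROC_BIND=close, to see the binding. */
int main(void)
{
    #pragma omp parallel
    {
        printf("thread %d of %d on logical CPU %d\n",
               omp_get_thread_num(), omp_get_num_threads(),
               sched_getcpu());   /* Linux-only query of the current CPU */
    }
    return 0;
}
```

With pinning enabled the reported CPU stays stable across repeated runs of the parallel region; without it, threads may wander between cores as the scheduler sees fit.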
In this article, an OpenMP*-based implementation of the Ant Colony Optimization algorithm was analyzed for bottlenecks with Intel® VTune™ Amplifier XE 2016, and improvements using hybrid MPI/OpenMP and Intel® Threading Building Blocks were introduced to achieve efficient scaling across a four-socket system based on the Intel® Xeon® processor E7-8890 v4.
In the previous episodes of this chapter, we learned how to use vectorization to parallelize calculations across vector lanes in each core.
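For reference, the snippet below is a minimal example of the kind of loop such vectorization targets; the #pragma omp simd form shown here is one common way to express it and is not necessarily the construct used in the chapter. It assumes a compiler with OpenMP SIMD support enabled (for example -fopenmp-simd with GCC/Clang or -qopenmp-simd with the Intel compiler).

```c
#include <stdio.h>
#include <stddef.h>

/* Dot product: the simd pragma asks the compiler to spread the loop
 * iterations across the vector lanes of each core. */
static double dot(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    #pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

int main(void)
{
    double a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    double b[8] = {8, 7, 6, 5, 4, 3, 2, 1};
    printf("dot = %f\n", dot(a, b, 8));   /* expected 120.000000 */
    return 0;
}
```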