Learn how to write an MPI program in Python*, and take advantage of Intel® multicore architectures using OpenMP threads and Intel® AVX512 instructions.
In this article an OpenMP* based implementation of the Ant Colony Optimization algorithm was analyzed for bottlenecks with Intel® VTune™ Amplifier XE 2016 together with improvements using hybrid MPI-OpenMP and Intel® Threading Building Blocks were introduced to achieve efficient scaling across a four-socket Intel® Xeon® processor E7-8890 v4 processor-based system.
There are two principal methods of parallel computing: distributed memory computing and shared memory computing. As more processor cores are dedicated to large clusters solving scientific and engineering problems, hybrid programming techniques combining the best of distributed and shared memory programs are becoming more popular.
LAMMPS is an open-source software package that simulates classical molecular dynamics. As it supports many energy models and simulation options, its versatility has made it a popular choice. It was first developed at Sandia National Laboratories to use large-scale parallel computation.
This case study examines the situation where the problem decomposition is the same for threading as it is for Message Passing Interface* (MPI); that is, the threading parallelism is elevated to the same level as MPI parallelism.
This document is designed to help users get started writing code and running MPI applications using the Intel® MPI Library on a development platform that includes the Intel® Xeon Phi™ processor.
Modern high performance computers are built with a combination of resources including:
Code Sample included: Learn how to use MPI-3 shared memory feature using the corresponding APIs on the Intel® Xeon Phi™ processor.