Learn more about an in-depth analysis of code modernization performance conducted by optimizing original CPU code and re-running tests on the latest GPU/CPU hardware.
Modern high performance computers are built with a combination of resources including:
A look into the contents of the two "Pearls" books, edited by James Reinders and Jim Jeffers. These books contain a collection of examples of code modernization.
The Intel® MPI Library and OpenMP* runtime libraries can create affinities between processes or threads, and hardware resources. This affinity keeps an MPI process or OpenMP thread from migrating to a different hardware resource, which can have a dramatic effect on the execution speed of a program.
LAMMPS is an open-source software package that simulates classical molecular dynamics. As it supports many energy models and simulation options, its versatility has made it a popular choice. It was first developed at Sandia National Laboratories to use large-scale parallel computation.
There are two principal methods of parallel computing: distributed memory computing and shared memory computing. As more processor cores are dedicated to large clusters solving scientific and engineering problems, hybrid programming techniques combining the best of distributed and shared memory programs are becoming more popular.
In this article an OpenMP* based implementation of the Ant Colony Optimization algorithm was analyzed for bottlenecks with Intel® VTune™ Amplifier XE 2016 together with improvements using hybrid MPI-OpenMP and Intel® Threading Building Blocks were introduced to achieve efficient scaling across a four-socket Intel® Xeon® processor E7-8890 v4 processor-based system.
Learn how to write an MPI program in Python*, and take advantage of Intel® multicore architectures using OpenMP threads and Intel® AVX512 instructions.
Code Sample included: Learn how to use MPI-3 shared memory feature using the corresponding APIs on the Intel® Xeon Phi™ processor.