Learn how to write an MPI program in Python*, and take advantage of Intel® multicore architectures using OpenMP threads and Intel® AVX512 instructions.
In this article an OpenMP* based implementation of the Ant Colony Optimization algorithm was analyzed for bottlenecks with Intel® VTune™ Amplifier XE 2016 together with improvements using hybrid MPI-OpenMP and Intel® Threading Building Blocks were introduced to achieve efficient scaling across a four-socket Intel® Xeon® processor E7-8890 v4 processor-based system.
Fine-Tuning Optimization for a Numerical Method for Hyperbolic Equations Applied to a Porous Media Flow Problem with Intel® ToolsThis paper presents an analysis for potential optimization for a Godunov-type semi-discrete central scheme, for a particular hyperbolic problem implicated in porous media flow, using OpenMP* and Intel® Advanced Vector Extensions 2.
Are you working with a hybrid program that just isn't performing? Do you feel like your application is on life support?
For general advice on beginning use of Intel MKL visit the Getting Started Tutorial
Matrix multiplication (MM) of two matrices is one of the most fundamental operations in linear algebra. The algorithm for MM is very simple, it could be easily implemented in any programming language. This paper shows that performance significantly improves when different optimization techniques are applied.
Intel(R) MKL 2017 Update 4 ( September 2017)
This case study examines the situation where the problem decomposition is the same for threading as it is for Message Passing Interface* (MPI); that is, the threading parallelism is elevated to the same level as MPI parallelism.