How to configure OpenMP in the Intel IPP library to maximize multi-threaded performance of the Intel IPP primitives.
OpenMP 5.0 is the next version of the OpenMP specification, which is expected to be officially released in 2018.
This article analyzes an OpenMP*-based implementation of the Ant Colony Optimization algorithm for bottlenecks with Intel® VTune™ Amplifier XE 2016, then introduces improvements using hybrid MPI-OpenMP and Intel® Threading Building Blocks to achieve efficient scaling across a four-socket system based on the Intel® Xeon® processor E7-8890 v4.
Fine-Tuning Optimization for a Numerical Method for Hyperbolic Equations Applied to a Porous Media Flow Problem with Intel® Tools
This paper analyzes potential optimizations for a Godunov-type semi-discrete central scheme, applied to a particular hyperbolic problem arising in porous media flow, using OpenMP* and Intel® Advanced Vector Extensions 2.
Hybrid parallel computing architectures have the potential to speed up computationally intensive applications, but the efficient utilization of heterogeneous resources is challenging. This case study aims to describe an optimization technique applied to "ProFrager", a protein structure and function prediction tool developed at “Laboratório Nacional de Computação Científica - LNCC” (National...
This paper demonstrates a special version of Caffe* — a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC) — that is optimized for Intel® architecture.
This article explores what happens when Intel solutions support functional and logic programming languages that are regularly used for Artificial Intelligence (AI), and proposes recompiling a Prolog interpreter with the Intel® C++ Compiler and libraries in order to evaluate their contribution to logic-based AI.
Yet Another Stencil Kernel (YASK) is a framework to facilitate design exploration and tuning of HPC kernels, including vector folding, cache blocking, memory layout, loop construction, temporal wave-front blocking, and others. YASK contains a specialized source-to-source translator to convert scalar C++ stencil code to SIMD-optimized code.
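For readers unfamiliar with the term, the following is a minimal sketch of what scalar C++ stencil code can look like: one sweep of a 3-point averaging stencil over a 1D grid. It is plain C++ and does not use YASK's own input format or API; the function name `stencil_sweep` and the boundary handling are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One sweep of a 3-point averaging stencil over a 1D grid (hypothetical example).
// Each interior output point is the mean of its input neighbours and itself.
// Boundary points are copied unchanged, which is an illustrative choice.
void stencil_sweep(const std::vector<double>& in, std::vector<double>& out) {
    assert(in.size() == out.size());
    if (in.empty()) return;

    out.front() = in.front();
    out.back() = in.back();

    // Scalar loop: one grid point per iteration, the form a tool would vectorize.
    for (std::size_t i = 1; i + 1 < in.size(); ++i) {
        out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0;
    }
}
```

Loops of this shape read a fixed neighbourhood around each point, which is why techniques such as cache blocking and vector folding are relevant to them.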
Cython* is a superset of Python* that additionally supports calling C functions and declaring C types on variables and class attributes. Cython generates C extension modules, which the main Python program can load with the import statement.
Matrix multiplication (MM) of two matrices is one of the most fundamental operations in linear algebra. The algorithm for MM is very simple and can be easily implemented in any programming language. This paper shows that performance improves significantly when different optimization techniques are applied.
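As a minimal sketch of how simple the basic algorithm is, the code below shows the textbook triple loop next to a loop-interchanged variant. Loop interchange is used here only as one widely known example of an optimization; it is not necessarily among the techniques the paper evaluates. The flat row-major layout and the names `Matrix`, `matmul_ijk`, and `matmul_ikj` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Square n x n matrix stored row-major in a flat vector (illustrative layout).
using Matrix = std::vector<double>;

// Textbook i-j-k ordering: the inner loop reads b with stride n.
Matrix matmul_ijk(const Matrix& a, const Matrix& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
    return c;
}

// Loop interchange (i-k-j): the inner loop reads b and writes c contiguously,
// which is usually friendlier to caches and to compiler vectorization.
Matrix matmul_ikj(const Matrix& a, const Matrix& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            const double aik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    return c;
}
```

Both functions compute the same product; only the memory access pattern differs. Any actual speedup depends on the matrix size, compiler, and hardware, so it should be measured rather than assumed.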