Blog post

Is it time to replace the TOP 500 list?

A new TOP500 list was released at SC12. Is this one-dimensional benchmark still relevant in today's diverse HPC fields of computation?
Authored by Clay B. (Blackbelt) Last updated on 07/06/2019 - 17:00
Article

Using Intel® MPI Library and Intel® Xeon Phi™ Coprocessor Tips

1. Check prerequisites: each host and each Intel® Xeon Phi™ coprocessor should have a unique IP address across a cluster;
Authored by Dmitry S. (Intel) Last updated on 06/14/2019 - 13:39
Blog post

Applying Intel® Threading Building Blocks Observers for Thread Affinity on Intel® Xeon Phi™ Coprocessors

In spite of the fact that the Intel® Threading Building Blocks (Intel® TBB) library [1] [2] provides high-level task based parallelism intended to hide sof

Authored by Alex (Intel) Last updated on 08/01/2019 - 09:30
Blog post

Optimized Pseudo Random Number Generators with AVX2

Intel® Math Kernel Library includes powerful and versatile random number generators that have been optimized to take full advantage of Intel

Authored by gaston-hillar (Blackbelt) Last updated on 07/06/2019 - 17:00
Blog post

Optimizing Big Data processing with Haswell 256-bit Integer SIMD instructions

Big Data requires processing huge amounts of data. Intel Advanced Vector Extensions 2 (AVX2) promoted most Intel AVX 128-bit integer SIMD instructions to 256 bits.

Authored by gaston-hillar (Blackbelt) Last updated on 07/06/2019 - 17:00
Blog post

Improving MPI Communication between the Intel® Xeon® Host and Intel® Xeon Phi™

MPI Symmetric Mode is widely used in systems equipped with Intel® Xeon Phi™ coprocessors.

Authored by Nguyen, Loc Q (Intel) Last updated on 07/06/2019 - 17:10
Article

Recognize and Measure Vectorization Performance

Get a background on vectorization and learn different techniques to evaluate its effectiveness.
Authored by David M. Last updated on 07/06/2019 - 16:40
Article

Scale-Up Implementation of a Transportation Network Using Ant Colony Optimization (ACO)

In this article, an OpenMP*-based implementation of the Ant Colony Optimization algorithm is analyzed for bottlenecks with Intel® VTune™ Amplifier XE 2016. Improvements using hybrid MPI-OpenMP and Intel® Threading Building Blocks are then introduced to achieve efficient scaling across a four-socket system based on the Intel® Xeon® processor E7-8890 v4.
Authored by Sunny G. (Intel) Last updated on 07/05/2019 - 19:10
Article

Performance of Classic Matrix Multiplication Algorithm on Intel® Xeon Phi™ Processor System

Matrix multiplication (MM) of two matrices is one of the most fundamental operations in linear algebra. The algorithm for MM is very simple and can be easily implemented in any programming language. This paper shows that performance improves significantly when different optimization techniques are applied.
Authored by Last updated on 06/14/2019 - 11:50