Article

Optimization Techniques for the Intel® MIC Architecture: Part 1 of 3

Part one of this three-part series focuses on thread parallelism and race conditions, and discusses using mutexes in OpenMP* to resolve race conditions.
Authored by Mike P. (Intel) Last updated on 10/15/2019 - 16:40
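For readers who have not met a race condition before, here is a minimal sketch of the idea, written in Python rather than OpenMP. It is not code from the article: Python's `threading.Lock` stands in for an OpenMP mutex, and the function name and parameters are made up for illustration. Several threads increment one shared counter, and the lock ensures that only one thread at a time performs the read-modify-write step.

```python
import threading


def count_with_lock(num_threads=4, increments=10_000):
    """Increment a shared counter from several threads, guarded by a mutex.

    Hypothetical helper for illustration only; threading.Lock is used here
    as a stand-in for an OpenMP mutex.
    """
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # "counter += 1" is a read-modify-write sequence. Holding the
            # lock keeps two threads from interleaving inside it.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

With the lock held around every update, the result is always `num_threads * increments`. Without it, updates may be lost, depending on the interpreter and on timing.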
Article

Books - High Performance Parallelism Pearls

A look into the contents of the two "Pearls" books, edited by James Reinders and Jim Jeffers. These books contain a collection of examples of code modernization.
Authored by Mike P. (Intel) Last updated on 09/30/2019 - 17:30
Article

Hybrid Parallelism: Parallel Distributed Memory and Shared Memory Computing

There are two principal methods of parallel computing: distributed memory computing and shared memory computing. As more processor cores are dedicated to large clusters solving scientific and engineering problems, hybrid programming techniques combining the best of distributed and shared memory programs are becoming more popular.
Authored by David M. Last updated on 10/15/2019 - 16:40
Article

Choosing the right threading framework

This is the second article in a series about high-performance computing with the Intel Xeon Phi.

Last updated on 10/15/2019 - 16:40
Article

Caffe* Optimized for Intel® Architecture: Applying Modern Code Techniques

This paper demonstrates a special version of Caffe* — a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC) — that is optimized for Intel® architecture.
Last updated on 10/15/2019 - 15:30
Blog post

N-Body Simulation Project at Cal Poly

The goal of the N-Body problem is to predict the motion of a set of n objects interacting with each other by some force, e.g., the gravitational force. N-Body simulations are used in particle simulations such as astrophysical and molecular dynamics simulations. There are a number of approaches for solving the N-Body problem, such as the Barnes-Hut algorithm, the Fast Multipole method, the...
Authored by Nguyen, Loc Q (Intel) Last updated on 09/30/2019 - 16:50
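As a rough illustration of the problem described above, the simplest approach is direct summation: for each body, add up the gravitational pull of every other body. The sketch below is not taken from the Cal Poly project and is not the Barnes-Hut or Fast Multipole method. The function name and the `g` and `softening` parameters are assumptions chosen for this example.

```python
import math


def accelerations(positions, masses, g=1.0, softening=0.0):
    """Direct-summation gravitational accelerations, O(n^2) in the body count.

    positions: list of (x, y, z) tuples; masses: list of floats.
    g and softening are illustrative parameters, not from the original post.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a body exerts no force on itself
            # Vector pointing from body i to body j.
            dx = [positions[j][k] - positions[i][k] for k in range(3)]
            dist_sq = sum(d * d for d in dx) + softening ** 2
            inv_dist_cubed = 1.0 / (dist_sq * math.sqrt(dist_sq))
            for k in range(3):
                # a_i += G * m_j * r_ij / |r_ij|^3
                acc[i][k] += g * masses[j] * dx[k] * inv_dist_cubed
    return acc
```

A simulation step would use these accelerations to update velocities and positions. The nested loop over all pairs is what makes direct summation expensive for large n, which is the motivation for tree-based and multipole methods.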
Article

Case Study: Optimized Code for Neural Cell Simulations

One of the Intel® Modern Code Developer Challenge winners, Daniel Falguera, describes many of the optimizations he implemented and why some didn't work.
Last updated on 10/03/2019 - 07:55
Article

Thread Parallelism in Cython*

Cython* is a superset of Python* that additionally supports calling C functions and declaring C types on variables and class attributes. Cython generates C extension modules, which the main Python program can load with the import statement.
Authored by Nguyen, Loc Q (Intel) Last updated on 10/15/2019 - 16:40
Blog post

A Guide to Optimization Techniques for the Intel® MIC Architecture

A 3-part educational series on Optimization Techniques for the Intel® MIC Architecture is provided by Colfax Research. The series focuses on select topics on optimization of applications for Intel’s multi-core and manycore architectures (Intel® Xeon® processors and Intel® Xeon Phi™ processors).
Authored by Iman S. (Intel) Last updated on 10/15/2019 - 15:50
Article

Optimizing Memory Bandwidth on Stream Triad

Authored by Karthik Raman (Intel) Last updated on 10/03/2019 - 09:13