Part one of this three-part series focuses on thread parallelism and race conditions, and discusses using mutexes in OpenMP* to resolve race conditions.
A look into the contents of the two "Pearls" books, edited by James Reinders and Jim Jeffers. These books contain a collection of examples of code modernization.
There are two principal methods of parallel computing: distributed memory computing and shared memory computing. As more processor cores are dedicated to large clusters solving scientific and engineering problems, hybrid programming techniques combining the best of distributed and shared memory programs are becoming more popular.
This is the second article in a series of articles about High Performance Computing with the Intel Xeon Phi.
This paper demonstrates a special version of Caffe* — a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC) — that is optimized for Intel® architecture.
The goal of the N-Body problem is to predict the motion of a set of n objects interacting with each other by some force, e.g. the gravitational force. N-Body simulations have been used in particles simulation such as astrophysical and molecular dynamics simulations. There are a number of approaches for solving the N-Body problem, such as the Barnes-Hut algorithm, the Fast Multipole method, the...
Yet Another Stencil Kernel (YASK), is a framework to facilitate design exploration and tuning of HPC kernels including vector folding, cache blocking, memory layout, loop construction, temporal wave-front blocking, and others.YASK contains a specialized source-to-source translator to convert scalar C++ stencil code to SIMD-optimized code.
One of the Intel® Modern Code Developer Challenge winners, Daniel Falguera, describes many of the optimizations he implemented and why some didn't work.
Cython* is a superset of Python* that additionally supports C functions and C types on variable and class attributes. Cython generates C extension modules, which can be used by the main Python program using the import statement.
A 3-part educational series on Optimization Techniques for the Intel® MIC Architecture is provided by Colfax Research. The series focuses on select topics on optimization of applications for Intel’s multi-core and manycore architectures (Intel® Xeon® processors and Intel® Xeon Phi™ processors).