This article looks at several books that introduce developers to the topics of Message Passing Interface (MPI), parallel programming, and OpenMP*.
In the past couple of years I've noticed a trend to "re-invent" technology or re-brand old ideas and concepts from previous computing generations.
This paper demonstrates a special version of Caffe* — a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC) — that is optimized for Intel® architecture.
One of the Intel® Modern Code Developer Challenge winners, Daniel Falguera, describes many of the optimizations he implemented and why some didn't work.
Cython* is a superset of Python* that additionally supports C functions and C types on variable and class attributes. Cython generates C extension modules, which can be used by the main Python program using the import statement.
The goal of the Intel® PCC at the Hartree Centre is to enable UK academic and industrial codes to exploit the parallel and energy...
Use these parallel programming resources and books with your Intel® Xeon® processor and Intel® Xeon Phi™ processor family.
Matrix multiplication (MM) of two matrices is one of the most fundamental operations in linear algebra. The algorithm for MM is very simple and can be easily implemented in any programming language. This paper shows that performance improves significantly when different optimization techniques are applied.
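As a rough illustration of how simple the basic algorithm is, and of one common optimization, the sketch below multiplies two square row-major matrices in C. The function names `matmul_naive` and `matmul_ikj` are hypothetical, and loop interchange is chosen here only as an example; the blurb does not say which techniques the paper applies.

```c
#include <stddef.h>

/* Textbook i-j-k ordering: c = a * b for n x n row-major matrices.
 * The inner loop walks b with stride n, which tends to be cache-unfriendly. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}

/* Loop-interchanged i-k-j ordering (an illustrative choice, not necessarily
 * the paper's): the inner loop now reads b and updates c with unit stride,
 * which is usually easier for a compiler to vectorize. */
void matmul_ikj(size_t n, const double *a, const double *b, double *c)
{
    /* The output is accumulated in place, so clear it first. */
    for (size_t i = 0; i < n * n; i++) {
        c[i] = 0.0;
    }
    for (size_t i = 0; i < n; i++) {
        for (size_t k = 0; k < n; k++) {
            const double aik = a[i * n + k];
            for (size_t j = 0; j < n; j++) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
}
```

Both functions compute the same product; any speed difference depends on matrix size, compiler, and hardware, so it should be measured rather than assumed.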
This case study examines the situation where the problem decomposition is the same for threading as it is for Message Passing Interface* (MPI); that is, the threading parallelism is elevated to the same level as MPI parallelism.
As I mentioned in my previous post about writing vectorized reduction code with Intel vector intrinsics, that part of the code was just the finishing touch on a loop computing the squared difference of complex values.
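For readers who have not seen the earlier post, a plain scalar version of such a loop might look like the sketch below. It is only a reference under stated assumptions: the function name `sum_sq_diff` is hypothetical, the complex values are assumed to be stored as interleaved (real, imaginary) `float` pairs, and no intrinsics are used, so it does not reproduce the post's actual code.

```c
#include <stddef.h>

/* Returns the sum over i of |x[i] - y[i]|^2 for n complex values.
 * Assumed layout: x and y each hold 2*n floats as (re, im) pairs. */
double sum_sq_diff(size_t n, const float *x, const float *y)
{
    double acc = 0.0;  /* the reduction variable */
    for (size_t i = 0; i < n; i++) {
        const float dre = x[2 * i] - y[2 * i];          /* real-part difference */
        const float dim = x[2 * i + 1] - y[2 * i + 1];  /* imaginary-part difference */
        acc += (double)dre * dre + (double)dim * dim;
    }
    return acc;
}
```

A vectorized variant would typically keep several partial sums in a vector register and combine them once the loop finishes, which is the reduction step the text refers to.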