This is going to be a short blog as there is no new material to discuss, just a simple extension of the earlier blogs (see
This paper demonstrates a special version of Caffe* — a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC) — that is optimized for Intel® architecture.
See how the new Intel® Advanced Vector Extensions 512CD and the Intel AVX512F subsets (available in the Intel® Xeon Phi processor and in future Intel Xeon processors) lets the compiler automatically generate vector code with no changes to the code.
This case study examines the situation where the problem decomposition is the same for threading as it is for Message Passing Interface* (MPI); that is, the threading parallelism is elevated to the same level as MPI parallelism.
Cython* is a superset of Python* that additionally supports C functions and C types on variable and class attributes. Cython generates C extension modules, which can be used by the main Python program using the import statement.
Matrix multiplication (MM) of two matrices is one of the most fundamental operations in linear algebra. The algorithm for MM is very simple, it could be easily implemented in any programming language. This paper shows that performance significantly improves when different optimization techniques are applied.
This blog post is part of a series that describes my summer school project at CERN openlab.
At the Autodesk University event in Las Vegas, November 14-16, civil and commercial/industrial designers and manufacturers who use Autodesk software came together to see The Future of Making Things.