A Matrix Multiplication Routine that Updates Only the Upper or Lower Triangular Part of the Result Matrix


Intel® MKL provides the general-purpose BLAS* matrix multiply routines ?GEMM defined as follows:

Optimization of Data Read/Write in a Parallel Application

(This work was done by Vivek Lingegowda during his internship at Intel.)

Go Parallel

This is the first post in a series about parallel programming with

Improving Averaging Filter Performance Using Intel® Cilk™ Plus

Intel® Cilk™ Plus is an extension to the C and C++ languages that supports data and task parallelism. It provides three new keywords to i

Introduction to OpenMP* on YouTube*

Tim Mattson (Intel) has authored an extensive series of excellent videos as an introduction to OpenMP*.

Reduce Boilerplate Code in Parallelized Loops with C++11 Lambda Expressions

Parallelize loops with Intel® Threading Building Blocks, using the Intel® C++ Compiler's support for lambda expressions.

How to Install the Python* Version of Intel® Data Analytics Acceleration Library (Intel® DAAL)

Intel® Data Analytics Acceleration Library (Intel® DAAL) is a software solution that offers building blocks covering all the stages of data analytics, from preprocessing to decision making. The beta version of Intel DAAL 2017 provides support for the Python* language.
