Tips to measure the performance of Intel® MKL with small matrix sizes

Authored by Ying H (Intel)
Ignore the time taken by the first Intel® MKL call when measuring performance. The first call carries extra overhead from buffer allocation and thread initialization, so excluding it gives more consistent timings for small problems.

Last updated on 09/24/2013 - 20:43
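
The warm-up advice in the tip above can be sketched in C as follows. This is a minimal illustration, not Intel's benchmark code: `kernel_fn`, `matmul_args`, `naive_matmul`, and `time_after_warmup` are hypothetical names introduced here, and the naive multiply is only a stand-in so the sketch compiles without MKL. In a real measurement the kernel would wrap an MKL call such as `cblas_dgemm`.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type for the routine being timed. */
typedef void (*kernel_fn)(void *ctx);

/* Arguments for the stand-in kernel: square n x n row-major matrices. */
typedef struct {
    int n;
    const double *a;
    const double *b;
    double *c;
} matmul_args;

/* Naive c = a * b. Stand-in only; a real benchmark would call MKL here. */
void naive_matmul(void *ctx)
{
    matmul_args *m = (matmul_args *)ctx;
    for (int i = 0; i < m->n; ++i) {
        for (int j = 0; j < m->n; ++j) {
            double sum = 0.0;
            for (int p = 0; p < m->n; ++p)
                sum += m->a[(size_t)i * m->n + p] * m->b[(size_t)p * m->n + j];
            m->c[(size_t)i * m->n + j] = sum;
        }
    }
}

/* Wall-clock time in seconds, using the C11 timespec_get function. */
static double wall_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Runs one untimed warm-up call, then returns the average wall-clock
   seconds per call over `reps` timed calls. */
double time_after_warmup(kernel_fn fn, void *ctx, int reps)
{
    assert(reps > 0);
    fn(ctx);                      /* warm-up call: excluded from the timing */
    double start = wall_seconds();
    for (int r = 0; r < reps; ++r)
        fn(ctx);
    return (wall_seconds() - start) / reps;
}
```

Averaging over many repetitions matters for small matrices because a single call may finish faster than the timer's resolution.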

A Matrix Multiplication Routine that Updates Only the Upper or Lower Triangular Part of the Result Matrix

Authored by Zhang Z (Intel)
Background

Intel® MKL provides the general-purpose BLAS* matrix multiply routines ?GEMM, defined as follows:
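
In standard BLAS, ?GEMM computes C := alpha*op(A)*op(B) + beta*C, where op(X) is X or its transpose (or conjugate transpose). As a rough illustration of what restricting that update to one triangle of C means, here is a naive reference sketch in C. The function name `ref_gemm_triangular`, its argument order, and the row-major layout are assumptions made for this example; it is not an MKL routine, and it ignores the op() options.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine (not an MKL API).
   Computes C := alpha*A*B + beta*C for row-major A (n x k), B (k x n),
   and C (n x n), but writes only the upper triangle (upper != 0) or the
   lower triangle (upper == 0) of C, diagonal included. Entries in the
   other triangle are left untouched. */
void ref_gemm_triangular(int upper, int n, int k, double alpha,
                         const double *a, const double *b,
                         double beta, double *c)
{
    assert(n >= 0 && k >= 0);
    for (int i = 0; i < n; ++i) {
        /* Column range for row i within the selected triangle. */
        int j_begin = upper ? i : 0;
        int j_end = upper ? n : i + 1;
        for (int j = j_begin; j < j_end; ++j) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p)
                sum += a[(size_t)i * k + p] * b[(size_t)p * n + j];
            c[(size_t)i * n + j] = alpha * sum + beta * c[(size_t)i * n + j];
        }
    }
}
```

Skipping one triangle roughly halves the arithmetic when the result is known to be symmetric, which is the usual motivation for this kind of routine.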

Last updated on 09/06/2013 - 18:26