Background
Intel® MKL provides the general-purpose BLAS* matrix multiply routines ?GEMM, defined as follows:
C := alpha*op(A)*op(B) + beta*C
where alpha and beta are scalars, op(A) is an m-by-k matrix, op(B) is a k-by-n matrix, and C is an m-by-n matrix. op(X) is either X, X^T (the transpose), or X^H (the conjugate transpose).
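To make the definition concrete, here is a minimal reference sketch of the same update in plain C. It is not Intel MKL code and says nothing about how MKL implements ?GEMM. The name ref_dgemm, the row-major layout, and the integer transpose flags are assumptions made for illustration. It handles real double-precision data only, for which X^H is the same as X^T.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine (not an Intel MKL API).
 * Computes C := alpha*op(A)*op(B) + beta*C for real doubles stored row-major.
 * trans_a / trans_b: 0 means op(X) = X, nonzero means op(X) = X^T.
 * lda, ldb, ldc are the row strides of A, B, and C as stored in memory. */
void ref_dgemm(int trans_a, int trans_b,
               size_t m, size_t n, size_t k,
               double alpha, const double *a, size_t lda,
               const double *b, size_t ldb,
               double beta, double *c, size_t ldc)
{
    assert(a != NULL && b != NULL && c != NULL);

    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p) {
                /* Element (i, p) of op(A) and element (p, j) of op(B). */
                double a_ip = trans_a ? a[p * lda + i] : a[i * lda + p];
                double b_pj = trans_b ? b[j * ldb + p] : b[p * ldb + j];
                acc += a_ip * b_pj;
            }
            /* When beta is zero, the old value of C is not read, so
             * uninitialized or NaN entries in C do not leak into the result. */
            if (beta == 0.0)
                c[i * ldc + j] = alpha * acc;
            else
                c[i * ldc + j] = alpha * acc + beta * c[i * ldc + j];
        }
    }
}
```

With Intel MKL itself, the same double-precision operation is requested through the CBLAS interface as cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, k, alpha, A, lda, B, ldb, beta, C, ldc), with CblasTrans selecting the transposed form of an operand.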
