A Matrix Multiplication Routine that Updates Only the Upper or Lower Triangular Part of the Result Matrix
Background
Intel® MKL provides the general-purpose BLAS* matrix multiply routines ?GEMM, defined as follows:
(This work was done by Vivek Lingegowda during his internship at Intel.)
This is the first post in a series about parallel programming with
Code size optimization is a key factor in embedded systems, where reducing code size is often critical, even at the cost of application speed.
Unlike many of my recent blogs, this series is about power management in general. At the very end of the series, I’ll write specifically about the Intel® Xeon Phi™ coprocessor.
Power management policy has evolved over the years.
How about the future? Have we reached the pinnacle of power management?
One of the big new features introduced in the Intel® Math Kernel Library (Intel® MKL) 11.2 is the greatly improved performance for small problem sizes.
Tim Mattson (Intel) has authored an extensive series of excellent videos as an introduction to OpenMP*.