Memory Management Optimizations on the Intel® Xeon Phi™ Coprocessor Using Abstract Vector Register Selection, _mm_malloc, mmap, and Prefetching

Read about software performance optimization of a non-library DGEMM implementation executing in native mode on the Intel® Xeon Phi™ coprocessor under the Linux* OS.
Authored by Steve H. (Intel) Last updated on 07/18/2017 - 10:50

Intel® MKL 11.3.3 patch

Two recently discovered limitations of Intel® Math Kernel Library (Intel® MKL) 11.3 Update 3 are listed below.

Authored by Gennady F. (Intel) Last updated on 06/07/2017 - 12:05

Intel(R) Math Kernel Library - Introducing Vectorized Compact Routines

Authored by Gennady F. (Intel) Last updated on 09/12/2017 - 01:26

Tips to Measure the Performance of Matrix Multiplication Using Intel® MKL

Intel® MKL provides highly optimized and extensively threaded general matrix-matrix multiplication (GEMM) functions. In this article, we explain how to design and measure performance tests using Intel® MKL SGEMM, and outline seven tips to help developers run such tests and quickly evaluate the floating-point computing capability (FLOPS) of a given processor.
Authored by Ying H. (Intel) Last updated on 01/15/2018 - 19:58

Optimize Matrix Operations using the Intel® Math Kernel Library (Intel® MKL)

Video tutorial illustrating performance optimization using the Intel® Math Kernel Library (Intel® MKL). The video calls the DGEMM routine and compares its performance against a triply nested loop.
Authored by Markus W. (Intel) Last updated on 02/09/2018 - 12:28