Memory Management Optimizations on the Intel® Xeon Phi™ Coprocessor Using Abstract Vector Register Selection, _mm_malloc, mmap, and Prefetching
Read about software performance optimization for an implementation of a non-library version of DGEMM executing in native mode on the Intel® Xeon Phi™ coprocessor running Linux* OS.
Two limitations were recently discovered in Intel® Math Kernel Library (Intel® MKL) 11.3 Update 3; they are listed below.
Intel® MKL provides highly optimized and extensively threaded general matrix-matrix multiplication (GEMM) functions. In this article, we explain how to design a performance test and measure performance using Intel® MKL SGEMM, and outline 7 tips to help developers run the test and quickly evaluate the floating-point computing capability (FLOPS) of a specified processor.
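As a rough illustration of what such a measurement involves, the sketch below counts the floating-point operations of a matrix product as 2·m·n·k (one multiply and one add per inner-product term) and divides by the average wall-clock time per call. Everything named here is an assumption for illustration: `naive_sgemm` is a hypothetical stand-in for the routine under test (in a real measurement it would be replaced by the Intel® MKL SGEMM call), and the single warm-up call and averaging over `reps` runs are common benchmarking practice rather than the article's own tips.

```c
#define _POSIX_C_SOURCE 199309L  /* exposes clock_gettime under strict C modes */
#include <stddef.h>
#include <time.h>

/* FLOP count of an (m x k) by (k x n) product: every one of the m*n outputs
   needs k multiplies and k adds. Scaling by alpha/beta is ignored. */
double gemm_flop_count(size_t m, size_t n, size_t k)
{
    return 2.0 * (double)m * (double)n * (double)k;
}

/* Convert an operation count and an elapsed time in seconds to GFLOPS. */
double gflops(double flop_count, double seconds)
{
    return seconds > 0.0 ? flop_count / seconds / 1e9 : 0.0;
}

/* Hypothetical stand-in for the routine being measured: C = A * B,
   all matrices row-major, single precision. Not an MKL function. */
void naive_sgemm(size_t m, size_t n, size_t k,
                 const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (size_t p = 0; p < k; ++p)
                sum += a[i * k + p] * b[p * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Run one untimed warm-up call, then time `reps` calls with a monotonic
   wall clock and return the average seconds per call. */
double time_sgemm(size_t m, size_t n, size_t k,
                  const float *a, const float *b, float *c, int reps)
{
    struct timespec t0, t1;

    naive_sgemm(m, n, k, a, b, c);          /* warm-up, excluded from timing */

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int r = 0; r < reps; ++r)
        naive_sgemm(m, n, k, a, b, c);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double elapsed = (double)(t1.tv_sec - t0.tv_sec)
                   + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
    return elapsed / (double)reps;
}
```

A caller would allocate and initialize `a`, `b`, and `c`, obtain `double s = time_sgemm(m, n, k, a, b, c, reps);`, and report `gflops(gemm_flop_count(m, n, k), s)`. Wall-clock time is used rather than CPU time because a threaded GEMM accumulates CPU time across all threads, which would understate the achieved rate.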
For more complete information about compiler optimizations, see our Optimization Notice.