Best Known Method: Estimating FLOP/s for workloads running on the Intel® Xeon Phi™ coprocessor using Intel® VTune™ Amplifier XE
FLOP/s is a popular metric for estimating performance. This document shares our experience using Intel VTune Amplifier XE to estimate FLOP/s.
[2013 Oct 17: Blog updated to split patch into two patches, one for Intel® VTune™ Amplifier changes and one for MKL/ifort changes.]
This video describes the host and target configurations that Intel System Studio supports for embedded, mobile, and Internet of Things development.
The General Exploration Analysis Type in Intel® VTune™ Amplifier XE is used to detect microarchitectural hardware bottlenecks in an application or system.
General Matrix Multiply
Intel® VTune™ Amplifier XE can use the Performance Monitoring Units (PMUs) on Intel CPUs to count hardware events and use those counts to locate performance issues.
Intel® Threading Building Blocks (Intel TBB) applications may have an incorrectly high amount of Overhead or Spin Time associated with them due to function inlining without corres