Best Known Method: Estimating FLOP/s for workloads running on the Intel® Xeon Phi™ coprocessor using Intel® VTune™ Amplifier XE

FLOP/s is a popular metric for estimating performance. This document shares our experience using Intel VTune Amplifier XE to estimate FLOP/s.

Authored by Sumedh N. (Intel) Last updated on 06/07/2017 - 10:29
Blog post

Profiling Julia code with Intel® VTune™ Amplifier

[2013 Oct 17: Blog updated to split patch into two patches, one for Intel® VTune™ Amplifier changes and one for MKL/ifort changes.]

Authored by Arch D. Robison (Intel) Last updated on 06/14/2017 - 15:48

Host and Target Platforms Overview

This video describes the host and target configurations that Intel System Studio supports for embedded, mobile, and Internet of Things platforms.

Authored by Paul F. (Intel) Last updated on 06/14/2017 - 08:45

Understanding How General Exploration Works in Intel® VTune™ Amplifier XE

The General Exploration Analysis Type in Intel® VTune™ Amplifier XE is used to detect microarchitectural hardware bottlenecks in an application or system.

Authored by Jackson Marusarz (Intel) Last updated on 06/07/2017 - 10:32

SGEMM for Intel® Processor Graphics


General Matrix Multiply

Authored by LINGYI K. (Intel) Last updated on 06/07/2017 - 12:20

Cache Miss Rates in Intel® VTune™ Amplifier XE

Intel® VTune™ Amplifier XE can use the Performance Monitoring Units (PMUs) on Intel CPUs to count hardware events, and these counts help locate performance issues.

Authored by Jackson Marusarz (Intel) Last updated on 05/24/2018 - 12:50

Overhead and Spin Time Issue in Intel® Threading Building Blocks Applications Due to Inlining

Intel® Threading Building Blocks (Intel TBB) applications may have an incorrectly high amount of Overhead or Spin Time associated with them due to function inlining without corres

Authored by Jackson Marusarz (Intel) Last updated on 05/25/2018 - 15:30