fpga-interaction
Hyper-V
.NET* code analysis
[pipeline slots]
[uops]
*.csv file
Examples of CSV Format and Imported Data
Create a CSV File with External Data
examples
format requirements
*.perf file
*.pwr file
% of 128-bit
% of 256-bit
% of packed fp instructions
% of packed SIMD instructions
% of scalar fp instructions
% of scalar simd instructions
4k aliasing
action-options
actions
actions modified
add
Add Administrative Privileges
administrative privileges
external collection data to VTune Amplifier
external data to VTune Amplifier
administrative privileges
Add Administrative Privileges
add
advanced hardware-level analysis
Memory Usage View
Microarchitecture Exploration View
interpret results
Memory Usage View
Microarchitecture Exploration View
algorithm analysis
Analyze Performance
Hotspots Analysis Group
allocate memory API
allocations
allow
Allow Multiple Runs or Multiplex Events
multiple runs
allow-multiple-runs amplxe-cl option
amplxe-cl
Launch Intel® VTune™ Amplifier
Intel® VTune™ Amplifier Command Line Interface
about
call-stack-mode
csv-delimiter
options
reports
save report
specify search directories
syntax
typical usage
usage scenarios
amplxe-cl options
allow-multiple-runs
analyze-system
app-working-dir
call-stack-mode
collect
collect-with
column
command
cpu-mask
csv-delimiter
cumulative-threshold-percent
custom-collector
data-limit
discard-raw-data
duration
filter
finalize
format
group-by
help
import
inline-mode
limit
mrte-mode
no-follow-child
trace-mpi
quiet
report
report-knob
report-output
report-width
result-dir
resume-after
return-app-exitcode
ring-buffer
search-dir
show-as
sort-asc
sort-desc
source-object
source-search-dir
stack-size
start-paused
strategy
no-summary
target-duration-type
target-install-dir
target-pid
target-process
target-system
target-tmp-dir
no-unplugged-mode
user-data-dir
verbose
version
allow-multiple-runs
analyze-system
app-working-dir
call-stack-mode
collect
collect-with
column
command
cpu-mask
csv-delimiter
cumulative-threshold-percent
custom-collector
data-limit
discard-raw-data
duration
filter
finalize
format
group-by
help
import
inline-mode
limit
mrte-mode
no-follow-child
no-trace-mpi
quiet
report
report-knob
report-output
report-width
result-dir
resume-after
return-app-exitcode
ring-buffer
search-dir
show-as
sort-asc
sort-desc
source-object
source-search-dir
stack-size
start-paused
strategy
summary
target-duration-type
target-install-dir
target-pid
target-process
target-system
target-tmp-dir
trace-mpi
unplugged-mode
user-data-dir
verbose
version
amplxe-cl options
finalization-mode
finalization-mode
amplxe-cl reports
Callstacks Report
gprof-cc Report
GPU Analysis Report
Hotspots Report
Hardware Events Report
Summary Report
Top-down Report
callstacks
gprof-cc
gpu-computing-tasks
hotspots
hw-events
summary
top-down
amplxe-feedback
amplxe-gui
Launch Intel® VTune™ Amplifier
Standalone VTune Amplifier Graphical Interface
analysis
Analyze Performance
hotspots
hpc characterization
Microarchitecture exploration
system overview
threading
troubleshooting
types
analysis system
Analysis Target window
analysis types
analyze
.NET* Targets
Microarchitecture Analysis Group
Hotspots Analysis for CPU Usage Issues
Analyze Interrupts
Java* Code Analysis
Analyze Loops
MPI Code Analysis
GPU Application Analysis on Intel® HD Graphics and Intel® Iris® Graphics
Analyze Statically Linked Binaries on Linux* Targets
Collect System-Wide Data from Command Line
Task Analysis
Threading Analysis
.NET code
hardware issues
hotspots
interrupts
Java* code
loops
MPI applications
Processor Graphics
statically linked binaries
system from command line
tasks
threading
analyze
Python* Code Analysis
Python* code
analyze-kvm-guest amplxe-cl option
analyze-system amplxe-cl option
Android application
Prepare an Android* Application for Analysis
prepare for analysis
Android*
Analyze Unplugged Devices
remote analysis
search directories
APIs
System APIs Supported by Intel® VTune™ Amplifier
allocate memory
domain
frame
load module
pause/resume
string handle
synchronization
task
thread naming
app-working-dir amplxe-cl option
application
WHAT: Analysis Target
output
arbitrary targets
configure
architecture diagram
Architecture Diagram tab
assembly
Source Code Analysis
context menu
toolbar
assembly grouping menu
Toolbar: Source/Assembly
Group and Filter Data
assists
atomics
available core time
Available Core Time
Available Core Time
average
Window: Summary - Hotspots
CPU usage
average bandwidth
average cpu frequency
average cpu usage
average latency (cycles)
average logical core utilization
average physical core utilization
average time
back-end bound
Bad Speculation (Back-End Bound Pipeline Slots)
Back-End Bound
baclears
bad speculation
band height
bandwidth utilization
Bandwidth window
bar
barrier-to-barrier segment
Basic Hotspots analysis
Hotspots Analysis for CPU Usage Issues
interpret results
Binary/Symbol Search button
binary/symbol search dialog box
Bottom-up
Manage Data Views
comparison
synchronize with Top-down Tree window
branch mispredict
branch resteers
build
Debug Information for Windows* Application Binaries
sampling drivers
bus lock
c-state
c-state metric
C# application analysis
cache bound
cache misses analysis
cache source files
call stack
Manage Data Views
context menu
display mode
mode
call stacks
call tree
call-stack-mode amplxe-cl option
Caller/Callee window
callstacks
Callstacks Report
amplxe-cl reports
cancel analysis
Cancel button
cannot find file
Change Focus Function option
change stack layout
choose
Set Up Analysis Target
Choose Data Format
analysis target
data format
Choose Analysis button
Choose Target button
Clear Log button
clears resteers
clockticks per instructions retired
clockticks vs. pipeline slots
code profiling scenarios
Collapse All option
collect
Hardware Event-based Sampling Collection
Hardware Event-based Sampling Collection with Stacks
Highly Accurate CPU Time Data Collection
Allow Multiple Runs or Multiplex Events
User-Mode Sampling and Tracing Collection
hardware event-based sampling data
hardware event-based stack sampling data
highly accurate CPU time
precise data
user-mode sampling and tracing data
collect action
Pause Collection from Command Line
pause
collect amplxe-cl option
collect
advanced-hotspots
cpugpu-concurrency
hotspots
memory-access
threading
uarch-exploration
collect data
Android* Target Analysis from the Command Line
on Android
collect-with amplxe-cl option
collection
Analyze Performance
hardware event-based sampling
hardware event-based stack sampling
log
user-mode sampling and tracing
collection
Limit Data Collection
data limit
Collection Log window
Collection time
collector
Hardware Event-based Sampling Collection with Stacks
Hardware Event-based Sampling Collection
Window: Collection Log
User-Mode Sampling and Tracing Collection
event-based sampling
Hardware Event-based Sampling Collection with Stacks
Hardware Event-based Sampling Collection
messages
user-mode sampling and tracing
column amplxe-cl option
command
Toolbar: Configure Analysis
toolbar
command amplxe-cl option
command line
Intel® VTune™ Amplifier Command Line Interface
OpenMP* Analysis from the Command Line
amplxe-cl Actions
Option Descriptions and General Rules
Difference Report
Command Line Usage Scenarios
interface
OpenMP* analysis
options
rules
scenario for VTune Amplifier
usage scenarios
Command Line button
communication (mpi)
compare
Window: Compare Results
bottom-up data
results
Window: Compare Results
Compare Results
source
summary data
top-down trees
Compare button
Compare Results window
compiler switches
Compiler Switches for Performance Analysis on Windows* Targets
Compiler Switches for Performance Analysis on Linux* Targets
for performance analysis
Compiler Switches for Performance Analysis on Windows* Targets
Compiler Switches for Performance Analysis on Linux* Targets
compiling and linking with ITT APIs
compute basic GPU metrics
compute extended GPU metrics
compute-intensive application analysis
computing task purpose
computing threads started
computing threads started, threads/sec
Concurrency analysis
Concurrency viewpoint
configure
Attaching ITT APIs to a Launched Application
Manage Grid Views
arbitrary targets
Bottom-up/Top-down Tree windows
custom analysis
events
for attaching to a process
for ITT API
Microsoft* symbol server
options from the command line
sample after value
temporary directory
configure analysis
configuring your environment for using ITT APIs
container targets
contested accesses
contested accesses (intra-tile)
context menu
Context Menus: Source/Assembly Window
for call stack tab
for grid
for project navigator
for source/assembly window
for Timeline pane
context summary
contribution
conventions
copy
Custom Analysis
existing analysis type configuration
Copy Command Line to Clipboard dialog box
copy from current
Copy to Clipboard option
core bound
core frequency
Core Wake-ups window
Correlate Metrics window
counters API
counts
CPI
cpi rate
CPI Rate
CPI Rate (Intel Atom® processor)
CPU C/P States window
CPU metrics
cpu time
CPU Time
CPU Time
CPU Time
CPU time
Warnings about Accurate CPU Time Collection
highly accurate detection
cpu utilization
CPU Utilization
CPU Utilization (OpenMP)
CPU utilization
Window: Summary - HPC Performance Characterization
HPC Performance Characterization View
CPU Utilization
cpu-mask amplxe-cl option
CPU/FPGA Interaction analysis
CPU/FPGA Interaction Analysis (Preview)
interpret results
CPU/FPGA Interaction viewpoint
CPU/GPU Concurrency analysis
create
Custom Analysis
new analysis
Create a Project dialog box
creation
critical error
csv file
import
import
csv format
Context Menus: Source/Assembly Window
export
csv-delimiter amplxe-cl option
cumulative data
Create a CSV File with External Data
import as csv
cumulative-threshold-percent amplxe-cl option
custom analysis
create
custom collector
custom-collector amplxe-cl option
customize grouping
cycles of 0 ports utilized
cycles of 1 port utilized
cycles of 2 ports utilized
cycles of 3+ ports utilized
d0ix state metric
d0ix-state
data
Control Data Collection
Choose Data Format
collection
format
data collection limit
Limit Data Collection from Command Line
from the command line
data sharing
data-limit amplxe-cl option
debug information
Debug Information for Windows* Application Binaries
Enable Linux* Kernel Analysis
Debug Information for Windows* System Libraries
for application binaries
for Linux kernels
for system libraries
debug information
Debug window
detached device
dialog box
Dialog Box: Binary/Symbol Search
Generate Command Line Configuration from GUI
Set Up Project
Dialog Box: Source Search
binary/symbol search
Copy Command Line to Clipboard
Create a Project
source search
difference
directory
discard-raw-data amplxe-cl option
disclaimer
Disk Input and Output analysis
System Disk IO Data View
interpret data
Disk Input and Output viewpoint
divider
documentation
domain API
Domain drop-down menu
double
download
Debug Information for Windows* Application Binaries
symbols
dram bandwidth bound
dram bound
DRAM self refresh
DRAM self refresh metric
driverless sampling collection
drivers
Sampling Drivers
Driverless Event-Based Sampling Collection
build
for Android
for FreeBSD
Linux*
DSB coverage
dsb switches
dtlb overhead
dtlb store overhead
duration
Manage Analysis Duration from Command Line
manage from the command line
duration amplxe-cl option
Eclipse* integration
effective cpu utilization
effective physical core utilization
effective time
elapsed time
Elapsed Time
Elapsed Time
Elapsed Time
embedded device
Configure Yocto Project* and Intel® VTune™ Amplifier with the Linux* Target Package
Configure Yocto Project*/Wind River* Linux* and Intel® VTune™ Amplifier with the Intel System Studio Integration Layer
Embedded Linux* Targets
Configure Yocto Project* and Intel® VTune™ Amplifier with the VTune Amplifier Integration Layer
energy analysis
Energy Analysis
energy analysis
Energy Analysis Metrics Reference
metrics
energy consumed
energy consumed metric
Platform Profiler Setup (Preview)
QNX* Targets
Viewing Instrumentation and Tracing Technology (ITT) API Task Data in Intel® VTune™ Amplifier
Window: Summary - Platform Power Analysis
Targets in a Cloud Environment
Profile Targets on a VMware* Guest System
error messages
Error Message: Root Privileges Required for Processor Graphics Events
Error Message: Symbol File Is Not Found
application sets its own handler for signal
cannot load data file
cannot open data
in Intel VTune Amplifier
Warnings about Accurate CPU Time Collection
Troubleshooting
no pre-built driver exists for this system
PMU resources may be made unavailable either by the BIOS or Hypervisor
problem accessing the sampling driver
required key not available
stack size is too small
error messages
estimated bb execution count
estimated ideal time
eu 2 fpu pipelines active
eu array active
eu array idle
eu array stalled
eu array stalled/idle
eu ipc rate
eu send pipeline active
eu threads occupancy
event
Hardware Event List
Allow Multiple Runs or Multiplex Events
Hardware Event Skid
add
multiplexing
skid
event API
Event Count window
event counts
event library
event reference
Intel Processor Events Reference
Getting Help
event-based sampling
event-based sampling
Memory Usage View
Microarchitecture Exploration View
interpret results
Memory Usage View
Microarchitecture Exploration View
event-based stack sampling
examples
Examples of CSV Format and Imported Data
imported data
execution stalls
Expand All option
Expand Selected Rows option
Export to CSV option
external collection data
Import External Data
add to VTune Amplifier
external collector
false sharing
far branch
fb full
features
feedback
filenames
filter
Toolbar: Filter
Group and Filter Data
Bottom-up/Top-down Tree windows
command line reports
toolbar
filter amplxe-cl option
Filter In by Selection option
Filter Out by Selection option
finalization
Finalization
Window: Collection Log
finalization-mode amplxe-cl option
finalize
Finalization
results
finalize amplxe-cl option
find
find data
Find option
flags merge stalls
FLOPS
HPC Performance Characterization Analysis
Window: Summary - HPC Performance Characterization
HPC Performance Characterization View
follow-child amplxe-cl option
format
format amplxe-cl option
formatting
Save and Format Command Line Reports
amplxe-cl reports
fp arithmetic
fp arithmetic/memory read instructions ratio
fp arithmetic/memory write instructions ratio
fp assists
fp scalar
fp vector
fp x87
fpu utilization
frame
Frame Data Analysis
interpret
frame API
frame rate
Change Threshold Values
change threshold
frames
User-Mode Sampling and Tracing Collection
guessed
skipped
unknown
FreeBSD*
Set Up FreeBSD* System
remote analysis
FreeBSD* target
FreeBSD* target
from command line
advanced-hotspots Command Line Analysis
concurrency Command Line Analysis
cpugpu-concurrency Command Line Analysis
uarch-exploration Command Line Analysis
gpu-hotspots Command Line Analysis
gpu-profiling Command Line Analysis
hotspots Command Line Analysis
hpc-performance Command Line Analysis
io Command Line Analysis
locksandwaits Command Line Analysis
memory-access Command Line Analysis
sgx-hotspots Command Line Analysis
system-overview Command Line Analysis
threading Command Line Analysis
tsx-exploration Command Line Analysis
tsx-hotspots Command Line Analysis
Advanced Hotspots analysis
Concurrency analysis
CPU/GPU Concurrency analysis
General Exploration analysis
GPU Hotspots analysis
GPU In-kernel Profiling analysis
Hotspots analysis
HPC Performance Characterization analysis
Input and Output analysis
Locks and Waits analysis
memory-access analysis
SGX Hotspots analysis
System Overview analysis
Threading analysis
TSX Exploration analysis
TSX Hotspots analysis
front-end bandwidth
front-end bandwidth dsb
front-end bandwidth lsd
front-end bandwidth mite
front-end bound
front-end latency
Ftrace* events
General Exploration viewpoint
general options
General pane
general retirement
get started
getting started
global
global-options
global/local memory accesses
Go* applications support
gprof-cc
gprof-cc Report
report
GPU analysis
GPU Application Analysis on Intel® HD Graphics and Intel® Iris® Graphics
Configure GPU Analysis from Command Line
interpret data
run from CLI
gpu eu array usage
GPU Hotspots analysis
GPU Hotspots viewpoint
GPU in-kernel profiling
GPU L3 Bound
gpu l3 miss ratio
gpu l3 misses
gpu l3 misses, misses/sec
gpu memory read bandwidth, gb/sec
gpu memory texture read bandwidth, gb/sec
gpu memory write bandwidth, gb/sec
GPU metrics
gpu texel quads count, count/sec
GPU usage
Configure GPU Analysis from Command Line
configure analysis from CLI
gpu utilization
GPU Utilization
GPU Utilization
gpu-computing-tasks
GPU Analysis Report
report
gpu-rendering
granularity
Graphics C/P States window
Graphics window
Window: Graphics - GPU Compute/Media Hotspots
Window: Graphics - Hotspots
grid
Manage Grid Views
controls
group
Filter and Group Command Line Reports
command line reports
group by
Group and Filter Data
Manage Grid Views
group-by amplxe-cl option
guessed frames
guessed stack frames
guest system profiling
hardware event counts
hardware event reference
hardware event sample count
hardware event skid
hardware event-based analysis
hardware event-based sampling analysis
Hardware Event-based Sampling Collection
interpret
Memory Usage View
Microarchitecture Exploration View
hardware event-based sampling analysis from command line
hardware event-based stack sampling analysis
hardware events
Hardware Event List
add to analysis
Hardware Events viewpoint
Hardware Issues viewpoint
heap profiler
help
Tutorials and Samples
Product Website and Support
Getting Help
help amplxe-cl option
Hide Column option
highly accurate CPU time collection
histogram
Frame Data Analysis
frame rate
histograms
host system
hot keys
Introduction
Hot Keys
Control Data Collection
hotspots
Hotspots Report
report
Hotspots analysis
Hotspots by CPU Usage viewpoint
Hotspots by CPU Utilization viewpoint
Window: Summary - Hotspots by CPU Utilization
Summary window
Hotspots by Thread Concurrency
Window: Summary - Hotspots by Thread Concurrency
Summary window
Hotspots by Thread Concurrency viewpoint
Hotspots viewpoint
Switch Viewpoints
Hotspots View
hottest path
HPC Characterization analysis
HPC Performance Characterization Analysis
about
HPC performance characterization
HPC Performance Characterization analysis
HPC Performance Characterization View
interpret results
HPC Performance Characterization viewpoint
HPC performance metrics
hw-events
Hardware Events Report
report
i/o wait time
icache line fetch
icache misses
ideal time
idle wake-ups
iJIT_GetNewMethodID
iJIT_IsProfilingActive
iJIT_NotifyEvent
imbalance
imbalance or serial spinning
Imbalance or Serial Spinning
Imbalance or Serial Spinning
import
View Command Line Results in the GUI
Import Results from Command Line
Import Results and Traces into VTune Amplifier GUI
amplxe-cl result
View Command Line Results in the GUI
Import Results from Command Line
from VTune Amplifier GUI
Intel SoC Watch result
import amplxe-cl option
Import button
import data files
import external data
Create a CSV File with External Data
examples
Import from CSV button
inactive sync wait count
inactive sync wait time
inactive time
inactive wait count
inactive wait time
inactive wait time with poor cpu utilization
incoming bandwidth bound
incoming packet rate bound
incorrect
Problem: Unknown Frames
stack
inline functions
inline mode
inline-mode amplxe-cl option
Input and Output analysis
insmod-sep
install
Sampling Drivers
sampling drivers
install VTune Amplifier
Set Up Linux* System for Remote Analysis
Set Up Remote Windows* Target
on the target system
Set Up Linux* System for Remote Analysis
Set Up Remote Windows* Target
installation directories
installation guide
instance count
instance count
instantaneous data
Create a CSV File with External Data
import as csv
instruction starvation
instructions retired event
Instrumenting with ITT APIs
integrate
Microsoft Visual Studio* Integration
to Microsoft Visual Studio*
Intel Atom processor
Microarchitecture Exploration View
interpret analysis results
Intel Energy Profiler
Intel Graphics driver is obsolete
Intel Media SDK program analysis
Intel MIC Architecture analysis
Intel microarchitecture code name Ivy Bridge
Microarchitecture Exploration View
interpret analysis results
Intel SoC Watch
Intel TBB analysis
Intel Xeon Phi processor analysis
Intel® Xeon Phi™ Processor Targets
about
Intel® VTune™ Amplifier
Intel® VTune™ Amplifier Command Line Interface
command line interface
Intel® Graphics Driver
Intel® VTune™ Amplifier
Introduction
Menu: Intel VTune Amplifier
Standalone VTune Amplifier Graphical Interface
about
menu
standalone interface
Intel® VTune™ Amplifier
Toolbar: VTune Amplifier
toolbar
interpret
CPU/FPGA Interaction View
System Disk IO Data View
Interpreting Energy Analysis Data with Intel® VTune™ Amplifier
Memory Usage View
Microarchitecture Exploration View
Frame Data Analysis
GPU OpenCL™ Application Analysis
Hotspots View
HPC Performance Characterization View
Analyze Interrupts
Interpreting OpenMP* Analysis Data
Platform Profiler Results (Preview)
Task Analysis
Threading Efficiency View
CPU/FPGA Interaction viewpoint
disk input and output data
energy analysis data
event-based sampling analysis results
Memory Usage View
Microarchitecture Exploration View
frame data
GPU analysis data
Hotspots viewpoint
HPC Performance Characterization viewpoint
interrupts
OpenMP analysis data
Platform Profiler Results
task analysis data
Threading Efficiency viewpoint
interrupt time
interrupts
Analyze Interrupts
analyze
interval
introduction
Introduction
to Intel® VTune™ Amplifier
ipc
itlb overhead
ITT API
Configuring Your Build System
Attaching ITT APIs to a Launched Application
Instrumenting Your Application
Basic Usage and Configuration
build system configuration
compiling and linking
configuring your environment
instrumentation
prerequisites
ITT API
ITT API attach
ITT API overhead
Java* code analysis
Set Up FreeBSD* System
enable on Android
from command line
Java* code analysis
JavaScript application analysis
JIT Profiling API
JIT Profiling API
Using JIT Profiling API
about
usage
JIT Profiling API
kernel modules resolution
kernel stacks
key features
KVM guest OS profiling
kvm-guest-kallsyms amplxe-cl option
kvm-guest-modules amplxe-cl option
l1 bound
l1 hit rate
l1d replacement percentage
l1d replacements
l1i stall cycles
l2 bound
l2 hit bound
l2 hit rate
l2 hw prefetcher allocations
l2 input requests
l2 miss bound
l2 miss count
l2 replacement percentage
l2 replacements
l3 bound
l3 latency
l3 sampler bandwidth, gb/sec
l3 shader bandwidth, gb/sec
launch
Microsoft Visual Studio* Integration
Standalone VTune Amplifier Graphical Interface
environment
VTune Amplifier
Microsoft Visual Studio* Integration
Standalone VTune Amplifier Graphical Interface
Launch Application target type
legal information
length changing prefixes
limit
Limit Data Collection from Command Line
Limit Data Collection
data collection from the command line
data collection size
limit amplxe-cl option
Linux targets
llc hit
llc load misses serviced by remote dram
llc miss
llc miss count
LLC miss rate due GPU lookups
llc miss ratio due gpu lookups
llc replacement percentage
llc replacements
load
Sampling Drivers
sampling drivers
load imbalance
load module API
loads blocked by store forwarding
local
local dram
local dram access count
local persistent memory
local target system
locations
lock contention
lock contention (openmp)
lock latency
Locks and Waits analysis
Locks and Waits viewpoint
Switch Viewpoints
Threading Efficiency View
logical core utilization
loop analysis
loop entry count
Loop Mode menu
loop type
loop-mode amplxe-cl option
loopback interface
LSD coverage
machine clears
macOS
macOS* host
manage
Manage Analysis Duration from Command Line
Manage Timeline View
analysis duration from the command line
timeline view
managed code profiling mode
Mark Timeline button
max dram single-package bandwidth
max dram system bandwidth
mcdram bandwidth bound
mcdram cache bandwidth bound
mcdram flat bandwidth bound
memory access analysis
Memory Access analysis
memory bandwidth
Memory Bandwidth
Memory Bandwidth
memory bandwidth analysis
memory bound
memory consumption
memory efficiency
memory latency
Memory Usage
Memory Usage View
viewpoint
Memory Usage viewpoint
Switch Viewpoints
Summary window
memory-consumption
menu
Menu: Intel VTune Amplifier
customize grouping
Intel VTune Amplifier
metric
metrics
MUX Reliability
C-State
Contested Accesses
Local DRAM
MCDRAM Bandwidth Bound
Timer Resolution
MCDRAM Flat Bandwidth Bound
Pipeline Slots
OpenMP Region Time
Effective CPU Utilization
L2 Replacements
SP GFLOPS
Effective Time
Average CPU Frequency
SIMD Compute-to-L2 Access Ratio
Inactive Wait Time
Inactive Time
Ratio to Max Bandwidth, %
Length Changing Prefixes
CPI Rate
L2 Miss Count
Wait Time
Port 3
L1D Replacements
LLC Miss Count
Instance Count
FP x87
Local Persistent Memory
GPU L3 Misses, Misses/sec
Divider
Reduction
Vector Instruction Set
Bad Speculation (Back-End Bound Pipeline Slots)
Tasking
Split Stores
CPU Time
L1 Bound
FP Assists
Front-End Bandwidth MITE
CPU Time
Total Time in C0 State
SIMD Assists
Atomics
Typed Memory Write Bandwidth, GB/sec
Creation
Communication (MPI)
L2 Hit Bound
ITLB Overhead
MS Entry
Cycles of 0 Ports Utilized
Port 5
% of 256-bit Packed Floating Point Instructions
Elapsed Time
GPU Memory Write Bandwidth, GB/sec
Memory Efficiency
OpenMP* Analysis. Collection Time
Average Time
GPU L3 Bound
Sampler Busy
Estimated BB Execution Count
EU Array Active
Port Utilization
CPU Frequency
Remote Cache Access Count
SIMD Compute-to-L1 Access Ratio
GPU EU Array Usage
Pre-Decode Wrong
Far Branch
Bad Speculation (Cancelled Pipeline Slots)
Port 2
Average Latency (cycles)
Inactive Wait Count
SMC Machine Clear
Clears Resteers
FP Arithmetic/Memory Write Instructions Ratio
UTLB Overhead
Store Latency
LLC Load Misses Serviced By Remote DRAM
Total Time in Non-C0 States
L3 Latency
Remote DRAM
Store Bound
GPU Utilization
Ideal Time
DRAM Bound
Remote / Local DRAM Ratio
Core Bound
Size
Typed Writes Coalescence
Inactive Sync Wait Count
EU 2 FPU Pipelines Active
FPU Utilization
DSB Switches
Front-End Bandwidth DSB
Bus Lock
Computing Threads Started, Threads/sec
Front-End Latency
Port 4
Page Walk
Incoming Bandwidth Bound
Cycles of 2 Ports Utilized
Energy Consumed (mJ)
Elapsed Time
MO Machine Clear Overhead
Mispredicts Resteers
DRAM Self Refresh
S0ix States
SP FLOPs per Cycle
Lock Contention
Local
L2 Hit Rate
Split Loads
Loop Type
False Sharing
Effective Physical Core Utilization
Imbalance or Serial Spinning
Cycles of 1 Port Utilized
BACLEARS
Ratio to Max Bandwidth, %
Available Core Time
Typed Memory Read Bandwidth, GB/sec
Slow LEA Stalls
Wake-ups
Untyped Memory Read Bandwidth, GB/sec
Cache Bound
Instruction Starvation
FP Scalar
GPU L3 Misses
Total Time
Paused Time
% of Scalar FP Instructions
Task Time
GPU Memory Read Bandwidth, GB/sec
D0ix States
Port 0
Imbalance or Serial Spinning
Loop Entry Count
EU Array Idle
Untyped Memory Write Bandwidth, GB/sec
DRAM Bandwidth Bound
EU Threads Occupancy
Shared Local Memory Write Bandwidth, GB/sec
Memory Latency
L1D Replacement Percentage
Other
Wait Rate
L2 Replacement Percentage
Contested Accesses (Intra-Tile)
L2 Input Requests
Total Time in S0 State
Render/GPGPU Command Streamer Loaded
Machine Clears
Other
Elapsed Time
LLC Hit
Remote Cache
LLC Replacement Percentage
ICache Line Fetch
FP Vector
Port 7
VPU Utilization
MS Switches
GPU L3 Miss Ratio
Shared Local Memory Read Bandwidth, GB/sec
OpenMP* Potential Gain
SQ Full
[uOps]
Logical Core Utilization
CPI Rate (Intel Atom® processor)
FB Full
L3 Bound
NUMA: % of Remote Accesses
GPU Memory Texture Read Bandwidth, GB/sec
SIMD Width
MPI Imbalance
L3 Shader Bandwidth, GB/sec
Typed Reads Coalescence
Parallel Region Time
Port 1
Memory Bandwidth
Lock Contention
LLC Miss Ratio due GPU Lookups
SIMD Instructions per Cycle
Incoming Packet Rate Bound
Branch Mispredict
Vector Capacity Usage
Memory Bandwidth
Flags Merge Stalls
Other
L3 Sampler Bandwidth, GB/sec
Total Wake-up Count
Max DRAM Single-Package Bandwidth
Estimated Ideal Time
Microcode Sequencer
% of Packed FP Instructions
CPU Utilization
% of Packed SIMD Instructions
Global
Front-End Bandwidth LSD
Sampler Is Bottleneck
Memory Bound
Total, GB/sec
GPU Texel Quads Count, Count/sec
Scheduling
General Retirement
Wake-ups/sec per Core
Inactive Sync Wait Time
P-State
Outgoing Packet Rate Bound
MPI Busy Wait Time
MS Assists
L1 Hit Rate
I/O Wait Time
Untyped Reads Coalescence
Max DRAM System Bandwidth
EU Array Stalled/Idle
L1I Stall Cycles
Port 6
Occupancy
Computing Threads Started
Loads Blocked by Store Forwarding
Retiring
% of Scalar SIMD Instructions
ICache Misses
Cycles of 3+ Ports Utilized
Front-End Bandwidth
Reference
Average Bandwidth
SP FLOPs per Cycle
Average CPU Usage
Remote DRAM Access Count
Average Logical Core Utilization
4K Aliasing
Front-End Bound
DTLB Overhead
EU IPC Rate
Total Iteration Count
Untyped Writes Coalescence
Ratio to Max Bandwidth, %
Average Physical Core Utilization
MPI Rank on the Critical Path
Spin Time
L2 Miss Bound
Inactive Wait Time with poor CPU Utilization
Idle Wake-ups
Imbalance
Branch Resteers
Wait Count
Thread Oversubscription
Assists
Overhead Time
LLC Miss
IPC
L2 Bound
Retire Stalls
EU Send pipeline active
EU Array Stalled
FP Arithmetic
Outgoing Bandwidth Bound
% of 128-bit Packed Floating Point Instructions
Back-End Bound
FP Arithmetic/Memory Read Instructions Ratio
L2 HW Prefetcher Allocations
Local DRAM Access Count
MCDRAM Cache Bandwidth Bound
CPU Time
LLC Replacements
Data Sharing
Lock Latency
DTLB Store Overhead
GPU Utilization
Execution Stalls
Serial Time (outside parallel regions)
Available Core Time
CPU Utilization (OpenMP)
CPI
event counts
event sample count
overhead time
spin time
metrics
Energy Analysis Metrics Reference
energy analysis
microarchitecture analysis
Analyze Performance
Microarchitecture Analysis Group
Microarchitecture exploration analysis
Microarchitecture Exploration viewpoints
Microarchitecture Usage
microcode sequencer
minimize collection overhead
mispredicts resteers
mo machine clear overhead
modifiers
MPI application analysis
MPI Busy Wait Time
mpi imbalance
mpi rank on the critical path
mrte-mode amplxe-cl option
ms assists
ms entry
ms switches
mux reliability
NC Device States window
new analysis type
new project
no-allow-multiple-runs amplxe-cl option
no-analyze-system amplxe-cl option
no-follow-child amplxe-cl option
no-trace-mpi amplxe-cl option
normal band height
notational conventions
NT Kernel Logger
numa: % of remote accesses
occupancy
open
View Comparison Data
comparison result
OpenCL™ kernels
GPU OpenCL™ Application Analysis
OpenCL™ Kernel Analysis Metrics Reference
analyze
metrics
openmp region time
OpenMP*
OpenMP* Analysis
analysis
analysis from command line
interpret analysis data
OpenMP* region duration histogram
options
Configure Analysis Options from Command Line
Pane: Options - Source/Assembly
configure from the command line
source/assembly
OS X
other
Other
Other
Other
other serial cpu time
outgoing bandwidth bound
outgoing packet rate bound
output
overhead
Minimize Collection Overhead
Spin and Overhead Time
minimize for collection
time
overhead time
overview GPU event metrics
P-state
P-state metric
page walk
panes
Pane: Options - General
Pane: Options - Result Location
Pane: Options - Source/Assembly
Call Stack
Options - General
Options - Result Location
Options - Source/Assembly
Project Navigator
Timeline
parallel region time
password-less mode
pause
Pause Data Collection
analysis
collect action
Pause button
pause/resume API
paused time
Window: Summary - Hotspots
Paused Time
percent
Perf*-based collection
performance
Analyze Performance
analysis tree
performance analysis
performance metrics
MUX Reliability
Contested Accesses
Local DRAM
MCDRAM Bandwidth Bound
MCDRAM Flat Bandwidth Bound
Pipeline Slots
OpenMP Region Time
Effective CPU Utilization
L2 Replacements
SP GFLOPS
Effective Time
Average CPU Frequency
SIMD Compute-to-L2 Access Ratio
Inactive Wait Time
Inactive Time
Ratio to Max Bandwidth, %
Length Changing Prefixes
CPI Rate
L2 Miss Count
Wait Time
Port 3
L1D Replacements
LLC Miss Count
Instance Count
FP x87
Local Persistent Memory
GPU L3 Misses, Misses/sec
Divider
Reduction
Vector Instruction Set
Bad Speculation (Back-End Bound Pipeline Slots)
Tasking
Split Stores
CPU Time
L1 Bound
FP Assists
Front-End Bandwidth MITE
CPU Time
Total Time in C0 State
SIMD Assists
Atomics
Typed Memory Write Bandwidth, GB/sec
Creation
Communication (MPI)
L2 Hit Bound
ITLB Overhead
MS Entry
Cycles of 0 Ports Utilized
Port 5
% of 256-bit Packed Floating Point Instructions
Elapsed Time
GPU Memory Write Bandwidth, GB/sec
Memory Efficiency
OpenMP* Analysis. Collection Time
Average Time
GPU L3 Bound
Sampler Busy
Estimated BB Execution Count
EU Array Active
Port Utilization
CPU Frequency
Remote Cache Access Count
SIMD Compute-to-L1 Access Ratio
GPU EU Array Usage
Pre-Decode Wrong
Far Branch
Bad Speculation (Cancelled Pipeline Slots)
Port 2
Average Latency (cycles)
Inactive Wait Count
SMC Machine Clear
Clears Resteers
FP Arithmetic/Memory Write Instructions Ratio
UTLB Overhead
Store Latency
LLC Load Misses Serviced By Remote DRAM
Total Time in Non-C0 States
L3 Latency
Remote DRAM
Store Bound
GPU Utilization
Ideal Time
DRAM Bound
Remote / Local DRAM Ratio
Core Bound
Size
Typed Writes Coalescence
Inactive Sync Wait Count
EU 2 FPU Pipelines Active
FPU Utilization
DSB Switches
Front-End Bandwidth DSB
Bus Lock
Computing Threads Started, Threads/sec
Front-End Latency
Port 4
Page Walk
Incoming Bandwidth Bound
Cycles of 2 Ports Utilized
Elapsed Time
MO Machine Clear Overhead
Mispredicts Resteers
SP FLOPs per Cycle
Lock Contention
Local
L2 Hit Rate
Split Loads
Loop Type
False Sharing
Effective Physical Core Utilization
Imbalance or Serial Spinning
Cycles of 1 Port Utilized
BACLEARS
Ratio to Max Bandwidth, %
Available Core Time
Typed Memory Read Bandwidth, GB/sec
Slow LEA Stalls
Wake-ups
Untyped Memory Read Bandwidth, GB/sec
Cache Bound
Instruction Starvation
FP Scalar
GPU L3 Misses
Total Time
Paused Time
% of Scalar FP Instructions
Task Time
GPU Memory Read Bandwidth, GB/sec
Port 0
Imbalance or Serial Spinning
Loop Entry Count
EU Array Idle
Untyped Memory Write Bandwidth, GB/sec
DRAM Bandwidth Bound
EU Threads Occupancy
Shared Local Memory Write Bandwidth, GB/sec
Memory Latency
L1D Replacement Percentage
Other
Wait Rate
L2 Replacement Percentage
Contested Accesses (Intra-Tile)
L2 Input Requests
Total Time in S0 State
Render/GPGPU Command Streamer Loaded
Machine Clears
Other
Elapsed Time
LLC Hit
Remote Cache
LLC Replacement Percentage
ICache Line Fetch
FP Vector
Port 7
VPU Utilization
MS Switches
GPU L3 Miss Ratio
Shared Local Memory Read Bandwidth, GB/sec
OpenMP* Potential Gain
SQ Full
[uOps]
Logical Core Utilization
CPI Rate (Intel Atom® processor)
FB Full
L3 Bound
NUMA: % of Remote Accesses
GPU Memory Texture Read Bandwidth, GB/sec
SIMD Width
MPI Imbalance
L3 Shader Bandwidth, GB/sec
Typed Reads Coalescence
Parallel Region Time
Port 1
Memory Bandwidth
Lock Contention
LLC Miss Ratio due GPU Lookups
SIMD Instructions per Cycle
Incoming Packet Rate Bound
Branch Mispredict
Vector Capacity Usage
Memory Bandwidth
Flags Merge Stalls
Other
L3 Sampler Bandwidth, GB/sec
Total Wake-up Count
Max DRAM Single-Package Bandwidth
Estimated Ideal Time
Microcode Sequencer
% of Packed FP Instructions
CPU Utilization
% of Packed SIMD Instructions
Global
Front-End Bandwidth LSD
Sampler Is Bottleneck
Memory Bound
Total, GB/sec
GPU Texel Quads Count, Count/sec
Scheduling
General Retirement
Wake-ups/sec per Core
Inactive Sync Wait Time
Outgoing Packet Rate Bound
MPI Busy Wait Time
MS Assists
L1 Hit Rate
I/O Wait Time
Untyped Reads Coalescence
Max DRAM System Bandwidth
EU Array Stalled/Idle
L1I Stall Cycles
Port 6
Occupancy
Computing Threads Started
Loads Blocked by Store Forwarding
Retiring
% of Scalar SIMD Instructions
ICache Misses
Cycles of 3+ Ports Utilized
Front-End Bandwidth
Reference
Average Bandwidth
SP FLOPs per Cycle
Average CPU Usage
Remote DRAM Access Count
Average Logical Core Utilization
4K Aliasing
Front-End Bound
DTLB Overhead
EU IPC Rate
Total Iteration Count
Untyped Writes Coalescence
Ratio to Max Bandwidth, %
Average Physical Core Utilization
MPI Rank on the Critical Path
Spin Time
L2 Miss Bound
Inactive Wait Time with poor CPU Utilization
Idle Wake-ups
Imbalance
Branch Resteers
Wait Count
Thread Oversubscription
Assists
Overhead Time
LLC Miss
IPC
L2 Bound
Retire Stalls
EU Send pipeline active
EU Array Stalled
FP Arithmetic
Outgoing Bandwidth Bound
% of 128-bit Packed Floating Point Instructions
Back-End Bound
FP Arithmetic/Memory Read Instructions Ratio
L2 HW Prefetcher Allocations
Local DRAM Access Count
MCDRAM Cache Bandwidth Bound
CPU Time
LLC Replacements
Data Sharing
Lock Latency
DTLB Store Overhead
GPU Utilization
Execution Stalls
Serial Time (outside parallel regions)
Available Core Time
CPU Utilization (OpenMP)
CPU
GPU
performance monitoring unit
performance target
PGO analysis
platform analysis
Platform analysis
Platform Profiler Analysis (Preview)
Platform Analysis Group
CPU/GPU Concurrency Analysis
Platform Power Analysis
Platform Power Analysis viewpoint
Platform Profiler analysis
Platform Profiler Analysis (Preview)
Platform Profiler Results (Preview)
interpret results
Platform tab
Platform window
Window: Platform
in Hotspots viewpoint
PMU
PMU analysis
pop-up menu
Context Menus: Call Stack Pane
Context Menus: Project Navigator
for grid
port 0
port 1
port 2
port 3
port 4
port 5
port 6
port 7
port utilization
Potential Gain
power analysis
pre-decode wrong
precise events
prepare
Prepare an Android* Application for Analysis
Set Up Android* System
Set Up FreeBSD* System
Set Up Linux* System for Remote Analysis
Set Up Remote Windows* Target
Android application for analysis
Android system for analysis
FreeBSD system for analysis
Linux* system for analysis
Windows* system for analysis
prepare-debugfs.sh
Prerequisites
preview feature
problem
Problem: System Functions Appear in the User Functions Only Mode
Problem: VTune Amplifier is Slow to Respond When Collecting or Displaying Data
Problem: Same Functions Are Compared As Different Instances
Problem: 'Events= Sample After Value (SAV) * Samples' Is Not True If Multiple Runs Are Disabled
Problem: Stacks in Call Stack and Bottom-Up Panes Are Different
Problem: Stack in the Top-Down Tree Window Is Incorrect
CPU time for Hotspots analysis is too low
inaccurate sum in the grid
no data
skipped stack frames
unexpected Paused time
unknown frames
problem
Problem: Analysis of the .NET* Application Fails
Problem: Unreadable Text in Intel VTune Amplifier on macOS*
Problem: Guessed Stack Frames
Problem: Information Collected via ITT API Is Not Available When Attaching to a Process
unknown timer
processor event reference
product documentation
profile
Managed Code Targets
Analyze Statically Linked Binaries on Linux* Targets
Pause Data Collection
analyze
Managed Code Targets
managed code
MRTE functions
Managed Code Targets
profile
statically linked binaries
with data collection paused
project
Search Directories
search directories
project navigator
Context Menus: Project Navigator
context menu
Project Navigator pane
PS EU active %
PS EU Stall %
Python* code analysis
quick start
quiet amplxe-cl option
ratio to max bandwidth, %
Ratio to Max Bandwidth, %
ratios
CPU Metrics Reference
Reference
Re-resolve button
recursion
redirect
Save and Format Command Line Reports
report to a file
reduction
reference
Getting Help
CPU metrics
energy analysis metrics
GPU metrics
OpenCL kernel analysis metrics
performance metrics
processor events
regression testing
Difference Report
with amplxe-cl
related information
remote / local dram ratio
remote analysis
Set Up Linux* System for Remote Analysis
Set Up Android* System
Set Up Remote Windows* Target
Set Up FreeBSD* System
Configure SSH Access for Remote Collection
from CLI on Android
from command line
run
remote analysis
Android* Targets
Set Up Remote Linux* Target
workflow
Android* Targets
Set Up Remote Linux* Target
remote cache
remote cache access count
remote collectors
Set Up Linux* System for Remote Analysis
Set Up Remote Windows* Target
Configure SSH Access for Remote Collection
remote dram
remote dram access count
remote target system
render/gpgpu command streamer loaded
report amplxe-cl option
report options
filter
group-by
source-object
filter
group-by
source-object
report problem
Report Problems from Command Line
amplxe-feedback
report problems
report-knob amplxe-cl option
report-output amplxe-cl option
report-width amplxe-cl option
reports
Generate Command Line Reports
filtering
formatting
gprof-cc
gpu-computing-tasks
hotspots
hw-events
saving
summary
top-down
resolve
Problem: Unknown Frames
unknown frames
result
Pane: Options - Result Location
Manage Data Views
location
tab
result directory
Specify Result Directory from Command Line
specify
result-dir amplxe-cl option
Resume button
resume-after amplxe-cl option
retire stalls
retiring
return-app-exitcode amplxe-cl option
rich band height
ring-buffer amplxe-cl option
run
Control Data Collection
Run Command Line Analysis
Generate Command Line Configuration from GUI
analysis
analysis from command line
analysis remotely
VTune Amplifier
runsa
runss
s0ix state metric
s0ix-state
sample after value
sample count
Sample Count window
sampler busy
sampler is bottleneck
samples
samples blended
samples killed in PS, pixels
samples written
sampling
Sampling Interval
interval
sampling driver
sampling drivers
Sampling Drivers
Build and Install the Sampling Drivers for Linux* Targets
Install the Sampling Drivers for Windows* Targets
sampling drivers
save report
Save and Format Command Line Reports
amplxe-cl
SC Device States window
scale
scheduling
scientific notation
search
Search Directories
Search Order
Search for Data
order
search directories
Dialog Box: Source Search
Dialog Box: Binary/Symbol Search
amplxe-cl
for Android
for Linux
search-dir amplxe-cl option
select
Metrics Distribution Over Call Stacks
Set Up Analysis Target
stack type
target
Select All option
self time
serial cpu time
serial time (outside any parallel region)
shared local memory read bandwidth, gb/sec
shared local memory write bandwidth, gb/sec
show
Choose Data Format
Manage Timeline View
data as option
time scale as menu option
Show All Columns option
Show Data As option
Show Grouping Area option
show-as amplxe-cl option
show-issues
simd assists
simd compute-to-l1 access ratio
simd compute-to-l2 access ratio
simd instructions per cycle
simd width
SIMD width
size
skid
skipped frames
skipped stack frames
slow lea stalls
smc machine clear
software tuning
Hotspots Analysis Group
Parallelism Analysis Group
sort
Manage Grid Views
timeline data
sort-asc amplxe-cl option
sort-desc amplxe-cl option
source
Source Code Analysis
context menu
toolbar
source files
Pane: Options - Source/Assembly
cache
source search
Dialog Box: Source Search
dialog box
Source Search button
source-object amplxe-cl option
source-search-dir amplxe-cl option
Source/Assembly pane
sp flops per cycle
SP FLOPs per Cycle
sp gflops
SPDK code analysis
spin time
Spin and Overhead Time
Spin Time
split loads
split stores
sq full
SSH access configuration
stack
Pane: Call Stack
pane
type
stack sampling
stack-size amplxe-cl option
stacks
View Stacks
stitch
standalone
Standalone VTune Amplifier Graphical Interface
VTune Amplifier interface
start
Pause Data Collection
collection paused
start analysis
Control Data Collection
with hot key
Start button
Start Paused button
start-paused amplxe-cl option
statically linked binaries
Analyze Statically Linked Binaries on Linux* Targets
profile
stitch stacks
stop analysis
Control Data Collection
with hot key
Stop button
store bound
store latency
strategy amplxe-cl option
string handle API
summary
Summary Report
report
summary amplxe-cl option
Summary window
Window: Summary - Hotspots
Comparison Summary
in Hardware Events viewpoint
in Hotspots by CPU Utilization viewpoint
in Hotspots by Thread Concurrency viewpoint
in Locks and Waits viewpoint
in Memory Usage viewpoint
in Microarchitecture Exploration viewpoint
Summary window
super tiny band height
support
SVM usage type
symbol server
synchronization
View Stacks
Control Window Synchronization
between bottom-up and call stack panes
between top-down tree and bottom-up windows
synchronization APIs
syntax
amplxe-cl Command Syntax
for amplxe-cl tool
system analysis
Collect System-Wide Data from Command Line
from command line
system bandwidth
system functions
Call Stack Mode
view
system overview analysis
System Sleep States
Window: System Sleep States - Platform Power Analysis
window
Systrace* events
tab
Set Up Analysis Target
Window: Summary - Locks and Waits
Analysis Target
Summary
tabs
Window: Platform
Platform
target
target concurrency
target system
Set Up Analysis Target
WHERE: Analysis System
Linux*
prepare for analysis
Set Up Linux* System for Remote Analysis
Set Up Remote Windows* Target
sampling drivers
Windows*
target type
target-duration-type amplxe-cl option
target-install-dir amplxe-cl option
target-pid amplxe-cl option
target-process amplxe-cl option
target-system amplxe-cl option
target-tmp-dir amplxe-cl option
task analysis
Task Analysis
interpret
task APIs
task time
tasking
temperature
temperature metric
Temperature window
temporary directory
thread concurrency
thread naming APIs
thread oversubscription
threading analysis
Threading Analysis
Threading Efficiency View
about
interpret results
Threading Efficiency viewpoint
threshold
Change Threshold Values
change
time
Choose Data Format
overhead
self
spin
total
time scale
time-filter amplxe-cl option
timeline
Timeline pane
Manage Timeline View
context menu
manage
timer resolution
timer resolution metric
Timer Resolution window
tiny band height
toolbar
Toolbar: VTune Amplifier
filter
source/assembly
VTune Amplifier
Top-down Tree
Control Window Synchronization
synchronize with bottom-up window
Top-down Tree window
Top-down Tree window (compare mode)
topdown
Top-down Report
report
total iteration count
total time
Window: Top-down Tree
Total Time
Self Time and Total Time
for computing task
total time in c0 state
total time in non-c0 states
total time in s0 state
total wake-up count
total, gb/sec
trace-mpi amplxe-cl option
troubleshooting
Problem: Unexpected Paused Time
Problem: System Functions Appear in the User Functions Only Mode
Problem: Analysis of the .NET* Application Fails
Problem: Same Functions Are Compared As Different Instances
Error Message: Intel® Graphics Driver Is Obsolete
Problem: Unreadable Text in Intel VTune Amplifier on macOS*
Problem: Guessed Stack Frames
Troubleshooting
Problem: Stacks in Call Stack and Bottom-Up Panes Are Different
Problem: Stack in the Top-Down Tree Window Is Incorrect
Error Message: Cannot Open Data
drivers
troubleshooting
Problem: Cannot Access VTune Amplifier Documentation
Problem: No GPU Usage Data Is Collected
tune
MPI Code Analysis
MPI applications
tuning methodology
tutorials
typed memory read bandwidth, gb/sec
typed memory write bandwidth, gb/sec
typed reads coalescence
typed writes coalescence
Uncore Event Count window
unknown frames
Problem: Unknown Frames
User-Mode Sampling and Tracing Collection
unplug device
unplugged-mode amplxe-cl option
unreadable text
untyped memory read bandwidth, gb/sec
untyped memory write bandwidth, gb/sec
untyped reads coalescence
untyped writes coalescence
unwind
User-Mode Sampling and Tracing Collection
stack
User APIs
user tasks
user-data-dir amplxe-cl option
user-mode sampling and tracing analysis
user-mode sampling and tracing analysis from command line
utilization
Window: Summary - Hotspots
average
bar
utilization threshold
Change Threshold Values
change
utlb overhead
vector capacity usage
vector instruction set
vectorization
HPC Performance Characterization Analysis
Window: Summary - HPC Performance Characterization
HPC Performance Characterization View
verbose amplxe-cl option
version amplxe-cl option
view
Generate Command Line Reports
amplxe-cl reports
amplxe-cl results in GUI
analysis results
comparison data
inline functions
loops
output
power consumption collected remotely
source code difference
stack
stacks
View Source option
viewpoint
Microarchitecture Exploration View
Microarchitecture Exploration
viewpoints
Switch Viewpoints
CPU/FPGA Interaction
Hotspots
Hotspots by CPU Usage
HPC Performance Characterization
Memory Usage
Threading Efficiency
virtual environment
virtual machine performance analysis
vpu utilization
VS EU Active
VS EU Stall
VTune Amplifier
Introduction
command line interface
filenames and locations
reference help
standalone interface
wait count
wait rate
wait time
wake-ups
wake-ups/sec per core
Wakelocks window
warnings
Warnings about Accurate CPU Time Collection
in Intel VTune Amplifier
what is new
What's This Column option
Wind River Linux
windows
Manage Grid Views
Manage Data Views
Bandwidth
Caller/Callee
cannot find
Collection Log
Compare Results
Core Wake-ups
Correlate Metrics
CPU C/P states
Debug
Event Count
Graphics - GPU Hotspots
Graphics - Hotspots
Graphics C/P states
NC Device States
Platform
Sample Count
SC Device States
summary
Summary
Summary - Disk Input and Output
Summary - GPU Hotspots
Summary - Locks and Waits
System Sleep States
Temperature
Timer Resolution
Top-down Tree
Top-down Tree (compare mode)
Uncore Event Count
Wakelocks
Windows Store Applications support
Windows targets
work size
workflow
Intel® Xeon Phi™ Processor Targets
Android* Targets
Set Up Remote Linux* Target
Intel Xeon Phi processor analysis
remote analysis
Android* Targets
Set Up Remote Linux* Target
wrong stack
WSA support
yocto
Yocto Project
Configure Yocto Project* and Intel® VTune™ Amplifier with the Linux* Target Package
Configure Yocto Project*/Wind River* Linux* and Intel® VTune™ Amplifier with the Intel System Studio Integration Layer
Embedded Linux* Targets
Configure Yocto Project* and Intel® VTune™ Amplifier with the VTune Amplifier Integration Layer
Yocto*
zoom in
Manage Timeline View
timeline
zoom out
Manage Timeline View
timeline