Diagnose Memory, Storage & Data Plane Bottlenecks

Not all workloads are compute-bound. Intel® VTune™ Profiler has specialized analyses for optimizing the use of memory and I/O bandwidth.

Optimize Bandwidth-Limited Software

Use the timeline to see the spikes in bandwidth used for DRAM and Intel® QuickPath Interconnect. To see which functions are consuming bandwidth at a specific time, select a spike in the timeline and filter on the selection. This lets you isolate the individual contributors to bandwidth consumption and tune effectively.

Functions that are significantly memory bound are highlighted in pink.
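
As a rough illustration of what filtering on a timeline selection does, the sketch below sums traffic per function for samples that fall inside a chosen time window. The sample tuples and function names are hypothetical; this is not the profiler's data format or API.

```python
from collections import defaultdict


def top_consumers(samples, start, end):
    """Sum bytes per function for samples whose timestamp falls in [start, end]."""
    totals = defaultdict(int)
    for timestamp, function, nbytes in samples:
        if start <= timestamp <= end:
            totals[function] += nbytes
    # Largest consumer first, so the main contributors to the spike lead the list.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical samples: (timestamp in seconds, function name, bytes transferred).
samples = [
    (0.1, "parse", 10),
    (1.2, "transpose", 400),
    (1.3, "transpose", 350),
    (1.4, "reduce", 120),
    (2.5, "parse", 15),
]
spike = top_consumers(samples, start=1.0, end=2.0)
```

Here `spike` lists only the functions active between 1.0 and 2.0 seconds, which mirrors how narrowing the selection isolates the contributors to a single spike.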

Identify Which Memory Objects Are Bottlenecks

A typical hotspot analysis shows the code that takes the most time. The Memory Access analysis offers a different perspective: it shows which memory objects cause performance issues, regardless of where in the code they are accessed. This can yield new insight into how to improve performance.

Available for Linux* targets only.

Tune Non-Uniform Memory Access (NUMA)

Some memory accesses are slower than others. For example, on a two-socket system, latency is higher when a core in socket 0 accesses memory that is attached to socket 1. Memory Access analysis in Intel VTune Profiler shows both local memory accesses (which are fast) and remote memory accesses (which are slower), so you can identify frequently accessed data that is stored remotely and reconsider how you allocate memory. Changing your memory allocation to increase local access may improve performance.
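
To make the local versus remote distinction concrete, here is a minimal sketch that flags memory objects whose accesses are mostly remote. The object names, counts, and the 50 percent threshold are assumptions chosen for illustration, not values or formats produced by the profiler.

```python
from dataclasses import dataclass


@dataclass
class ObjectAccesses:
    """Hypothetical per-object access counts (illustrative only)."""
    name: str
    local: int   # accesses served by memory on the accessing core's socket
    remote: int  # accesses served by memory attached to another socket

    @property
    def remote_ratio(self) -> float:
        total = self.local + self.remote
        return self.remote / total if total else 0.0


def remote_heavy(objects, threshold=0.5):
    """Return objects whose remote share exceeds threshold, most remote accesses first."""
    flagged = [o for o in objects if o.remote_ratio > threshold]
    return sorted(flagged, key=lambda o: o.remote, reverse=True)


samples = [
    ObjectAccesses("lookup_table", local=900, remote=100),
    ObjectAccesses("work_queue", local=200, remote=800),
    ObjectAccesses("frame_buffer", local=50, remote=450),
]
candidates = [o.name for o in remote_heavy(samples)]
```

Objects that end up in `candidates` are the ones where a different allocation policy, such as allocating on the socket that uses the data, is most likely worth trying.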

Design & Optimize for Persistent Memory

Memory Access analysis also helps you decide which objects to allocate in Intel® Optane™ DC persistent memory. Place the hottest objects in DRAM, warm objects in persistent memory, and cold objects on an SSD or disk.

Available for Linux* targets only.
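
The hot, warm, and cold placement rule can be sketched as a simple greedy assignment. This is one possible heuristic under stated assumptions, not the method the profiler uses: the object list, sizes, access counts, and capacities are all hypothetical, and ranking by accesses per unit of size is a design choice made for the example.

```python
def assign_tiers(objects, dram_capacity, pmem_capacity):
    """Greedy placement: hottest objects to DRAM, then persistent memory, rest to disk.

    objects: list of (name, size, access_count) tuples with size > 0.
    Sizes and capacities share one unit (for example, GiB).
    """
    placement = {}
    free = {"dram": dram_capacity, "pmem": pmem_capacity}
    # Rank by access density (accesses per unit of size) so small, hot objects win DRAM.
    ranked = sorted(objects, key=lambda o: o[2] / o[1], reverse=True)
    for name, size, _ in ranked:
        for tier in ("dram", "pmem"):
            if size <= free[tier]:
                free[tier] -= size
                placement[name] = tier
                break
        else:
            # Neither memory tier has room left, so the object stays on SSD or disk.
            placement[name] = "disk"
    return placement


# Hypothetical objects: (name, size, access count).
objects = [
    ("index", 4, 4000),
    ("cache", 8, 2000),
    ("history", 16, 160),
    ("archive", 64, 64),
]
plan = assign_tiers(objects, dram_capacity=16, pmem_capacity=16)
```

With these made-up numbers, the two densest objects fit in DRAM, the next one fills persistent memory, and the largest, coldest object falls through to disk.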

Uncover I/O Bottlenecks

Determine whether your application is I/O-bound or CPU-bound by exploring imbalance between I/O operations (synchronous and asynchronous) and compute. See when the CPU is waiting for I/O, and see storage accesses mapped to the source code.
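
A crude way to see the same imbalance without a profiler is to compare CPU time with wall-clock time for a piece of work. The sketch below does that with the Python standard library; the 0.5 threshold and the two sample workloads are assumptions for illustration, and the result is far coarser than a per-function, per-device analysis.

```python
import time


def cpu_share(workload):
    """Run workload() and return process CPU time divided by wall-clock time."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    workload()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall if wall > 0 else 0.0


def classify(share, threshold=0.5):
    """Rough label: a low CPU share suggests time spent waiting, for example on I/O."""
    return "cpu-bound" if share >= threshold else "likely waiting on I/O"


def simulated_io_wait():
    # Stands in for a blocking read; sleeping consumes almost no CPU time.
    time.sleep(0.2)


def busy_compute():
    # Pure computation with no waiting.
    sum(i * i for i in range(500_000))


io_label = classify(cpu_share(simulated_io_wait))
cpu_label = classify(cpu_share(busy_compute))
```

On a busy machine the compute case can be skewed by scheduling, so treat the labels as a first hint rather than a measurement.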

Sliders on the histogram control the display of data in the grid and on the timeline, making data analysis easier.

Tune Polled I/O Using the Data Plane Development Kit (DPDK) & Storage Performance Development Kit (SPDK)

DPDK and SPDK are built for fast packet processing and high-performance storage. Both operate in polled mode instead of using interrupts: the application repeatedly checks for more work from the NIC (DPDK) or the SSD (SPDK). Because polling always keeps CPU utilization at 100 percent, most profilers cannot tell whether a thread is heavily or lightly loaded. Intel VTune Profiler can track cycles where no work is done, so it can show you which threads are heavily loaded and which are not.
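
The toy loop below shows why utilization alone is misleading for polled I/O: the loop spins for the same number of iterations whether or not work arrives, and only the split between useful and empty polls reveals the real load. It is a conceptual sketch in plain Python, not DPDK or SPDK code, and it does not reflect how the profiler collects its data.

```python
from collections import deque


def poll_loop(queue, iterations):
    """Busy-poll a queue; count polls that found work versus polls that found nothing.

    The loop never blocks, so the thread looks fully busy either way.
    """
    busy = idle = 0
    for _ in range(iterations):
        if queue:
            queue.popleft()  # "process" one packet or I/O completion
            busy += 1
        else:
            idle += 1        # wasted poll: no work was available
    return busy, idle


def useful_fraction(busy, idle):
    """Share of polls that did real work."""
    total = busy + idle
    return busy / total if total else 0.0


# Two hypothetical threads, each polling 1000 times with different amounts of work.
heavy = useful_fraction(*poll_loop(deque(range(900)), 1000))
light = useful_fraction(*poll_loop(deque(range(50)), 1000))
```

Both simulated threads run the full 1000 iterations, yet `heavy` and `light` differ widely, which is the distinction a utilization-only view hides.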

Analysis with SPDK goes beyond simple aggregate data for the I/O channel and details the data for each attached device. This gives you a more detailed picture of complex I/O workloads.

Set up I/O analysis with just a few clicks.


Determine Which Systems Benefit from Faster Storage

Storage Performance Snapshot shows system storage bottlenecks for servers and workstations with directly attached storage. Easy to install, this tool helps you determine which workloads need further analysis and where faster storage improves performance. This snapshot comes with Intel VTune Profiler and is also available separately to facilitate a quick system check.

Get a quick view of:

  • I/O boundedness
  • Storage and network saturation
  • CPU utilization
  • Memory capacity saturation

Get system data while running workloads to see how migration to Serial ATA and PCIe* SSDs can offer better solutions, user experiences, and performance density.


Collect data on Windows* or Linux systems and view the results in a web browser.

