Address Unique Needs in Cloud & HPC Profiling

Achieving the best performance for a high-performance computing (HPC) application requires a careful balance of Message Passing Interface (MPI) parallelism, threading, vectorization, memory access, and more. Intel® VTune™ Profiler provides specialized HPC analyses that let developers start with a quick snapshot and then, if needed, get more detail. Software architects tuning the performance of cloud applications will appreciate the ability to profile a running Java* process in a container.

Get a Quick Performance Snapshot

Analyze MPI and non-MPI applications (Linux* only).

The application performance snapshot features:

  • Lightweight, low overhead profiling
  • Scalable profiling that detects performance variation across a large number of ranks
  • Key metrics, such as MPI and OpenMP* imbalance, low floating-point utilization, communication patterns, and memory stalls

Determine whether this workload will benefit from tuning by viewing all the data in one place.
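As a rough illustration of the snapshot workflow, the sketch below builds the command lines for collecting a snapshot and then generating its report. It assumes the `aps` launcher is on the PATH and that `mpirun` is the MPI launcher in use; the application name `./my_app` and the result directory name are hypothetical placeholders, and the exact options should be checked against the installed version.

```python
import shlex


def aps_collect_command(app_args, mpi_ranks=None, launcher="mpirun"):
    """Build the command that runs an application under the snapshot tool.

    app_args  : the application and its arguments, e.g. ["./my_app"] (hypothetical)
    mpi_ranks : number of MPI ranks, or None for a non-MPI run
    launcher  : MPI launcher name; "mpirun" is an assumption about the site setup
    """
    cmd = ["aps", *app_args]
    if mpi_ranks is not None:
        # For MPI runs, the snapshot tool is placed between the launcher and the app.
        cmd = [launcher, "-n", str(mpi_ranks), *cmd]
    return cmd


def aps_report_command(result_dir):
    """Build the command that turns a collected result directory into a report."""
    return ["aps", f"--report={result_dir}"]


# Print the commands instead of running them, so the sketch works without the tool.
collect_cmd = aps_collect_command(["./my_app"], mpi_ranks=4)
report_cmd = aps_report_command("aps_result_dir")  # placeholder directory name
print(shlex.join(collect_cmd))
print(shlex.join(report_cmd))
```

Passing either list to `subprocess.run` would execute it on a system where the tool is installed.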

Deeper Analysis with Actionable Detail

See a summary of key HPC performance attributes: MPI efficiency, threading efficiency, memory access efficiency, and floating-point utilization. Then dive into the details and optimize the highest impact items first.

Use the HPC analysis to get a fast overview of critical metrics for modern hardware performance or get a more in-depth analysis for each one.

The summary now includes improved vectorization metrics, process and thread affinity, and a preview of Lustre* parallel file I/O metrics.

Easier Multirank Analysis of MPI and OpenMP*

For hybrid MPI and OpenMP applications, it is important to explore OpenMP inefficiency along with MPI communication between ranks. The lower a rank's communication spin time, the more time that rank spends executing, and the greater the impact of OpenMP tuning on it.
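The reasoning about spin time can be sketched as a small ranking exercise. This is an illustration only, not how the profiler computes its metrics: the `RankTimes` type, its field names, and all numbers are hypothetical. It orders ranks so that those with the least MPI communication spin time, and therefore the most time spent executing, come first as candidates for OpenMP analysis.

```python
from dataclasses import dataclass


@dataclass
class RankTimes:
    rank: int
    elapsed_s: float   # wall-clock time of the rank, in seconds
    mpi_spin_s: float  # time spent spinning in MPI communication, in seconds


def executing_fraction(r):
    """Fraction of the rank's elapsed time that is not MPI communication spin."""
    return (r.elapsed_s - r.mpi_spin_s) / r.elapsed_s


def ranks_by_openmp_priority(ranks):
    """Order ranks from lowest to highest MPI spin time.

    A rank with little spin time spends most of its time executing,
    so OpenMP tuning on it has the most room to pay off.
    """
    return sorted(ranks, key=lambda r: r.mpi_spin_s)


# Hypothetical measurements for three ranks.
samples = [
    RankTimes(rank=0, elapsed_s=100.0, mpi_spin_s=40.0),
    RankTimes(rank=1, elapsed_s=100.0, mpi_spin_s=5.0),
    RankTimes(rank=2, elapsed_s=100.0, mpi_spin_s=20.0),
]

ordered = ranks_by_openmp_priority(samples)
for r in ordered:
    print(f"rank {r.rank}: executing {executing_fraction(r):.0%} of elapsed time")
```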

Intel VTune Profiler can be installed on a cluster. For further tuning of MPI, use Intel® Trace Analyzer and Collector.

The list shows OpenMP regions where performance tuning can significantly reduce execution time, with the highest impact regions shown first.

Optimize Private Cloud-Based Applications

Profile enterprise applications written in Java* or in native languages like C, C++, and Fortran. Profile running Java services (such as mail and daemons) without restarting them. Popular container technologies, including Docker*, Mesos*, and LXC*, are supported.

Intel VTune Profiler can easily attach to an application running in a container to collect profiling data.
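Attaching to an already running process can be sketched from the command line as follows. This assumes the `vtune` command-line tool is on the PATH and that the process ID of the target (for example, a Java service running inside a container) is already known. The helper name, the default analysis type, and the result directory name are illustrative choices, and the options should be verified against the installed version.

```python
def vtune_attach_command(pid, result_dir, analysis="hotspots", duration_s=None):
    """Build a command line that attaches the profiler to a running process.

    pid        : process ID of the running target
    result_dir : directory where the collected result is written (placeholder name)
    analysis   : analysis type; "hotspots" is an illustrative default
    duration_s : optional collection time in seconds; if omitted, no limit is passed
    """
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    cmd = [
        "vtune",
        "-collect", analysis,
        "-target-pid", str(pid),
        "-result-dir", result_dir,
    ]
    if duration_s is not None:
        cmd += ["-duration", str(duration_s)]
    return cmd


# Build (but do not run) a 30-second attach to a hypothetical process 12345.
attach_cmd = vtune_attach_command(12345, "java_service_result", duration_s=30)
print(" ".join(attach_cmd))
```

Because the process is attached to rather than launched, the service keeps running and does not need to be restarted.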

Additional Capabilities

Single Thread

Optimize single-threaded performance.


Effectively use all available cores.


See a system-level view of application performance.

Media & OpenCL™ Applications

Deliver high-performance image and video processing pipelines.

Memory & Storage Management

Diagnose memory, storage, and data plane bottlenecks.

Analyze & Filter Data

Mine data for answers.


Fits your environment and workflow.


Product and Performance Information


Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include the SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804