Intel® VTune™ Amplifier
In this mode, Intel® VTune™ Amplifier collects two traces in parallel: a system-wide performance data trace on the host and an OS-level event trace on the guest system. These traces are merged into a single VTune Amplifier result and provide:
simultaneous analysis, performed from the host, of user-space activity (processes, threads, functions) on the guest system;
accurate attribution of the collected data to the user processes running on the guest, based on timestamp synchronization.
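As a conceptual illustration of timestamp-based attribution, the sketch below maps host-side sample timestamps onto the guest thread that was running at that moment, using a sorted list of guest context-switch events. It is not VTune Amplifier's implementation: the function name `attribute_samples`, the `clock_offset` parameter, the placeholder label `HYPERVISOR`, the `kworker` thread name, and all timestamps are assumptions made for the example.

```python
from bisect import bisect_right
from collections import Counter

# Placeholder label (an assumption) for samples that cannot be mapped to a guest thread.
HYPERVISOR = "hypervisor thread (host)"


def attribute_samples(host_samples, guest_switches, clock_offset=0):
    """Count host samples per guest thread running at each sample time.

    host_samples:   iterable of timestamps taken on the host clock.
    guest_switches: list of (timestamp, thread_name) context-switch events
                    on the guest clock, sorted by timestamp.
    clock_offset:   value added to guest timestamps to express them on the
                    host clock (the synchronization step, assumed known).
    """
    switch_times = [ts + clock_offset for ts, _ in guest_switches]
    counts = Counter()
    for sample_ts in host_samples:
        # Index of the last context switch at or before the sample.
        idx = bisect_right(switch_times, sample_ts) - 1
        if idx < 0:
            # No guest context switch seen yet: the running thread is unknown.
            counts[HYPERVISOR] += 1
        else:
            counts[guest_switches[idx][1]] += 1
    return counts


# Hypothetical data: three guest context switches and six host samples.
switches = [(100, "app-in-vm"), (200, "kworker"), (300, "app-in-vm")]
samples = [50, 120, 150, 210, 310, 320]
result = attribute_samples(samples, switches)
print(dict(result))
```

In this sketch, a sample taken before the first recorded context switch has no guest thread to map to and falls under the placeholder label, which is one way a sample can end up attributed to something other than a guest thread.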
This usage mode provides the following advantages:
VMs are not required to virtualize performance counters. All performance analysis features are available to VM users out of the box.
Sampling drivers (VTune Amplifier sampling driver or Perf*) do not need to be installed on a guest VM.
To enable KVM kernel and user space profiling from the host:
Explore the collected data by enabling all the grouping levels containing a VM component to differentiate the host and target data.
Example 1: Hotspots Analysis (Hardware Event-Based Sampling Mode)
Analyze hotspots for both an application launched from the Linux host, app-from-host, and an application launched on the KVM guest system, app-in-vm:
Example 2: Microarchitecture Exploration Analysis
Analyze the Microarchitecture Usage efficiency of the application launched on the KVM guest system. The context summary in the right pane shows the hardware metrics for the thread (launched inside the KVM guest) selected in the grid:
The minimum Linux kernel version for the host system is 4.9.
debugfs must be mounted on both the host and the guest system.
Regardless of how many KVM/QEMU processes are running, only one running VM instance can be profiled.
In the result view, threads with the same name may be grouped into one process (ftrace).
In the result view, samples before the first context switch may be attributed to the hypervisor thread on the host.
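The kernel-version and debugfs requirements listed above can be checked before starting a collection. The following is a minimal sketch, assuming a Linux system that exposes its mount table at `/proc/mounts`. The helper names `kernel_at_least`, `debugfs_mounted`, and `read_mounts` are hypothetical and are not part of VTune Amplifier.

```python
import platform
import re

MIN_HOST_KERNEL = (4, 9)  # minimum host kernel version stated in this section


def kernel_at_least(release, minimum=MIN_HOST_KERNEL):
    """Return True if a release string such as '5.15.0-91-generic' is >= minimum."""
    match = re.match(r"(\d+)(?:\.(\d+))?", release)
    if match is None:
        return False
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    # Compare as integer tuples so that 4.10 correctly ranks above 4.9.
    return (major, minor) >= minimum


def debugfs_mounted(mounts_text):
    """Return True if /proc/mounts-style text contains a debugfs mount."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "debugfs":
            return True
    return False


def read_mounts(path="/proc/mounts"):
    """Return the mount table as text, or an empty string if it is unreadable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    print("kernel >= 4.9:", kernel_at_least(platform.release()))
    print("debugfs mounted:", debugfs_mounted(read_mounts()))
```

Run the script on the host to check both conditions. On the guest, only the debugfs result is relevant, because the kernel version requirement applies to the host system.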