To enable hardware event-based sampling analysis on your platform, the Intel® VTune™ Amplifier uses sampling drivers that require root privileges for installation on Linux* and Android* systems.
If the sampling drivers cannot be installed with the product (for example, because you do not have root privileges on the system), the VTune Amplifier uses the Perf* utility, which is part of the default VTune Amplifier installation package, and runs the hardware event-based sampling analysis in driverless mode.
VTune Amplifier is installed to your default account. For non-root users, the installer reports that the sampling driver cannot be installed and that some product features may therefore be limited or unavailable. To install the sampling driver, restart the installation under the root account or contact your administrator.
Prerequisites for Driverless Collection
VTune Amplifier can use the driverless Perf-based collection if the following requirements are satisfied:
Your system is based on kernel 2.6.32 or later, which exports CPU PMU programming details through the /sys/bus/event_source/devices/cpu/format file system.
Perf-based collection is enabled in the kernel with a /proc/sys/kernel/perf_event_paranoid value equal to or less than 1.
For uncore event analysis, uncore_* devices are available in the /sys/bus/event_source/devices folder.
VTune Amplifier sampling drivers are unavailable on the system.
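The file-based prerequisites above can be checked with a short script. The following is a minimal sketch, not part of the product: the function name and the returned keys are hypothetical, and it does not check the kernel version or whether the sampling drivers are loaded.

```python
import glob
import os


def check_driverless_prerequisites(sys_root="/sys", proc_root="/proc"):
    """Return a dict of prerequisite name -> bool for Perf-based collection.

    Hypothetical helper: the roots are parameters only so that the check
    can be pointed at a test directory tree.
    """
    devices = os.path.join(sys_root, "bus", "event_source", "devices")
    paranoid_path = os.path.join(
        proc_root, "sys", "kernel", "perf_event_paranoid"
    )

    # The kernel exports CPU PMU programming details in this directory.
    pmu_format = os.path.isdir(os.path.join(devices, "cpu", "format"))

    # Perf-based collection needs perf_event_paranoid to be 1 or less.
    try:
        with open(paranoid_path) as f:
            paranoid = int(f.read().strip())
    except (OSError, ValueError):
        paranoid = None

    # Uncore event analysis needs at least one uncore_* device.
    uncore = bool(glob.glob(os.path.join(devices, "uncore_*")))

    return {
        "cpu_pmu_format_exported": pmu_format,
        "perf_enabled": paranoid is not None and paranoid <= 1,
        "uncore_devices_available": uncore,
    }


if __name__ == "__main__":
    for name, ok in check_driverless_prerequisites().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

A "no" for uncore_devices_available only affects uncore event analysis; the other two checks apply to any driverless collection.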
Driverless Collection Modes
VTune Amplifier supports the following Perf-based collection types:
Driverless Perf per-process sampling collects samples for a single process and/or its children and can be done simultaneously from multiple monitoring processes. Since it requires per-process virtualization of performance counters, it can incur more overhead than system-wide collection. Typically, a system has this type of collection enabled by default.
Driverless Perf system-wide sampling is performed by one monitoring process for the whole system. It usually has less overhead since it does not require virtualizing counters per process. This collection type can collect uncore counters and requires kernel configuration.
Driverless Perf per-process counting provides event counting statistics over an interval for a single process or its children. Event counting can be done simultaneously from multiple monitoring processes. Since it requires per-process virtualization of performance counters, it can incur more overhead than system-wide collection. Typically, a system has this type of collection enabled by default.
Driverless Perf system-wide counting provides event counting statistics performed by one monitoring process for the whole system over an interval. It usually has less overhead since it does not require virtualizing counters per process. This collection type can collect uncore counters and requires kernel configuration.
To configure system-wide driverless collection:
Set the /proc/sys/kernel/perf_event_paranoid value to 0 or less. Root privileges are required.
To resolve kernel modules, make sure you have sufficient permissions to read kernel symbol information from the /proc/kallsyms file.
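The two configuration steps above can be verified from an unprivileged account with a sketch like the one below. The thresholds (0 or less for system-wide, 1 or less for per-process) come from this topic. The function names are hypothetical. The test for all-zero addresses in /proc/kallsyms is an assumption about how the kernel hides symbol addresses from users without sufficient permissions; it is not described in this topic.

```python
def read_paranoid(path="/proc/sys/kernel/perf_event_paranoid"):
    """Return the perf_event_paranoid value, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def collection_scope(paranoid):
    """Map a perf_event_paranoid value to the widest driverless scope."""
    if paranoid is None or paranoid > 1:
        return "unavailable"
    return "system-wide" if paranoid <= 0 else "per-process"


def kernel_symbols_visible(path="/proc/kallsyms", sample=100):
    """Return True if any of the first `sample` symbols has a nonzero address.

    Assumption: a kernel that restricts symbol access still lets the file
    be opened but reports every address as zero.
    """
    try:
        with open(path) as f:
            for _, line in zip(range(sample), f):
                fields = line.split()
                if fields and int(fields[0], 16) != 0:
                    return True
    except (OSError, ValueError):
        return False
    return False


if __name__ == "__main__":
    print("collection scope:", collection_scope(read_paranoid()))
    print("kernel symbols visible:", kernel_symbols_visible())
```

If the scope is reported as per-process, an administrator needs to lower the perf_event_paranoid value before system-wide driverless collection can be used.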
To check the data collection type used for your analysis:
Scroll down to the Collection and Platform Info section in the Summary window and check the Collector Type value:
Perf-based driverless collection is applicable to all hardware event-based sampling analysis types, such as Hotspots (hardware event-based sampling mode), Microarchitecture Exploration, and Custom event-based sampling analysis types on Linux and Android OS. If uncore event support is available on the system, VTune Amplifier also uses the Perf collection for Memory Access, HPC Performance Characterization, and Microarchitecture Exploration analysis types with the Analyze memory bandwidth option enabled.
The following limitations also apply to driverless collection:
Since driverless collection is based on the Linux Perf functionality, all Perf limitations fully apply to the VTune Amplifier sampling analysis as well. For example, operating system limits on the maximum number of files opened by a process and on the maximum memory mapped to a process address space still apply and may affect Perf-based profiling. For more information, see the Tutorial: Troubleshooting and Tips topic at https://perf.wiki.kernel.org/index.php/Main_Page.
Local and remote Launch Application, Attach to Process, and Profile System target types are supported, but this support depends entirely on the Linux Perf profiling credentials specified in the /proc/sys/kernel/perf_event_paranoid file and managed by the administrator of your system using root credentials. For more information, see the perf_event related configuration files topic at http://man7.org/linux/man-pages/man2/perf_event_open.2.html. By default, only profiling of user processes, in both user and kernel space, is permitted, so you need wider profiling credentials granted through the perf_event_paranoid file to use the Profile System target type.
Memory bandwidth analysis is not supported on Intel Atom® processors.
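The operating system limits mentioned in the first limitation above can be inspected from Python with the standard resource module. This is a minimal sketch: the helper names are hypothetical, and which limits actually constrain a given Perf-based collection depends on the system and the workload.

```python
import resource


def describe(limit):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


def perf_related_limits():
    """Return (soft, hard) limits for open files and address-space size."""
    return {
        "open_files": resource.getrlimit(resource.RLIMIT_NOFILE),
        "address_space_bytes": resource.getrlimit(resource.RLIMIT_AS),
    }


if __name__ == "__main__":
    for name, (soft, hard) in perf_related_limits().items():
        print(f"{name}: soft={describe(soft)} hard={describe(hard)}")
```

The soft limit is the value currently enforced for the process; an unprivileged user can raise it only up to the hard limit.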
Run the <install-dir>/bin64/amplxe-self-checker.sh script to find out which analysis types can be collected on your system. The script output helps you identify limitations and provides advice on fixing them.