Using the VTune™ Performance Analyzer Sampling Collector for Intel® Atom™ Processor based devices


Introduction

When collecting performance data for the Intel® VTune™ Performance Analyzer on an Intel® Atom™ Processor based device, the first consideration is that the device's usually small form factor makes it inconvenient to both sample and analyze the collected data on it. The obvious solution is to run a sampling collector on your Intel® Atom™ Processor based device and, instead of analyzing the data locally, transfer the sampling results file (*.tb5) to your development host machine.

The Intel® Application Software Development Tool Suite for Intel® Atom™ Processor includes a full-featured Intel® VTune™ Performance Analyzer for Linux* as well as a sampling collector for the Intel® Atom™ processor. Both components are license limited to permit usage only for Intel® Atom™ Processor targeted performance optimization work.
The Intel® VTune™ Performance Analyzer sampling data collector is a standalone command-line tool that provides event-based sampling (EBS) functionality on a local system. Hardware-based sampling is a low-overhead, system-wide profiling method that helps to identify which modules and functions consume the most time, giving a detailed look at the operating system and the application. The tool enables you to configure the data collection, perform the system-wide profiling, and store the results in a file.


Installation

The VTune™ Performance Analyzer sampling data collector installation routine checks whether the installed operating system matches one of the exact operating system versions of Moblin, Midinux* 2.0 or Ubuntu* Mobile Edition for which prebuilt sampling collectors are provided. To collect event-based sampling data it is necessary to read out the associated performance counters on the Intel® Atom™ Processor performance monitoring unit (PMU). This data is then exported from kernel space into user space, which can only be achieved by implementing the sampling collector as a kernel module.

If the target device does not have one of these operating systems installed in the exact expected version, the sampling collector kernel module has to be built on the target system or on a compatible system with the same kernel version.

  • Log into the target MID Linux* system as root
  • Copy the vtune91_target.tar.gz target image for the sampling collector onto the target and unpack it with "tar -xvzf vtune91_target.tar.gz"
  • Provided your target OS has a full GCC*, binutils and Linux headers installation as well as the libstdc++.so.5 C++ 5.0 library compatibility package, install the sampling collector by executing the install-vtune-sep.sh installation script and following the instructions it provides.
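
The installation steps above can be sketched as a small shell script. This is only a minimal sketch and not part of the product: the helper names missing_tools and install_collector are hypothetical, the location of install-vtune-sep.sh inside the unpacked archive is an assumption (the script simply searches for it), and only the presence of gcc, ld and tar is checked, not the Linux headers or the libstdc++.so.5 package.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; run as root on the target.

# Print the names of the given commands that are not found on PATH.
missing_tools() {
    missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    # Strip the leading space before printing.
    echo "${missing# }"
}

# Unpack the target image and start the installer.
# Assumption: install-vtune-sep.sh ends up somewhere below the current
# directory after unpacking; its exact path is not stated in this article.
install_collector() {
    archive=${1:-vtune91_target.tar.gz}
    lacking=$(missing_tools gcc ld tar)
    if [ -n "$lacking" ]; then
        echo "missing prerequisites: $lacking" >&2
        return 1
    fi
    tar -xvzf "$archive" || return 1
    installer=$(find . -name install-vtune-sep.sh | head -n 1)
    if [ -z "$installer" ]; then
        echo "install-vtune-sep.sh not found" >&2
        return 1
    fi
    # Run the installer from its own directory and follow its prompts.
    ( cd "$(dirname "$installer")" && ./install-vtune-sep.sh )
}
```

Calling install_collector with no argument uses vtune91_target.tar.gz from the current directory. Nothing runs until one of the functions is called, so the file can be sourced and inspected first.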

 


Sampling

After the installation is complete you may switch to the vdk subdirectory and run the script ./insmod-vtune.sh to load the sampling collector driver.

To perform the actual sampling you would then issue a command like sep -start -d 20 -out MyData.

This would start time based sampling immediately for 20 seconds and write the results to the tb5-file MyData.
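
The two steps above (loading the driver, then sampling for a fixed duration) could be wrapped as follows. This is a sketch under assumptions: the helper names run, load_driver and sample_for and the DRY_RUN variable are hypothetical, and sep is assumed to be on the PATH after installation.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; sep is assumed to be on PATH.

# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Load the sampling driver from the vdk subdirectory (default: ./vdk).
# The subshell keeps the caller's working directory unchanged.
load_driver() {
    ( cd "${1:-vdk}" && ./insmod-vtune.sh )
}

# Sample for a number of seconds and write the results to the named output.
# Defaults mirror the example in the text: 20 seconds, output MyData.
sample_for() {
    run sep -start -d "${1:-20}" -out "${2:-MyData}"
}
```

A typical call would be load_driver && sample_for 20 MyData. Setting DRY_RUN=1 beforehand makes sample_for print the sep command line instead of running it, which is handy for checking the options on the host before moving to the target.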

 

Generic Collector Options

 

[-d | -duration <in seconds>]

Specify duration for the sampling collection

 

[-nb | -non-blocking]

Switch to non-blocking mode

 

[-c | -count]

Count selected events

 

[-ce | -count-emon]

Count selected events in EMON style

 

[-rc | -rerun-based-on-count]

Run to calculate Sample After value

 

[-si | -sampling-interval <interval in milliseconds>]

Specify the amount of time between samples

 

[-sb | -sample-buffer-size <size in kilobytes>]

Specify the buffer-size for storing sample data

 

[-sd | -sampling-delay <delay in seconds>]

Specify delay of data collection

 

[-msc | -max-samples-to-collect <maximum number of samples to collect>]

Specify the total number of samples to collect before data collection stops

 

[-sm | -sampling-method <ebs|tbs>]

Select event or time-based sampling

 

[-sp | -start-paused]

Start data collection in paused mode

 

[-out | -output-file <file name>]

Specify the file name for the output file

 

[-of | -options-from-file <file name>]

Read the sampling collector options from a file

 

[-app <full-path-to-the-application>]

Specify the application to be launched for data collection

 

Event Specific Options

 

This section describes the options for selecting events.

[-ec | -event-config]

Configure the events that are sampled

 

Event configuration options begin with the -ec | -event-config switch. Specify the event(s) to monitor and embed the event names within single quotes ('). The [:modifier=val] option enables you to specify individual event modifiers along with the respective values for a given platform. The modifiers can be generic to an event as well as specific to a constraint (or an event qualifier). The constraint specific special modifiers appear after [/constraint=]. The modifier values can be in decimal or hexadecimal format. Only specific modifiers accept the value as a string. Each event specification is delimited by a comma (,).

[-dc | -data-config]

Configure the data that is collected

 

Syntax

[-ec | -event-config [-dc | -data-config <optional-data1>,<optional-data2>… ] '<event name1>':modifier1=val:modifier2=val/constraint1={:modifier3=val:modifier4=val}, '<event-name2>'...]
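
To make the syntax more concrete, the following sketch assembles an -ec argument from event names and modifier pairs. EVENT_ONE, EVENT_TWO, modifier1 and modifier2 are placeholders taken from the syntax line, not real event or modifier names, and the helper functions are hypothetical. Constraint-specific modifiers (the /constraint={...} part) are not covered. The result is printed as text for review rather than passed to the collector.

```shell
#!/bin/sh
# Sketch only: event and modifier names used with these helpers are
# placeholders; consult the collector's event list for real names.

# Print one event specification: the single-quoted event name followed
# by any modifier=value pairs given as further arguments.
event_spec() {
    spec="'$1'"
    shift
    for mod in "$@"; do
        spec="$spec:$mod"
    done
    echo "$spec"
}

# Join event specifications with commas (no spaces, so the list stays
# a single shell word).
join_events() {
    list=""
    for ev in "$@"; do
        list="${list:+$list,}$ev"
    done
    echo "$list"
}

# Print the complete -ec option for the given event specifications.
ec_option() {
    printf '%s\n' "-ec $(join_events "$@")"
}
```

For example, ec_option "$(event_spec EVENT_ONE modifier1=100)" "$(event_spec EVENT_TWO)" prints -ec 'EVENT_ONE':modifier1=100,'EVENT_TWO', which follows the pattern of the syntax line above.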

 


Analyzing Sampling Results

After copying the resulting tb5 file to the host development machine, launch /opt/intel/vtune/bin/vtlec and open the tb5 sampling result file. This gives you a full GUI view of the analysis, so you can identify the hotspots that may hurt performance and see where recoding work is most worthwhile.

 


Summary

The Intel® VTune™ Performance Analyzer Sampling Collector included in both the Intel® Application Software Development Tool Suite for Intel® Atom™ Processor and the Intel® Embedded Software Development Tool Suite for Intel® Atom™ Processor provides a convenient and straightforward way to locate potential performance hotspots that deserve a closer look for optimization opportunities. The sampling collector supports the cross-development model that the small form factor of MIDs and similar devices makes necessary for developing and testing end-user applications.

 

