Using Intel® VTune™ Amplifier 2014 for Systems on an embedded Haswell processor

Introduction

Intel® System Studio is Intel’s suite of software tools for embedded processors. Intel VTune™ Amplifier 2014 for Systems is a part of Intel System Studio. This article will show how you can profile an embedded 4th generation Intel® Core™ processor (code named Haswell).

Background

This article uses Yocto Project* 1.5 as the operating system, which includes support for building systems that run on Haswell. Before using VTune Amplifier 2014 for Systems, you first need to build the Yocto Project* operating system. This step is necessary because the VTune Amplifier device drivers must be built against the kernel sources. To build the Yocto Project* operating system, see the getting started guide: http://www.yoctoproject.org/docs/current/yocto-project-qs/yocto-project-qs.html. You should first install the cross compiler toolchain (default is /opt/poky/1.5) and source /opt/poky/1.5/environment-setup-x86_64-poky-linux.

Run the following commands to set up a build environment for Yocto Project 1.5.

     $ wget http://downloads.yoctoproject.org/releases/yocto/yocto-1.5.1/poky-dora-10.0.1.tar.bz2

     $ tar xjf poky-dora-10.0.1.tar.bz2

     $ cd poky-dora-10.0.1

     $ source oe-init-build-env

Next, edit the file build/conf/local.conf and set the MACHINE variable to genericx86-64. Then run the command "bitbake core-image-sato" to build the sources and the image that will run on your Haswell processor.
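The edit to local.conf can also be scripted. The following is a minimal sketch, not part of the Yocto Project tooling: set_machine is a hypothetical helper name, and it assumes that an uncommented MACHINE assignment should be replaced by a single plain assignment.

```shell
# set_machine FILE MACHINE
# Hypothetical helper: drop any uncommented MACHINE assignment (=, ?=, ??=)
# from FILE and append one plain assignment. Commented-out examples are kept.
set_machine() {
    conf_file=$1
    machine=$2
    [ -f "$conf_file" ] || { echo "no such file: $conf_file" >&2; return 1; }
    # grep -v exits non-zero when it prints nothing, so ignore its status.
    grep -v '^[[:space:]]*MACHINE[[:space:]]*?*=' "$conf_file" > "$conf_file.tmp" || true
    printf 'MACHINE = "%s"\n' "$machine" >> "$conf_file.tmp"
    mv "$conf_file.tmp" "$conf_file"
}

# Usage, from inside the build directory that oe-init-build-env changes into:
#   set_machine conf/local.conf genericx86-64
#   bitbake core-image-sato
```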

Building the VTune Amplifier 2014 for Systems device drivers

Install the cross compiler toolchain (default is /opt/poky/1.5)

source /opt/poky/1.5/environment-setup-x86_64-poky-linux

Build the sepdk sampling driver:

./build-driver --c-compiler=x86_64-poky-linux-gcc --kernel-src-dir=/work/yocto/genericx86-64-dora-10.0.0/build/tmp/work/genericx86_64-poky-linux/linux-yocto/3.10.11+gitAUTOINC+363bd856c8_702040ac7c-r0/linux-genericx86-64-standard-build --make-args="PLATFORM=x32_64 ARITY=smp" --kernel-version=3.10.11-yocto-standard -ni

Build the apwr power driver:

./build-driver --c-compiler=x86_64-poky-linux-gcc --kernel-src-dir=/work/yocto/genericx86-64-dora-10.0.0/build/tmp/work/genericx86_64-poky-linux/linux-yocto/3.10.11+gitAUTOINC+363bd856c8_702040ac7c-r0/linux-genericx86-64-standard-build/  --make-args="ARITY=smp KERNEL_VERSION_FULL=3.10.11-yocto-standard PLATFORM=x32_64" -ni
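Both build commands share the same compiler, kernel build directory, and kernel version, so it can help to keep those values in one place. The sketch below assumes the paths shown above; the function names and the BUILD_DRIVER override are hypothetical conveniences, not part of the product.

```shell
# Values copied from the commands above; adjust them to your own build tree.
KERNEL_SRC=/work/yocto/genericx86-64-dora-10.0.0/build/tmp/work/genericx86_64-poky-linux/linux-yocto/3.10.11+gitAUTOINC+363bd856c8_702040ac7c-r0/linux-genericx86-64-standard-build
KERNEL_VERSION=3.10.11-yocto-standard
CROSS_CC=x86_64-poky-linux-gcc
# Hypothetical override hook: set BUILD_DRIVER=echo to print the arguments
# instead of running the build.
BUILD_DRIVER=${BUILD_DRIVER:-./build-driver}

# Build the sepdk sampling driver. Call this from the directory that
# contains the sepdk build-driver script.
build_sepdk() {
    "$BUILD_DRIVER" --c-compiler="$CROSS_CC" --kernel-src-dir="$KERNEL_SRC" \
        --make-args="PLATFORM=x32_64 ARITY=smp" \
        --kernel-version="$KERNEL_VERSION" -ni
}

# Build the apwr power driver. Call this from the directory that
# contains the power driver's build-driver script.
build_apwr() {
    "$BUILD_DRIVER" --c-compiler="$CROSS_CC" --kernel-src-dir="$KERNEL_SRC" \
        --make-args="ARITY=smp KERNEL_VERSION_FULL=$KERNEL_VERSION PLATFORM=x32_64" -ni
}
```

Keeping the kernel directory in a variable also avoids copy-and-paste slips in the long path.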

Loading the device drivers on your system

  1. Copy the sepdk and powerdk directories to your embedded system using scp.
    1. sepdk
      1. scp -r sepdk root@target-ip:/home/root
      2. Log in to the target
        1. cd /home/root/sepdk/src
        2. ./insmod-sep3 -re
    2. powerdk
      1. scp -r powerdk root@target-ip:/home/root
      2. Log in to the target
        1. cd /home/root/powerdk/
        2. ./insmod-apwr

VTune™ Amplifier for Systems target executables do not work on Yocto Project 1.5 x64 as installed, because the dynamic loader (ld) is at a different path than the executables expect.

They fail with the message:

-sh: ./amplxe-runss: No such file or directory

Usually ld is located at /lib64/ld-linux-x86-64.so.2, but on Yocto 1.5 x64 it is located at /lib/ld-linux-x86-64.so.2.

  • The workaround is to create /lib64/ld-linux-x86-64.so.2 as a symlink to /lib/ld-linux-x86-64.so.2.
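The workaround can be applied on the target with a few shell commands. This is a minimal sketch: link_loader is a hypothetical helper name, and the optional ROOT argument is an assumption added so the function can be tried against a staging directory instead of the live root filesystem.

```shell
# link_loader [ROOT]
# Create ROOT/lib64/ld-linux-x86-64.so.2 as a symlink to
# /lib/ld-linux-x86-64.so.2. ROOT defaults to the live root filesystem.
link_loader() {
    root=${1:-}
    src="$root/lib/ld-linux-x86-64.so.2"
    dst="$root/lib64/ld-linux-x86-64.so.2"
    [ -e "$src" ] || { echo "missing $src" >&2; return 1; }
    # Nothing to do if the path already exists (file or symlink).
    if [ -e "$dst" ] || [ -L "$dst" ]; then
        return 0
    fi
    mkdir -p "$root/lib64"
    ln -s /lib/ld-linux-x86-64.so.2 "$dst"
}

# Usage, as root on the target:
#   link_loader
```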

Running a remote collection

The amplxe-cl program now contains an option to specify a remote target. amplxe-cl supports many types of collections. Before running amplxe-cl, complete the following steps.

  1. Build and load the performance sampling driver (sepdk) for your target.
  2. Copy the target remote directories to your target.
  3. Set the AMPLXE_TARGET_PRODUCT_DIR environment variable to the location where you copied the directory in step 2.
  4. Set up your target so that it does not require an ssh password.
    1. cat ~/.ssh/id_dsa.pub | ssh root@ip_target "cat >> /home/root/.ssh/authorized_keys"
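Before starting a collection, it can be useful to confirm that step 4 worked. The sketch below uses the standard OpenSSH BatchMode option, which makes ssh fail instead of prompting for a password. The function name and the SSH override variable are hypothetical.

```shell
# Hypothetical override hook so a different ssh client (or a stub) can be used.
SSH=${SSH:-ssh}

# can_login_without_password USER@HOST
# Returns 0 if key-based login succeeds without any prompt, non-zero otherwise.
can_login_without_password() {
    "$SSH" -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Usage:
#   can_login_without_password root@ip_target || echo "key login not set up" >&2
```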

For this article we will run an advanced hotspots collection. Note: You can get help for amplxe-cl by running amplxe-cl -help. You can get a full list of supported collection types by running amplxe-cl -help collect.

    amplxe-cl -target ssh:root@ip_addr -collect advanced-hotspots -d 5 --search-dir bin:p=<local directory containing modules>

  • -target ssh:root@ip_addr
    • Specifies a remote collection over ssh to the system running at internet address ip_addr.
  • -collect advanced-hotspots
    • Specifies an advanced hotspots collection, which shows where your program is spending its time.
  • -d 5
    • Specifies that the collection will run for 5 seconds.
  • --search-dir bin:p=<local directory containing modules>
    • For a system-wide collection, many of the modules running on the system during the collection are copied from the target to the host, which may take a while. This happens only once, because amplxe-cl caches the target system modules on the host for faster access during the next collection. If you do not want the command to take the modules from the device, you can specify a local directory on the host to be searched first, as above. See the VTune Amplifier 2014 for Systems help for more information.
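The command line described above can be wrapped in a small function so that the target, duration, and module directory are passed as parameters. This is a sketch only: run_hotspots and the AMPLXE_CL override are hypothetical names, and the options are exactly those described above.

```shell
# Hypothetical override hook: set AMPLXE_CL=echo to print the arguments
# instead of starting a collection.
AMPLXE_CL=${AMPLXE_CL:-amplxe-cl}

# run_hotspots USER@HOST SECONDS MODULES_DIR
# Runs an advanced hotspots collection on the remote target for SECONDS,
# searching MODULES_DIR on the host for binaries first.
run_hotspots() {
    "$AMPLXE_CL" -target "ssh:$1" -collect advanced-hotspots -d "$2" \
        --search-dir "bin:p=$3"
}

# Usage:
#   run_hotspots root@ip_addr 5 /path/to/local/modules
```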

Displaying results in the VTune Amplifier 2014 for Systems GUI

Run the following command: amplxe-gui r000

 

For more complete information about compiler optimizations, see the Optimization Notice.