Intel® VTune™ Amplifier XE

How can I profile a Java application for hotspot analysis?

Hello,

I am running the Hadoop wordcount example under Intel VTune for hotspot analysis. The command is bin/hadoop jar /usr/local/hadoop/hadoop-examples-1.2.1.jar wordcount inputfile outputfile. Hotspot analysis reports the modules libzip.so, [compiled java code], [Dynamic code], libjvm.so, libc-2.17.so, libz.so.1.2.8 and [unknown]. My understanding is that the [unknown] module is the wordcount application. However, to be sure, I want to see which functions inside the wordcount example are the hotspots.

"Cannot locate file vtsspp.ko"

Today I started an evaluation of VTune Amplifier for Linux (2015 edition, update 1). I installed it on Ubuntu 14.04 as root using sudo. The installation process reported success. Yet when I run an advanced hotspot analysis, I get this warning:

Cannot locate file 'vtsspp.ko'

When the installation is run as root using sudo and succeeds, shouldn't the kernel module have been built properly so that it can be found? Searching for the module with find yields nothing:

$ find /lib/modules -name vtsspp.ko -print
$

Thanks,
Bram Stolk

“memory bound” metric in Vtune

Hello,

I am analyzing the “memory bound” metric in my code with VTune. According to the "Intel® 64 and IA-32 Architectures Optimization Reference Manual", section B.3.2.3:

%L2 Bound =(CYCLE_ACTIVITY.STALLS_L1D_PENDING - CYCLE_ACTIVITY.STALLS_L2_PENDING)/CLOCKS

But in my VTune results, CYCLE_ACTIVITY.STALLS_L1D_PENDING is smaller than CYCLE_ACTIVITY.STALLS_L2_PENDING. Why?
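
To make the arithmetic concrete, here is a minimal sketch in C that evaluates the formula quoted above. The function names are hypothetical and any counter values passed in are placeholders, not measurements from a real collection. It only illustrates that the result becomes negative whenever STALLS_L1D_PENDING is smaller than STALLS_L2_PENDING; it does not explain why a collection would report such counts.

```c
#include <stdint.h>
#include <stdio.h>

/* %L2 Bound as quoted above:
 *   (STALLS_L1D_PENDING - STALLS_L2_PENDING) / CLOCKS
 * Returned as a fraction; multiply by 100 for a percentage.
 * The function name is hypothetical, not part of any VTune API. */
double l2_bound_fraction(uint64_t stalls_l1d_pending,
                         uint64_t stalls_l2_pending,
                         uint64_t clocks)
{
    if (clocks == 0) {
        return 0.0; /* no cycles recorded; avoid dividing by zero */
    }
    /* Convert before subtracting so the unsigned difference cannot wrap. */
    double diff = (double)stalls_l1d_pending - (double)stalls_l2_pending;
    return diff / (double)clocks;
}

/* Hypothetical helper: print the metric and flag the negative case. */
void report_l2_bound(uint64_t l1d, uint64_t l2, uint64_t clocks)
{
    double f = l2_bound_fraction(l1d, l2, clocks);
    printf("L2 Bound = %.1f%%%s\n", 100.0 * f,
           f < 0.0 ? " (negative: L1D_PENDING < L2_PENDING)" : "");
}
```

Calling report_l2_bound with an L1D count below the L2 count prints a negative percentage, which is the situation described in this post.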

ittnotify in xeon phi offload regions

I'm trying to call the functions __itt_pause() and __itt_resume() from within offload regions so that I sample only specific sections of my code in a knc-bandwidth collection. However, I'm having linking problems that I can't figure out. I am using Intel(R) VTune(TM) Amplifier XE 2015 Update 1.

My code looks like:

#pragma offload_attribute(push,target(mic))
#include <ittnotify.h>
#pragma offload_attribute(pop)

...

#pragma offload target(mic:offload_target)
{
    __itt_resume();
}

My make options are:

A 'Failed to create sampling data base' Problem

I am new to VTune and am using VTune Performance Analyzer v9.1 on Windows XP (Intel Pentium G3420).

When I try to collect Clockticks information, VTune always shows the error "Failed to create sampling data base. probably .tb5 files are corrupted or don't exist".

When using the "Quick Performance Analysis Wizard", the "Clockticks" column is always zero (0), while the other columns seem normal.

By the way, the program being analyzed was developed with Visual Studio 2008 and MFC.

Can anyone help me with this?

Thank you.

Centos7 kernel oops when running

While evaluating vtune_amplifier_xe_2015.1.0.367959 on Linux, I experienced a kernel oops in the VTune kernel modules. I was trying to run the microarchitecture -> general exploration -> bandwidth test. The system is a default CentOS 7 x86 install updated with all patches. The code was running on an SNB machine with the VTune CLI_install installed as per the manual.

sfdump5 tool in VTune

There seems to be some mention of a command-line sfdump5 tool that can be used to process/view the samples within a .tb6 file. There is also documentation of this tool in the 3.11 revision of the SEP User Guide.

However, I can't seem to find this tool in the latest VTune Amplifier XE installation - C:\Program Files\Intel\VTune Amplifier XE 2015\bin32.

Has this sfdump5 tool been deprecated?

insufficient virtual memory

I am getting an error while loading a result from a general exploration experiment.  This is an MPI executable, but collection was only done for one process on each node.  See attached PNG for the error message.  Any ideas?

limit
cputime      unlimited
filesize     unlimited
datasize     4096000 kbytes
stacksize    7340032 kbytes
coredumpsize unlimited
memoryuse    1024000 kbytes
vmemoryuse   unlimited
descriptors  65536
memorylocked unlimited
maxproc      600

Call stack mechanism implementation question

I am running a Go program with DWARF information. VTune does a good job of figuring out line numbers and so forth, but it struggles with stack walks. I am guessing that this is because Go's stack conventions (how Go uses EBP, for example) differ from those supported by VTune. Is there a document or some sort of clue sheet about what VTune expects from the stack format? Also, can anyone think of a workaround that doesn't require Go to change its conventions?
