Hi, I have installed VTune on two Linux computers (Debian Lenny, kernel 2.6.26-1-amd64): one with the VTune Analyzer only (PC1), the other with the Collectors only (PC2), to do remote analysis. On PC1, I have integrated the VTune plug-in into my Eclipse (Option 2 "Integrate to" in the Eclipse Options menu). On PC2, I have installed the VTune driver, the vtserver ... I can do remote analysis on PC2 from PC1, so everything is OK for VTune ... Now I would like to install PTU: I would like to integrate the PTU plug-in into my Eclipse on PC1 as well.
Intel® Performance Tuning Utility
Intel(R) PTU 4.0 U3 has been released
http://software.intel.com/en-us/articles/intel-performance-tuning-utility/
We are glad to announce a new public release!
This version of the Intel Performance Tuning Utility introduces the following new features and enhancements:
What does this error mean?
undefined reference to `ippJumpIndexForMergedLibs'
Which dynamic library in IPP should I link against to resolve this error?
Cache-references and Cache-misses counters
I hope this is an appropriate place to post my question.
Using the Linux /proc/mtrr interface, I have configured all physical memory to be uncacheable.
I then ran 'perf stat myapp' and looked at the cache counters for references and misses.
Since I used MTRR to mark all physical memory as uncacheable, I was expecting the
cache misses to be 0 (zero), as uncacheable memory should not reference the cache to begin with.
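For anyone wanting to check the same counters, here is a minimal sketch (not part of the original post) that pulls cache-references and cache-misses out of saved 'perf stat' text output and computes the miss ratio. It assumes the default human-readable layout, with a count followed by the event name. The helper names and the sample numbers are hypothetical, and the exact output format can vary with perf version and locale.

```python
import re

# Assumed layout: a count (optionally with thousands separators) followed by the
# event name, e.g. "     1,234,567      cache-misses".
_COUNTER_LINE = re.compile(
    r"^\s*([\d,]+)\s+(cache-references|cache-misses)\b", re.MULTILINE
)


def parse_cache_counters(perf_output):
    """Return {event name: count} for the two cache events found in the text."""
    counters = {}
    for value, name in _COUNTER_LINE.findall(perf_output):
        counters[name] = int(value.replace(",", ""))
    return counters


def miss_ratio(counters):
    """Return misses / references, or None when no references were counted."""
    refs = counters.get("cache-references", 0)
    if refs == 0:
        return None
    return counters.get("cache-misses", 0) / refs


# Hypothetical sample text, made up for illustration only (not measured data).
SAMPLE = """
 Performance counter stats for 'myapp':

         1,000,000      cache-references
           250,000      cache-misses
"""

if __name__ == "__main__":
    counters = parse_cache_counters(SAMPLE)
    print(counters, miss_ratio(counters))
```

Note that 'perf stat' writes its summary to stderr, so something like 'perf stat -e cache-references,cache-misses myapp 2> stats.txt' is needed to capture it to a file before parsing.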
Not able to install Intel PTU driver
Hi,
My system is Ubuntu 10.04, 64-bit. When I build the PTU vdk driver, it complains:
yuantang@Octave:~/tool_src/Intel_PTU/ptu32_001_lin_intel64/vdk/src$ sudo ./build-driver
[sudo] password for yuantang:
Options in brackets "[ ... ]" indicate default values
that will be used when only the ENTER key is pressed.
C compiler to use: [ /usr/bin/gcc ]
Make command to use: [ /usr/bin/make ]
Kernel source directory: [ /lib/modules/2.6.322.6.32-24-generic-beta/source ]
basic data access profiling using the load latency event
Hi,
I did some experiments with the load latency event of the Intel Nehalem (MEM_INST_RETIRED.LATENCY_ABOVE_THRESHOLD). The machine I'm doing the experiments on has two Xeon E5520 processors. As I'm mostly interested in high latency DRAM accesses, I thought that by setting the threshold to a value larger than the latency of the on-core caches, I would mostly get samples with DRAM operations. To my surprise, the percentage of off-core samples doesn't substantially increase with large thresholds. The table below shows the results:
vtdpview read timing information
Hi,
I would like to do a time-series analysis of the memory behavior of some programs. For this purpose I invoke Intel PTU 3.2 in the following way:
./vtsarun -dl -ec "MEM_LOAD_RETIRED.LS_MISS":sa=100 -- .
Intel Vtune Event CPU_CLK_UNHALTED.CORE
Here is a fundamental query about the VTune event CPU_CLK_UNHALTED.CORE. For a multi-threaded application running on multiple cores, how does VTune count this event? For instance, if thread 1 of the application runs for x cycles (unhalted) on core 1 and thread 2 runs for y cycles (unhalted) on core 2, what would be the value of CPU_CLK_UNHALTED.CORE? Will it be (x+y) cycles?
