Cross-thread Stack Access

Hi All,

I'm running an OpenMP application (minimal example):

     DO KR=1,100
            Array_Out(KR) = Array_Input(KR)
     END DO


I verified that I do not have a data race (I get the same results with only one thread); however, when running the application in Inspector XE 2013, I get a message that I have cross-thread stack accesses.

How can I prevent this behavior, and what is its practical effect, if it does not affect the results?

Thanks in advance for your replies,

Intel VTune is very slow in finalizing results (Linux)


I'm using Intel VTune Amplifier 2015 (Linux version). The sample time of my workload is 180 seconds. I built my software with debug symbols enabled.

In VTune -> Project Properties, I gave the paths for the build, the source files, and the symbols. When I choose Re-resolve, VTune takes more than an hour to finalize and display results. The progress bar goes to 30% and stays stuck there, showing "finalizing results" for more than an hour.

What is the problem here? Why does it take so long to display results when I hit Re-resolve?

Can SEP co-exist with the perf driver?

If a system has the perf driver installed, can we also install the SEP driver? I assume VTune first checks for SEP and uses it if it finds it. If it can't find SEP, I assume it looks for a compatible perf driver, correct?

So there should be no issues, and nothing special required, to use VTune on a system with both SEP and perf drivers?


VTune remote analysis error


I set up VTune on Windows 8 to run the experiment over SSH on a Linux server.

When I try to create an analysis, the following error message appears:

"remote analysis error: detected Vtune Amplifier build #403110 on target system is incompatible with the build #410668 on the host. Package update on target is required."

Amplifier cannot detect remote machine configuration

What should I do to solve this incompatibility?




LLC miss attributed to wrong assembly statement

2016 Beta

When performing analysis to locate LLC misses, the miss count is attributed to the correct source code statement, but in the Assembly view the count is attributed to the first assembly instruction for that source statement and not to the correct assembly instruction. For a simple fetch of data this is not much of a problem; however, in a convoluted expression with multiple array references, it does not help much in figuring out which reference caused the LLC miss.

Jim Dempsey

GPU profiling info is incomplete



I am using Intel VTune 2013 and profiling with Advanced Hotspots, with the option to trace OpenCL kernels enabled.

Attached is a screenshot of my GPU profiling. It doesn't show the information about the EU array, sampler, etc., like that shown in


VTune 2016 Beta fails on Hotspot and Advanced Hotspot


I am having endless problems getting the VTune 2016 Beta to profile my application. The application segfaults when run in VTune, with a completely unhelpful stack trace despite debugging symbols being available. The application runs fine both through my debugger and on the command line.

I am currently trying to work through its warnings to figure out what might be the problem, and one thing that sticks out is the following
