LLC miss attributed to wrong assembly statement

2016 Beta

When performing an analysis to locate LLC misses, the miss count is attributed to the correct source code statement. However, in the Assembly view, the count is attributed to the first assembly statement generated for that source code statement and not to the correct assembly statement. For a simple fetch of data this is not much of a problem, but in a convoluted expression with multiple array references it does not help much in figuring out which reference caused the LLC miss.

Jim Dempsey

GPU profiling info is incomplete



I am using Intel VTune 2013, and I run the Advanced Hotspots profiling with the "Trace OpenCL kernels" option enabled.

Attached is the screenshot of my GPU profiling. It doesn't show the info about the EU array, sampler, etc., like shown in 



VTune 2016 Beta fails on Hotspot and Advanced Hotspot


I am having endless problems getting the VTune 2016 Beta to profile my application. The application segfaults when run in VTune, with a completely unhelpful stack trace despite debugging symbols being available. The application runs fine both through my debugger and on the command line.

I am currently trying to work through its warnings to figure out what might be the problem and one thing that sticks out is the following 

VTune itself is not multi-threaded very well

When performing a Memory Bandwidth analysis, the "Processing profile metrics and debug information" step takes an exceedingly long time to complete (5 to 10 minutes). Watching the Task Manager Performance monitor, it appears that only one, or occasionally two, threads are involved in this step. Could you look at making this phase multi-threaded, or better multi-threaded?

Jim Dempsey

sep driver permissions

I have a customer who set the permissions on the driver to 600. Obviously this is wrong, since a non-root user in the vtune group will not have access.

I advised running insmod with permission 660 to restrict use to those in the vtune group. Fine. Now their question is: do the group permissions NEED write permission? Shouldn't the permissions be 640? In other words, why does a user need write permission?
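For reference, the three modes discussed above differ only in the group bits. The following is a minimal Python sketch that decodes them. It assumes the driver is exposed as a character-device node, and the helper names `describe` and `group_can` are hypothetical. It only illustrates what each octal mode grants; it does not answer whether the driver actually requires write access for group members.

```python
import stat

def describe(mode):
    """Return the ls-style permission string for a character-device node
    (assumed node type) carrying the given permission bits."""
    return stat.filemode(stat.S_IFCHR | mode)

def group_can(mode):
    """Return (group_read, group_write) for the given permission bits."""
    return bool(mode & stat.S_IRGRP), bool(mode & stat.S_IWGRP)

if __name__ == "__main__":
    # 600: owner only; 660: owner and group read/write; 640: group read-only
    for mode in (0o600, 0o660, 0o640):
        print(oct(mode), describe(mode), group_can(mode))
```

With 600 the group has no access at all, with 660 it can both read and write, and with 640 it can only read.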



Conduct validation and debugging to meet industry compliance with Intel® Stress Bitstreams and Encoder.

To prove compliance with a video standard (HEVC, VP9), the Stress Bitstreams provide a complete set of test vectors that can be used for a short sanity check or for full-range validation. The bitstreams put a decoder under worst-case decoding-speed conditions, with stress on memory access and the highest computational complexity. If you want to look for holes in an implementation, the Random Encoder is a good fit for exploratory testing: every new “seed” produces a new bitstream with new cross-syntax combinations.

  • Developers
  • hevc
  • VP9
  • Debug Bitstreams
  • Video Analyzer
  • Video Decoder
  • Decoder Verification
  • Debugging
  • Development Tools
  • Media Processing

ModernCode Project - Intel and Partners Helping You Make Today’s Software Be Ready for Tomorrow’s Computers

Today, we introduced the Intel® Modern Code Developer Community, which focuses on the pursuit of parallel programming. The community includes our very successful series of Modern Code Live Workshops taught around the world and our upcoming Intel® HPC Developer Conferences.

I use Solaris -- any way to use VTune?

As VTune isn't ported to Solaris, I'd like to know if there is any way to trick the software into identifying hot spots or, for that matter, providing any information about problems I need to address. Certainly I can cherry-pick some source code and get it to compile and even run under Linux, but that is going to be a manual process, and I'd really like to get as much code analyzed as possible, including (Open)Solaris kernel and driver code.

Any suggestions?
