A few issues/questions with beta (API, Excluding Waits and Missing Calls in graph).

Sebastien St-Laurent

Hi all. I started playing with Amplifier XE yesterday and have hit a few issues and questions that I might need help with. My experience with Amplifier is limited, as most of my profiling work in the past has been done with VTune.

1) When looking at the profile data for a basic hotspot capture, all synchronization calls appear to count as busy CPU time, so a large portion of the time on some threads appears with WaitForObject at the top. Is there a way to exclude this, or at least to make sure only the real busy CPU time is reported for cases like these? Or at least some way to know which portion of this time was actually active versus the thread being asleep?

2) I have been noticing some inconsistencies in the reporting between the bottom-up and top-down trees. In bottom-up, a top function may be a mid-level function that calls other functions (which are not inlined). The code/assembly view shows that most of the time is spent in some of the called sub-functions, but these are not reported. However, if I drill down the top-down tree, the sub-functions seem to be represented accurately.

3) Regarding the API: the beta invite indicates that an API can be used to mark frames in the profile data. I would assume this is the ittnotify API? But another, non-beta-related article for Amplifier seems to imply that the supported API calls are Pause/Resume/Mark. Is there some decent documentation somewhere on which API calls can be used, how to use them, and how they affect the results in Amplifier XE?

Thanks,
Sebastien St-Laurent
Neversoft Entertainment LLC

Peter Wang (Intel)

Hello,

Thanks for trying VTune Amplifier XE Beta!

1. If you use Hotspot analysis, all calls on sync objects are counted in the corresponding functions. That means WaitForObjects is indicated as Running, but its CPU Time is lower.

You may use the "Locks & Waits" analysis instead; there the "Waits" indicator is shown clearly, for example in the function MainCRTStartup.
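To make the "active versus asleep" distinction in point 1 concrete, here is a minimal sketch in Python that compares wall-clock time with CPU time around a blocking wait. It is only an illustration of the idea and not how VTune Amplifier XE measures anything; the helper name `measure_wait` and the 0.2 second timeout are assumptions made for the example.

```python
import threading
import time


def measure_wait(timeout_s=0.2):
    """Return (wall_seconds, cpu_seconds) spent blocking on an event that is never set."""
    event = threading.Event()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time consumed by this process so far
    event.wait(timeout_s)            # the thread sleeps here, like a wait on a sync object
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


if __name__ == "__main__":
    wall, cpu = measure_wait()
    print(f"wall: {wall:.3f}s  cpu: {cpu:.3f}s")
```

The wall-clock time covers the whole wait, while the CPU time stays close to zero because the thread is asleep. That gap is the portion of a wait call that was not actually busy.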

2. The bottom-up report gives the total time spent in each function, and how much of that time comes from caller1, caller2, caller3...
The top-down report gives the time spent in each function within its call stack. The same function can appear in different paths, and the time shown on each path is not the "total" value.
Please submit a ticket to https://premier.intel.com with your test case (the contents of the result directory are OK) if you feel the results are inaccurate or inconsistent. Thank you.
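The difference between the two views in point 2 can be sketched with a small Python example. This is a simplified illustration under stated assumptions, not the aggregation VTune Amplifier XE actually performs: the sample data, the function names in it, and the helpers `bottom_up` and `top_down` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical samples: a call stack (outermost frame first) and the
# CPU time attributed to the innermost frame of that stack.
SAMPLES = [
    (("main", "update", "physics"), 4.0),
    (("main", "update", "ai"), 2.0),
    (("main", "render", "physics"), 1.0),
    (("main", "update"), 0.5),
]


def bottom_up(samples):
    """Time spent in each function itself, broken down by immediate caller."""
    result = defaultdict(lambda: defaultdict(float))
    for stack, seconds in samples:
        leaf = stack[-1]
        caller = stack[-2] if len(stack) > 1 else "<root>"
        result[leaf][caller] += seconds
    return {func: dict(callers) for func, callers in result.items()}


def top_down(samples):
    """Inclusive time for every call path (each prefix of each stack)."""
    result = defaultdict(float)
    for stack, seconds in samples:
        for depth in range(1, len(stack) + 1):
            result[stack[:depth]] += seconds
    return dict(result)


if __name__ == "__main__":
    for func, callers in bottom_up(SAMPLES).items():
        print(func, sum(callers.values()), callers)
    for path, seconds in top_down(SAMPLES).items():
        print(" -> ".join(path), seconds)
```

With this data, the bottom-up view reports `physics` once with 5.0 in total (4.0 from `update` and 1.0 from `render`), while the top-down view lists it on two separate paths with 4.0 and 1.0, neither of which is the total.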

3. Since the ittnotify API and the VTune API are not yet announced or finalized and are undocumented, I suggest not using them for now.
Please wait for the formal release and check the documentation for them.

Thanks again for providing feedback to us.

Regards, Peter

Sebastien St-Laurent

Hello Peter, and thank you for the responses.

1) That makes sense to me now. It just seemed awkward at first glance that sync waits appeared to be counted as active time.

2) The problem isn't necessarily an inaccuracy, but rather that the results do not drill down all the way to the lowest function that isn't optimized out (or inlined). So in some cases I end up with one function taking a large lump of time but no breakdown of how much time is actually spent in the functions it calls. I will gather a test case and submit a report.

3) OK, I will hold off on this for now. But is there an intent to have some type of marker that would allow breaking the analysis up into sections (so we can know what is occurring at a specific point in time)?