vtdpview read timing information

Hi,

I would like to do a time-series analysis of the memory behavior of some programs. For this purpose I invoke Intel PTU 3.2 in the following way:

./vtsarun -dl -ec "MEM_LOAD_RETIRED.LS_MISS":sa=100 -- .

I convert the resulting data with vtdpview into the vtune.db file, which is in sqlite3 format. Since I would like to have the timing information, I planned to read the contents of the EventSamples table in the database, because it also has a field called walltime. However, this table doesn't contain any data (as if the tool had not converted _all_ the data into the sqlite format). Could you please tell me what I might be doing wrong?
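To check whether only EventSamples is empty or the whole database is, one can list every table together with its row count. The following is a minimal sketch using Python's standard sqlite3 module; the helper name `table_row_counts` is made up for illustration, and nothing here relies on the (undocumented) vtune.db schema beyond the file being a regular SQLite database.

```python
import sqlite3


def table_row_counts(db_path):
    """Return {table_name: row_count} for every table in an SQLite file.

    Hypothetical helper: it only assumes db_path is a valid SQLite database.
    Note that sqlite3.connect() creates an empty file if the path is missing.
    """
    conn = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        counts = {}
        for name in names:
            # Quote the identifier so unusual table names do not break the query.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts
    finally:
        conn.close()
```

Calling `table_row_counts("vtune.db")` would then show at a glance which tables the conversion actually filled.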

Many thanks.

Zoltan Majo

With the above command line you collected only cache misses and no other information.
(Also, the sample-after value ("sa") is too small.)

Regardless, vtsaview and vtdpview report sample and event information, not time.

Run
vtsarun -start --
vtsaview
and you will get 2 basic events. From CPU_CLK_UNHALTED.CORE you can estimate time.

Read the user guide and the other .pdf documents that explain how to use the tool.
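The idea of estimating time from CPU_CLK_UNHALTED.CORE can be sketched as follows. In event-based sampling each sample stands for roughly one sample-after interval of events, so the sample count times the sample-after value approximates the number of unhalted core cycles, and dividing by the clock frequency gives seconds. This is only a rough sketch: the function name and parameters are illustrative, it assumes a constant core frequency, and since the event counts unhalted cycles only, the result approximates busy CPU time, not wall-clock time.

```python
def estimate_seconds(sample_count, sample_after, cpu_hz):
    """Estimate CPU time from clock-cycle samples.

    sample_count: number of CPU_CLK_UNHALTED.CORE samples attributed to the code
    sample_after: sample-after value used for that event (cycles per sample)
    cpu_hz:       assumed constant core frequency in Hz
    """
    if sample_after <= 0 or cpu_hz <= 0:
        raise ValueError("sample_after and cpu_hz must be positive")
    cycles = sample_count * sample_after  # approximate unhalted cycles
    return cycles / cpu_hz
```

For example, 1500 samples at a sample-after value of 2,000,000 on a 3 GHz core correspond to about one second of unhalted execution.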

Hi,

thanks for the quick response.

I think MEM_LOAD_RETIRED:L2_MISS is a precise event, so I also collected program counter and register values, amongst others, with the PTU tool. The graphical interface of the tool summarizes and interprets the samples that have been collected, and doesn't give me access to the collected data itself (which I would like to process and interpret myself). That is why I tried using the command-line utilities to get more data. The vtsarun tool itself stores the precise data of the samples in a file *pebs, but that is in an Intel proprietary format. I assumed that the visualization tools (vtsaview, vtdpview) transform this data into a more open format (sqlite), where the full data is available. But this doesn't seem to be true. Could I get access to the raw samples that the tools have gathered? I assumed that the 'EventSamples' table in the database would contain this data, but no matter how I try to convert the raw data, the table stays empty.

Thanks for your help.

Regards,

Zoltan

Zoltan,
yes, MEM...L2_MISS is a precise event, and the tool doesn't collect the fixed-counter events (CLK and INST_RETIRED), although it could, when explicitly asked to collect one event.

You understood the logic of the tool correctly.
vtsaview and vtdpview aggregate samples and put them into vtune.db (whose format we do not document).

Everything that is shown in the GUI can be retrieved from the command line. The GUI takes its data from vtune.db by running vtsaview and vtdpview (run them with "-help").

With the current version you cannot get raw samples. Sorry.

There might be an update.

Hi Julia,

thanks for the explanation.

Having raw samples would indeed be nice. I'll check the updates of the tool.

What I would also consider very interesting is the distribution of memory accesses in time. I understand this as follows: given a specific moment during the runtime of the application, which address(es) or blocks of memory were accessed by which threads. Are you considering incorporating something like this into a future version of PTU?
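To make the idea concrete, here is a small sketch of how such a time distribution could be computed if raw samples were available. The sample layout `(timestamp, thread_id, address)`, the function name, and the 64-byte block size are all assumptions made for illustration; PTU does not export data in this form.

```python
from collections import defaultdict


def bin_accesses(samples, window, block_size=64):
    """Group memory-access samples by time window and thread.

    samples:    iterable of (timestamp, thread_id, address) tuples (hypothetical layout)
    window:     length of one time window, in the same unit as the timestamps
    block_size: size in bytes of the memory blocks to report (64 is an assumption)

    Returns {window_index: {thread_id: sorted list of block numbers}}.
    """
    bins = defaultdict(lambda: defaultdict(set))
    for timestamp, thread_id, address in samples:
        window_index = int(timestamp // window)
        bins[window_index][thread_id].add(address // block_size)
    return {w: {t: sorted(blocks) for t, blocks in threads.items()}
            for w, threads in bins.items()}
```

Looking up a window index in the result then answers the question "which blocks were touched by which threads around that moment", at the resolution the sampling allows.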

Best regards,

Zoltan

Hi Zoltan,

The distribution of memory accesses in time is interesting, but the question is what we want to do with it.
Also, given that we collect with sampling, the picture will be very sparse, and only glaring pathologies could be seen there (?)

I cannot comment on PTU.

If you are interested, look at the instrumentation tool Pin (also from Intel). It is free for research and _very_ cool (maybe you already use it). With it you can generate a memory trace.

Hi Julia,

well, one could try different sample-after values to get a more complete picture of the memory behavior of the program. And I assume that hardware performance profiling could still be much less intrusive than an instrumentation-based scheme.
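The trade-off can be put into rough numbers: with a sample-after value of N, about one sample is taken per N events, so the expected sample count is the total event count divided by N. A minimal sketch, with an illustrative function name and made-up event totals:

```python
def expected_samples(total_events, sample_after_values):
    """Return {sample_after: approximate number of samples} for each candidate value.

    total_events is the (assumed known) number of times the event occurs
    during the run; one sample is taken roughly every sample_after events.
    """
    result = {}
    for sample_after in sample_after_values:
        if sample_after <= 0:
            raise ValueError("sample-after values must be positive")
        result[sample_after] = total_events // sample_after
    return result
```

A smaller sample-after value gives a denser picture, at the cost of more interrupts and therefore more perturbation of the measured program.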

Thanks for your recommendation regarding Pin. I briefly looked at it, but I was not sure whether it also works for multithreaded programs without much trouble. I'll take another look then.

Best regards,

Zoltan
