Sampling API Error: resume sampling collection failed.

Hi,

I am currently struggling with the error mentioned above.

One of our programmers integrated performance analysis into our framework so that we can run a hotspot analysis (VTune Amplifier XE Update 8), which works quite nicely. But when I change the hotspot analysis to user-defined hardware event counting like this:

$AMPLXECMD -collect-with runsa -knob event-config="$HWEVENTS" -start-paused -follow-child \
-target-duration-type=medium -no-allow-multiple-runs -no-analyze-system \
-data-limit=500 -slow-frames-threshold=40 -fast-frames-threshold=100 \
-r=$OUTPUTDIR/r@@@ -- $APPLICATION

Then I receive the error message. Reducing the command did not help. The idea is to profile specific algorithms, which then appear in VTune as tasks. When an algorithm is started, a before hook runs some customized code, where we put:

taskId = __itt_event_create(typeName.c_str(), typeName.size());
__itt_event_start(state.event);

and if started:

__itt_event_end(state.parent_event);

just before the start and so on. Between algorithms, profiling is paused and then resumed, which means these calls are made with high frequency. Is this a problem? How could I fix it?

After searching the web, I did not find any solution. Does anybody have an idea?
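For reference, the hook pattern described above could be sketched roughly as follows. This is only an illustration, not our actual framework code: `algo_state`, `before_hook` and `after_hook` are hypothetical names, and the stub functions are stand-ins so the sketch compiles without VTune (define `USE_ITTNOTIFY` to build against the real `ittnotify.h`). It creates each event handle once and ends the same handle it started. Whether this has any effect on the reported error is not established here.

```c
#include <string.h>

#ifdef USE_ITTNOTIFY
#include "ittnotify.h"   /* real ITT API; link against libittnotify.a */
#else
/* Hypothetical stand-ins that only count calls, so the sketch builds alone. */
typedef int __itt_event;
static int n_create = 0, n_start = 0, n_end = 0, n_pause = 0, n_resume = 0;

static __itt_event __itt_event_create(const char *name, int namelen)
{
    (void)name; (void)namelen;
    return ++n_create;
}
static int  __itt_event_start(__itt_event e) { (void)e; n_start++; return 0; }
static int  __itt_event_end(__itt_event e)   { (void)e; n_end++;   return 0; }
static void __itt_pause(void)  { n_pause++; }
static void __itt_resume(void) { n_resume++; }
#endif

/* Hypothetical per-algorithm state kept by the framework. */
typedef struct {
    const char *type_name;  /* name shown for the event */
    __itt_event event;      /* handle, valid once created != 0 */
    int         created;
} algo_state;

/* Called right before an algorithm runs. */
static void before_hook(algo_state *s)
{
    if (!s->created) {
        /* Create the handle once and reuse it on later calls. */
        s->event = __itt_event_create(s->type_name, (int)strlen(s->type_name));
        s->created = 1;
    }
    __itt_resume();               /* the collection was started paused */
    __itt_event_start(s->event);
}

/* Called right after an algorithm has finished. */
static void after_hook(algo_state *s)
{
    __itt_event_end(s->event);    /* end the handle that was started */
    __itt_pause();                /* stop collecting between algorithms */
}
```

Each algorithm would then be wrapped as `before_hook(&state); run(); after_hook(&state);`, where `run()` stands for the algorithm itself.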

Thanks,
Stefan

I don't know why you started the collection in paused mode. Did you resume sampling in your code?

Secondly, event start/end is only marked in the timeline report, and it works only for user-mode sampling (Hotspots, Concurrency, and Locks and Waits analyses), NOT for PMU event-based sampling.

Here is a simple example, matrix1.c:
[cpp]
/* #include ... (three header names are missing from the listing) */

#include "ittnotify.h"

#define NUM 512

double a[NUM][NUM], b[NUM][NUM], c[NUM][NUM];

__itt_event event_matrix;

void multiply()
{
    unsigned int i, j, k;

    __itt_event_start(event_matrix);

    /* The listing breaks off here at: for(i=0;i */
[/cpp]
gcc -g matrix1.c -I/opt/intel/vtune_amplifier_xe_2011/include /opt/intel/vtune_amplifier_xe_2011/lib64/libittnotify.a -lpthread -ldl -o matrix1

# amplxe-cl -collect hotspots -- ./matrix1
Elapsed time = 0.740000 seconds
Using result path `/home/peter/problem_report/r001hs'
Executing actions 75 % Generating a report
Summary
-------

Elapsed Time: 0.761
CPU Time: 0.750
Executing actions 100 % done

Open the result in amplxe-gui and note the "User Task" mark in the timeline report.
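Since the listing above is incomplete, here is a minimal self-contained sketch of the same idea: create one event, then mark the multiply loop with start/end. It is not the original matrix1.c. The event name `matrix_multiply`, the helpers `init_matrices` and `run_matrix_demo`, and the stub functions are assumptions for illustration; the stubs let it compile without VTune, and defining `USE_ITTNOTIFY` switches to the real `ittnotify.h`.

```c
#include <string.h>
#include <time.h>

#ifdef USE_ITTNOTIFY
#include "ittnotify.h"   /* real ITT API; link against libittnotify.a */
#else
/* Hypothetical stand-ins that only count calls. */
typedef int __itt_event;
static int stub_starts = 0, stub_ends = 0;

static __itt_event __itt_event_create(const char *name, int namelen)
{
    (void)name; (void)namelen;
    return 1;
}
static int __itt_event_start(__itt_event e) { (void)e; stub_starts++; return 0; }
static int __itt_event_end(__itt_event e)   { (void)e; stub_ends++;   return 0; }
#endif

#define NUM 512

static double a[NUM][NUM], b[NUM][NUM], c[NUM][NUM];
static __itt_event event_matrix;

/* Fill the inputs with constants and clear the result. */
static void init_matrices(void)
{
    unsigned int i, j;
    for (i = 0; i < NUM; i++) {
        for (j = 0; j < NUM; j++) {
            a[i][j] = 1.0;
            b[i][j] = 2.0;
            c[i][j] = 0.0;
        }
    }
}

/* Plain matrix multiply, bracketed by the user event. */
static void multiply(void)
{
    unsigned int i, j, k;

    __itt_event_start(event_matrix);
    for (i = 0; i < NUM; i++)
        for (k = 0; k < NUM; k++)
            for (j = 0; j < NUM; j++)
                c[i][j] += a[i][k] * b[k][j];
    __itt_event_end(event_matrix);
}

/* Creates the event, runs the multiply once, returns CPU seconds used. */
static double run_matrix_demo(void)
{
    const char *name = "matrix_multiply";   /* hypothetical event name */
    clock_t t0;

    event_matrix = __itt_event_create(name, (int)strlen(name));
    init_matrices();
    t0 = clock();
    multiply();
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

A `main` that prints the value returned by `run_matrix_demo()` is enough to drive it under the hotspots collection.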

Thanks,

I think this clarifies why it is not running.

The only point is that using tasks gives me another way to group processing time, right? But if I use frames, I can do the same, and user-defined hardware event sampling is covered as well, isn't it? So what is the difference between events and frames?

Thanks,
Stefan

Using __itt_frame is another approach for when you do the same (or similar) work in a loop, so that all performance data are classified per iteration; please see this article.

__itt_event provides APIs to mark "event start/end" in the timeline report, around the places where you run critical code. Usually you then use "zoom-in/filter on selection" to focus on this time range and review the result.
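To make the difference concrete, here is a rough sketch that puts one frame around each loop iteration and one event around the whole loop. It assumes the `_v3` frame calls (`__itt_domain_create`, `__itt_frame_begin_v3`, `__itt_frame_end_v3`) found in current versions of `ittnotify.h`; the header shipped with a particular VTune release may expose a different frame interface. The domain name, event name, `process_iteration` and `run_loop` are hypothetical, and the stubs are stand-ins so it compiles without VTune.

```c
#include <stddef.h>
#include <string.h>

#ifdef USE_ITTNOTIFY
#include "ittnotify.h"   /* real ITT API; link against libittnotify.a */
#else
/* Hypothetical stand-ins that only count completed frames and events. */
typedef int __itt_event;
typedef struct { int unused; } __itt_domain;
typedef struct { int unused; } __itt_id;
static int n_frames = 0, n_events = 0;
static __itt_domain stub_domain;

static __itt_domain *__itt_domain_create(const char *name)
{
    (void)name;
    return &stub_domain;
}
static void __itt_frame_begin_v3(const __itt_domain *d, __itt_id *id)
{
    (void)d; (void)id;
}
static void __itt_frame_end_v3(const __itt_domain *d, __itt_id *id)
{
    (void)d; (void)id;
    n_frames++;
}
static __itt_event __itt_event_create(const char *name, int namelen)
{
    (void)name; (void)namelen;
    return 1;
}
static int __itt_event_start(__itt_event e) { (void)e; return 0; }
static int __itt_event_end(__itt_event e)   { (void)e; n_events++; return 0; }
#endif

/* Hypothetical per-iteration work. */
static double process_iteration(int iter)
{
    double sum = 0.0;
    int k;
    for (k = 0; k < 1000; k++)
        sum += (double)iter * k;
    return sum;
}

/* One event spans the whole loop; one frame spans each iteration. */
static double run_loop(int iterations)
{
    const char *event_name = "whole_loop";   /* hypothetical names */
    __itt_domain *domain = __itt_domain_create("example.algorithms");
    __itt_event phase = __itt_event_create(event_name, (int)strlen(event_name));
    double total = 0.0;
    int i;

    __itt_event_start(phase);
    for (i = 0; i < iterations; i++) {
        __itt_frame_begin_v3(domain, NULL);   /* frame = one iteration */
        total += process_iteration(i);
        __itt_frame_end_v3(domain, NULL);
    }
    __itt_event_end(phase);
    return total;
}
```

With frames, the data can be grouped per iteration; the event only marks one time range on the timeline that you then zoom into.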

Regards, Peter
