I run four threads simultaneously on four different cores of a Sandy Bridge machine, and I want to collect metrics such as resource stalls and L2 misses on a per-core basis. In each thread I use PAPI counters such as RESOURCE_STALL:ANY and PAPI_L2_TCA. Since PAPI counts on a per-thread basis and each thread is assigned to a separate core, I expect this to give me separate counts for every core. Is my approach right? Or will there be any issues because all of these threads execute simultaneously?
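To make the setup concrete, here is a minimal sketch of the pinning part only, assuming Linux with glibc (`pthread_setaffinity_np` and `sched_getcpu` are GNU extensions). The names `struct worker`, `pin_and_report`, and `run_pinned_threads` are hypothetical and used just for illustration, and the PAPI calls are left out on purpose: the comments only mark where a per-thread event set would be started and stopped. The idea is that each thread binds itself to one CPU before any counting begins, so a per-thread count should correspond to exactly one core.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

#define MAX_THREADS 64

/* Hypothetical per-thread record: requested CPU, observed CPU, pin status. */
struct worker {
    int wanted_cpu;
    int seen_cpu;
    int pinned;
};

static void *pin_and_report(void *arg)
{
    struct worker *w = arg;
    cpu_set_t set;

    /* Restrict this thread to a single CPU before measuring anything. */
    CPU_ZERO(&set);
    CPU_SET(w->wanted_cpu, &set);
    if (pthread_setaffinity_np(pthread_self(), sizeof(set), &set) != 0)
        return NULL;
    w->pinned = 1;

    /* A per-thread PAPI event set would be created and started here. */
    w->seen_cpu = sched_getcpu();
    /* ... measured work, then the event set would be stopped and read. */
    return NULL;
}

/* Starts n threads, pins each to one of the CPUs this process may use,
 * and returns how many were pinned successfully (-1 on bad input). */
int run_pinned_threads(int n)
{
    cpu_set_t allowed;
    int cpus[CPU_SETSIZE];
    int ncpus = 0, started = 0, pinned = 0;
    pthread_t tids[MAX_THREADS];
    struct worker workers[MAX_THREADS];

    if (n < 1 || n > MAX_THREADS)
        return -1;
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    for (int c = 0; c < CPU_SETSIZE; c++)
        if (CPU_ISSET(c, &allowed))
            cpus[ncpus++] = c;
    if (ncpus == 0)
        return -1;

    for (int i = 0; i < n; i++) {
        /* With fewer CPUs than threads, threads wrap around and share a
         * core, so per-thread counts no longer map to distinct cores. */
        workers[i].wanted_cpu = cpus[i % ncpus];
        workers[i].seen_cpu = -1;
        workers[i].pinned = 0;
        if (pthread_create(&tids[i], NULL, pin_and_report, &workers[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        printf("thread %d: wanted CPU %d, running on CPU %d\n",
               i, workers[i].wanted_cpu, workers[i].seen_cpu);
        pinned += workers[i].pinned;
    }
    return pinned;
}
```

Calling `run_pinned_threads(4)` from `main` (built with `-pthread`) would correspond to my four-thread case. If the printed CPU numbers are all different, each thread really has a core to itself.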