Identifying Data Block Access Latencies and Ping-Pong Activity across

Can one measure, on a per-thread basis, the actual cost of memory block accesses in multi-threaded code on a Nehalem-EP platform using VTune? For instance, can one measure the percentage of memory accesses served from each of the different memories, per thread? Can the same information be broken down per level of the cache hierarchy?

Can one measure how often the same cache block ping-pongs among caches when threads running on different sockets compete for write access to it?

Can one measure the cache miss rates per level, per thread, using VTune?

Thanks,
Michael

R/D High-Performance Computing and Engineering