In my project I need to measure the traffic between the last-level cache (LLC) and main memory. Specifically, I have to count the cache lines transferred between the LLC and memory for _one_ core.
As far as I understand, on the Core 2 Duo architecture this can be done with just two events: L2_LINES_IN:SELF (traffic from memory into the L2 cache) and L2_M_LINES_OUT:SELF (writebacks from L2 to memory). However, the Sandy Bridge and Ivy Bridge microarchitectures differ from Core 2: the LLC is now the L3 cache, and I haven't found any analogous events such as L3_LINES_IN:SELF.
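To make the target quantity concrete, here is a minimal sketch of the arithmetic I have in mind once the two counts are available. It assumes 64-byte cache lines; the function name `memory_traffic_bytes` and the counter values are made up for illustration and are not real measurements.

```python
# Assumption: 64-byte cache lines.
CACHE_LINE_BYTES = 64


def memory_traffic_bytes(lines_in: int, lines_out: int,
                         line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Bytes moved between the LLC and memory, given line counts.

    lines_in  -- lines filled from memory (e.g. an L2_LINES_IN:SELF reading)
    lines_out -- modified lines written back (e.g. L2_M_LINES_OUT:SELF)
    """
    if lines_in < 0 or lines_out < 0:
        raise ValueError("counter readings must be non-negative")
    return (lines_in + lines_out) * line_bytes


if __name__ == "__main__":
    # Hypothetical counter readings, not measured data.
    fills = 1_000_000
    writebacks = 250_000
    print(memory_traffic_bytes(fills, writebacks))
```

Dividing the result by the measurement interval would give the per-core bandwidth I am ultimately after.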
Is there any way to measure the memory traffic properly on these architectures?
P.S. I am using Ubuntu 11.10 (220.127.116.11) and libpfm4.