I am trying to understand the performance counters related to L2 misses on the Haswell microarchitecture. Can someone tell me why the L2_RQSTS:MISS counter value is greater than OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE? Sometimes the two values are very close, but for some benchmarks L2_RQSTS:MISS is around 20-30% higher than OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE. Is that because L2 misses to a cache line that is already being serviced do not generate offcore responses? Or is there some other reason? Thanks in advance.