Hi,

I'm trying to measure, in some detail, how my Sandy Bridge (SB) part is hardware prefetching from the L1D. The documentation for SB is somewhat lacking compared to the detail available for Nehalem (NH). NH had the following documentation for HW prefetches from the L1D:

http://software.intel.com/sites/products/documentation/hpc/amplifierxe/en-us/lin/ug_docs/reference/index.htm#snb/events/about_front_end_performance_tuning_events.html

On SB, the documentation uses only one of the unit masks (2), for misses. I'm measuring unit masks 1 and 4 as well, and both appear to work.

Can someone confirm that measuring unit mask 2 for PMC 0x4E counts ALL hardware prefetch misses to the L1D? Any further detail on whether unit masks 0x01 and 0x04 work on SB, and what they measure, would be appreciated as well.

Thanks,
perfwise
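
For anyone who wants to reproduce the measurement, here is a minimal sketch of how the three unit masks for event 0x4E could be encoded as raw event codes. It assumes Linux `perf` and its raw `rUUEE` hex syntax (unit mask in bits 15:8, event select in bits 7:0); the original post does not say which tool was used, and `./workload` is a hypothetical placeholder binary. What each unit mask actually counts on SB is exactly the open question, so the sketch only builds the encodings.

```python
# Sketch only: builds raw event strings for event select 0x4E with the
# unit masks mentioned in the post. Assumes the Linux perf "rUUEE" raw
# syntax; the tool choice is an assumption, not taken from the post.

EVENT_SELECT = 0x4E                 # PMC event number from the post
UNIT_MASKS = (0x01, 0x02, 0x04)     # unit masks being measured


def raw_event(event_select: int, umask: int) -> str:
    """Return a perf-style raw event string: umask byte, then event byte."""
    if not 0 <= event_select <= 0xFF:
        raise ValueError("event select must fit in 8 bits")
    if not 0 <= umask <= 0xFF:
        raise ValueError("unit mask must fit in 8 bits")
    return f"r{umask:02x}{event_select:02x}"


def perf_stat_command(workload: str) -> str:
    """Build a perf stat command line counting all three unit masks."""
    events = ",".join(raw_event(EVENT_SELECT, m) for m in UNIT_MASKS)
    return f"perf stat -e {events} -- {workload}"


if __name__ == "__main__":
    # "./workload" is a hypothetical binary name used for illustration.
    print(perf_stat_command("./workload"))
```

Counting all three masks in the same run makes it easier to compare them against each other and against an independent L1D miss count.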