Hi,

I have found the perf events documented in previous emails to be very helpful, so thank you for providing information about their behavior. I was looking at the stats on LD behavior and the memory ordering buffer (MOB). I have some questions about the behavior of the hardware and about what the stats described at the link below refer to:

http://redfort-software.intel.com/sites/products/documentation/hpc/amplifierxe/en-us/2011Update/lin/ug_docs/reference/index.htm#pmn/events/about_front_end_performance_tuning_events.html

1) The MOB is for STLF interactions, right?
2) How is the MOB used? Is it just for STLF?
3) I wasn't aware there was a reservation station in SB/IV. Is there one? I thought all results were sent from Sched -> EX -> LD buffer.
4) Does unit mask 0x7 signify all loads executed from the scheduler?
4a) Does unit mask 4 signify all loads performed from the MOB, i.e. loads that get their results from a previous STORE?
4b) Does unit mask 2 signify that the result of the STORE is not in the MOB yet, but that waiting a cycle allows the uop to get it from the MOB?
4b*) Why does 1 cycle make such a difference? What is the average STLF latency of writing the store to the MOB and then loading it back?
4c) Does unit mask 1 signify the general case of loads SC -> EX which have no STORE dependency and simply get their data from the L1D?

Thanks for any clarifications. This looks like interesting stuff that can be a performance eye-opener.

Perfwise