The percentage of pipeline slots occupied by uops that eventually retire.
This category reflects slots utilized by good uops, i.e. allocated uops that eventually get retired. To calculate it, we use a legacy counter for retired uops (which is comparable to allocated uops):
Retiring = UOPS_RETIRED.RETIRE_SLOTS / (4*CPU_CLK_UNHALTED.THREAD)
Ideally, all pipeline slots would be attributed to the Retiring category. A Retiring value of 100% would indicate that the maximum of 4 uops retired per cycle has been achieved. Maximizing Retiring increases the Instructions-Per-Cycle metric (an average of one uop per instruction is observed in typical workloads).
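As a minimal sketch of the formula above, the following Python snippet computes Retiring from two raw counter values. The function name, the constant name, and the sample counts are illustrative assumptions, not part of any profiling tool; only the formula itself comes from the text.

```python
# Slots per cycle, taken from the factor 4 in the Retiring formula.
PIPELINE_WIDTH = 4


def retiring_fraction(retire_slots: int, unhalted_cycles: int,
                      width: int = PIPELINE_WIDTH) -> float:
    """Return Retiring as a fraction of all pipeline slots.

    retire_slots    -- value of UOPS_RETIRED.RETIRE_SLOTS
    unhalted_cycles -- value of CPU_CLK_UNHALTED.THREAD
    """
    if unhalted_cycles <= 0:
        raise ValueError("unhalted_cycles must be positive")
    return retire_slots / (width * unhalted_cycles)


# Hypothetical counter readings, chosen only to make the arithmetic easy.
retire_slots = 6_000_000
cycles = 2_000_000

frac = retiring_fraction(retire_slots, cycles)
print(f"Retiring: {frac:.1%}")                                  # Retiring: 75.0%
print(f"uops retired per cycle: {frac * PIPELINE_WIDTH:.2f}")   # 3.00
```

With these made-up counts, 75% of the slots were used by uops that retired, which corresponds to 3 of the possible 4 uops per cycle.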
Note that a high Retiring value does not necessarily mean there is no room for better performance. Microcode assists typically hurt performance and should be avoided. They show up under this category because assist uops eventually retire (this case can be detected by legacy counters like IDQ.MS_CYCLES). A high Retiring value for non-vectorized code may be a good hint for the programmer to consider vectorizing the code. Doing so essentially lets more computation be done without significantly increasing the number of instructions, thus improving performance.
Note that it is still possible to have performance issues or wasted work even when retiring at a good rate (e.g. floating-point assists, where many uops are retiring).
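One rough way to check for the microcode case mentioned above is to relate IDQ.MS_CYCLES to the unhalted cycle count. The sketch below is an illustrative ratio only, not an official metric definition; the function name and the sample counts are assumptions, and what counts as a "high" share is left to the reader's judgment for the workload at hand.

```python
def ms_cycles_fraction(ms_cycles: int, unhalted_cycles: int) -> float:
    """Return the share of unhalted cycles counted by IDQ.MS_CYCLES.

    ms_cycles       -- value of IDQ.MS_CYCLES
    unhalted_cycles -- value of CPU_CLK_UNHALTED.THREAD
    """
    if unhalted_cycles <= 0:
        raise ValueError("unhalted_cycles must be positive")
    return ms_cycles / unhalted_cycles


# Hypothetical counter readings for illustration.
share = ms_cycles_fraction(ms_cycles=200_000, unhalted_cycles=2_000_000)
print(f"Cycles with microcode sequencer activity: {share:.1%}")  # 10.0%
```

A noticeable share alongside a high Retiring value suggests that part of the retired work may be microcoded flows or assists rather than useful program uops.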
EQUATION: UOPS_RETIRED.RETIRE_SLOTS / (4*CPU_CLK_UNHALTED.THREAD)