Uops from the Microcode Sequencer

TITLE: Uops from the Microcode Sequencer

ISSUE_NAME: Frontend^UopSource_MS

DESCRIPTION:

This metric reports the percentage of uops delivered to the micro-op queue that came from the Microcode Sequencer (MS).

RELEVANCE:

A high percentage of uops from the Microcode Sequencer often indicates a performance issue, but not always: some instructions generate many uops from the Microcode Sequencer and are still very efficient when used properly (e.g. rep strings). If you are not front-end bound and UopSource_MS is greater than ~25%, follow these steps to determine why micro-ops are coming from the microcode. The causes are listed from most common to least common:

1)      Long latency instructions – Any instruction that needs more than four micro-ops invokes the microcode sequencer. Some instructions, such as transcendentals, can generate many micro-ops from the microcode.

2)      String operations – String operations can produce a large amount of microcode. In some cases string operations also trigger assists; for example, REP MOVSB with a trip count greater than 3 costs 70+ cycles.

3)      Assists – Examples include floating point assists, transitions between Intel SSE and Intel AVX, and AVX store assists (i.e. an AVX store spanning two pages).

EXAMPLE:

SOLUTION:

RELATED_SOURCES:

NOTES:

EQUATION:  IDQ.MS_UOPS / (IDQ.MITE_UOPS + IDQ.DSB_UOPS + IDQ.MS_UOPS)
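
As a rough illustration of the equation above, the following Python sketch computes UopSource_MS from three raw counter values and applies the ~25% rule of thumb from the RELEVANCE section. The function names, the sample counts, and the front_end_bound flag are hypothetical; how the counters are collected (for example, from a profiler report) is outside the scope of this sketch.

```python
def uop_source_ms(ms_uops, mite_uops, dsb_uops):
    """Percentage of uops delivered to the micro-op queue that came from the MS.

    Inputs are raw counts of IDQ.MS_UOPS, IDQ.MITE_UOPS and IDQ.DSB_UOPS.
    """
    total = mite_uops + dsb_uops + ms_uops
    if total == 0:
        # No uops were counted, so report 0% instead of dividing by zero.
        return 0.0
    return 100.0 * ms_uops / total


def worth_investigating(ms_percent, front_end_bound, threshold=25.0):
    """Apply the rule of thumb: not front-end bound and MS share above ~25%.

    The default threshold mirrors the approximate figure in the text;
    it is a guideline, not a hard limit.
    """
    return (not front_end_bound) and ms_percent > threshold


if __name__ == "__main__":
    # Hypothetical counter values, for illustration only.
    pct = uop_source_ms(ms_uops=300_000, mite_uops=200_000, dsb_uops=500_000)
    print(f"UopSource_MS = {pct:.1f}%")
    print("Investigate MS uops:", worth_investigating(pct, front_end_bound=False))
```

A result above the threshold only suggests checking the three causes listed under RELEVANCE; code dominated by well-used rep string instructions can exceed it without being slow.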
