Switches Between Decoded Instruction Cache and the Legacy Front End Pipeline

TITLE: Switches Between Decoded Instruction Cache and the Legacy Front End Pipeline

ISSUE_NAME: DECODED_ICACHE_SWITCH_PENALTY

DESCRIPTION:

The Decoded ICache has many advantages over the legacy decode pipeline. It eliminates many of that pipeline's bottlenecks, such as instructions that decode into more than one micro-op and length-changing prefix (LCP) stalls.

A switch from the Decoded ICache to the legacy decode pipeline occurs only when a lookup in the Decoded ICache fails, and it usually costs anywhere from zero to three cycles in the front end of the pipeline.

RELEVANCE:
This performance issue only impacts architectures code-named Sandy Bridge and Ivy Bridge.

EXAMPLE:

The Decoded ICache events all have large skids, and the exact instruction where they are tagged is usually not the source of the problem, so look for this issue only at the process, module, and function granularities.

Determining the cost of switches from the Decoded ICache to the legacy decode pipeline:

% DECODED_ICACHE_SWITCH_PENALTY =
    100 * DSB2MITE_SWITCHES.PENALTY_CYCLES / CPU_CLK_UNHALTED.THREAD;

Determining the average cost per switch from the Decoded ICache to the legacy front end:

AVG.DECODED_ICACHE_SWITCH_PENALTY =
    DSB2MITE_SWITCHES.PENALTY_CYCLES / DSB2MITE_SWITCHES.COUNT;
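
As a rough illustration, the two ratios above can be computed from raw event counts with a few lines of Python. This is a minimal sketch: the function names and the sample counts are hypothetical, and the counts would in practice come from whatever profiler collected the events for a given process, module, or function.

```python
# Sketch: evaluate the two Decoded ICache switch metrics from raw event counts.
# Function names and the sample counts below are hypothetical, for illustration only.

def switch_penalty_percent(penalty_cycles, unhalted_thread_cycles):
    """% DECODED_ICACHE_SWITCH_PENALTY: share of cycles spent on switch penalties."""
    if unhalted_thread_cycles <= 0:
        raise ValueError("CPU_CLK_UNHALTED.THREAD must be positive")
    return 100.0 * penalty_cycles / unhalted_thread_cycles


def average_switch_penalty(penalty_cycles, switch_count):
    """AVG.DECODED_ICACHE_SWITCH_PENALTY: penalty cycles per switch."""
    if switch_count == 0:
        return 0.0  # no switches were recorded, so there is no cost per switch
    return penalty_cycles / switch_count


# Made-up counts for a single function.
counts = {
    "DSB2MITE_SWITCHES.PENALTY_CYCLES": 3_000_000,
    "DSB2MITE_SWITCHES.COUNT": 2_000_000,
    "CPU_CLK_UNHALTED.THREAD": 100_000_000,
}

pct = switch_penalty_percent(
    counts["DSB2MITE_SWITCHES.PENALTY_CYCLES"],
    counts["CPU_CLK_UNHALTED.THREAD"],
)
avg = average_switch_penalty(
    counts["DSB2MITE_SWITCHES.PENALTY_CYCLES"],
    counts["DSB2MITE_SWITCHES.COUNT"],
)
print(f"switch penalty: {pct:.2f}% of cycles, {avg:.2f} cycles per switch")
```

With these made-up counts the penalty is 3% of unhalted cycles and 1.5 cycles per switch, which falls inside the zero-to-three-cycle range described above.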

SOLUTION:

There are no partial hits in the Decoded ICache. If any micro-op that is part of a lookup on a 32-byte chunk is missing, a Decoded ICache miss occurs on all micro-ops for that transaction.

There are three primary reasons for missing micro-ops in the Decoded ICache:

1) Portions of a 32-byte chunk of code did not fit within three ways of the Decoded ICache.

2) A frequently run portion of the code is too large for the Decoded ICache. This case is more common in server applications, since client applications tend to have a smaller set of "hot" code.

3) The Decoded ICache is being flushed, for example when an ITLB entry is evicted.
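
The first reason can be checked roughly by counting micro-ops per aligned 32-byte chunk. The sketch below assumes each way holds at most six micro-ops, a figure that is not stated in this article and should be confirmed against the optimization manual for the target processor. The model is deliberately simplified and optimistic: real hardware has additional placement rules, so a chunk that passes here may still miss. The helper names and the per-chunk counts are hypothetical.

```python
import math

# Assumption (not stated in this article): one way holds at most six micro-ops.
UOPS_PER_WAY = 6
# From the text: a 32-byte chunk may occupy at most three ways.
MAX_WAYS_PER_CHUNK = 3
CHUNK_BYTES = 32


def chunk_base(address):
    """Return the start address of the aligned 32-byte chunk containing address."""
    return address & ~(CHUNK_BYTES - 1)


def ways_needed(uop_count):
    """Optimistic number of ways needed, ignoring any other placement rules."""
    return math.ceil(uop_count / UOPS_PER_WAY)


def chunk_fits(uop_count):
    """True if the micro-ops of one 32-byte chunk could fit within three ways."""
    return ways_needed(uop_count) <= MAX_WAYS_PER_CHUNK


# Hypothetical micro-op counts per chunk, e.g. derived from a disassembly listing.
uops_per_chunk = {0x400000: 12, 0x400020: 21}

for base, uops in sorted(uops_per_chunk.items()):
    verdict = "fits" if chunk_fits(uops) else "does not fit"
    print(f"chunk {base:#x}: {uops} micro-ops, {ways_needed(uops)} ways, {verdict}")
```

Under these assumptions the second chunk needs four ways and would miss, which suggests looking at whether that code region can be made less dense in micro-ops or spread across chunk boundaries.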

RELATED_SOURCES:
NOTES:
