inc/dec instruction vs macrofusion

Hi, the optimization manual advises against using the inc/dec instructions because they write only part of the EFLAGS register, which creates a false dependency on earlier flag-writing instructions. On Sandy Bridge/Ivy Bridge, inc/dec are listed as macro-fusible with jcc. With macrofusion, is the above advice still recommended, or does it now apply only to jumps on the CF flag?
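To make the partial-flag issue concrete, here is a minimal sketch in C that models only CF and ZF. The type `Flags` and the functions `model_sub1` and `model_dec` are hypothetical names made up for this illustration; they describe the architectural flag behaviour (dec leaves CF unchanged, sub rewrites it), not how any particular core renames flags internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified flag state: only the two flags relevant to this thread. */
typedef struct {
    bool cf; /* carry flag */
    bool zf; /* zero flag  */
} Flags;

/* Models "sub r32, 1": every arithmetic flag is rewritten, so the new
 * flag state does not depend on the previous one. */
static uint32_t model_sub1(uint32_t x, Flags *f) {
    uint32_t r = x - 1u;
    f->cf = (x < 1u);  /* borrow out of the subtraction */
    f->zf = (r == 0u);
    return r;
}

/* Models "dec r32": ZF is rewritten but CF keeps its old value, so the
 * new flag state is a merge of old and new bits. That merge is the
 * source of the dependency on the earlier flag writer. */
static uint32_t model_dec(uint32_t x, Flags *f) {
    uint32_t r = x - 1u;
    f->zf = (r == 0u);
    /* f->cf is intentionally left untouched */
    return r;
}

/* Small usage example for the two models. */
static void flags_demo(void) {
    Flags f = { true, false };          /* CF set by some earlier instruction */
    assert(model_dec(1u, &f) == 0u);
    assert(f.zf && f.cf);               /* dec: ZF updated, old CF survives */
    assert(model_sub1(1u, &f) == 0u);
    assert(f.zf && !f.cf);              /* sub: CF recomputed from scratch  */
}
```

A jz after dec reads only ZF, which dec fully writes; a jc or jbe would also need the CF value left by an older instruction.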

If a pair like dec ecx; jz label is macrofused into a single u-op without false dependencies, then since its encoding is more compact than sub ecx, 1; jz label, there could be a reason to switch back to the old method.
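For reference, the size difference can be written down as raw bytes. The sketch below is only an illustration: the array and function names are made up, the jz displacement is a placeholder, and the short-jump (rel8) form of jz is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* dec ecx, one-byte form (0x48+r). Valid in 32-bit mode only; in 64-bit
 * mode the bytes 0x40..0x4F are REX prefixes instead. */
static const uint8_t DEC_ECX_SHORT[] = { 0x49 };

/* dec ecx, FF /1 form. Valid in both 32-bit and 64-bit mode. */
static const uint8_t DEC_ECX_MODRM[] = { 0xFF, 0xC9 };

/* sub ecx, 1 using the sign-extended imm8 form (83 /5 ib). */
static const uint8_t SUB_ECX_1[] = { 0x83, 0xE9, 0x01 };

/* jz rel8; the displacement byte here is a placeholder. */
static const uint8_t JZ_REL8[] = { 0x74, 0x00 };

/* Total bytes of an instruction followed by the short jz. */
static size_t pair_size(size_t insn_len) {
    return insn_len + sizeof JZ_REL8;
}

/* Small usage example for the sizes above. */
static void encoding_demo(void) {
    assert(pair_size(sizeof DEC_ECX_SHORT) == 3); /* 32-bit: dec ecx; jz */
    assert(pair_size(sizeof DEC_ECX_MODRM) == 4); /* 64-bit: dec ecx; jz */
    assert(pair_size(sizeof SUB_ECX_1) == 5);     /* sub ecx, 1; jz      */
}
```

So the dec form saves two bytes per pair in 32-bit code and one byte in 64-bit code.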

Thanks

The advice to avoid inc/dec stands if you intend to run on an earlier model Intel CPU.  You may have to run your own tests if you want to find out whether inc/dec are helping your application on Sandy Bridge.  The architectural change, as you indicated, should eliminate the earlier significant performance penalty.
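One possible shape for such a test is sketched below. It is a rough starting point, not a calibrated benchmark: the function names are hypothetical, GCC/Clang extended inline assembly on x86-64 is assumed (with a plain C fallback elsewhere, where the compiler picks the instructions), and `clock()` gives only coarse timing. A real measurement would need many repetitions, a surrounding loop body that also touches flags, and ideally hardware performance counters.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Counts n down to zero with dec/jnz; returns the iteration count. */
static uint64_t countdown_dec(uint64_t n) {
    uint64_t iters = 0;
    assert(n > 0);
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ volatile(
        "1:\n\t"
        "add $1, %0\n\t"   /* iters++ (full flag write)        */
        "dec %1\n\t"       /* n--     (writes ZF, keeps CF)    */
        "jnz 1b"
        : "+r"(iters), "+r"(n)
        :
        : "cc");
#else
    while (n != 0) { n--; iters++; }
#endif
    return iters;
}

/* Same loop, but with sub $1 in place of dec. */
static uint64_t countdown_sub(uint64_t n) {
    uint64_t iters = 0;
    assert(n > 0);
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ volatile(
        "1:\n\t"
        "add $1, %0\n\t"
        "sub $1, %1\n\t"   /* n-- (writes all arithmetic flags) */
        "jnz 1b"
        : "+r"(iters), "+r"(n)
        :
        : "cc");
#else
    while (n != 0) { n--; iters++; }
#endif
    return iters;
}

/* Returns the CPU time in seconds that one call to fn(n) took. */
static double time_seconds(uint64_t (*fn)(uint64_t), uint64_t n) {
    clock_t t0 = clock();
    volatile uint64_t sink = fn(n);  /* keep the call from being dropped */
    (void)sink;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Calling `time_seconds(countdown_dec, n)` and `time_seconds(countdown_sub, n)` with the same large `n`, several times each, gives a first impression of whether the two forms differ on a given CPU.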

Thanks for clearing that up, Tim :D

For example, on Sandy Bridge, the MSVC++ default /favor:blend is satisfactory in one of my benchmark suites, where earlier CPU models needed /favor:EM64T (alternate spelling /favor:INTEL64 for VS2012).  So it looks like there is no point in setting the /favor option when using /arch:AVX in the VS2012 compiler.  The current Intel compiler switches to inc/dec when compiling for AVX, in accordance with your suggestion.
