Using LEA instructions

In his useful paper "Assembly Language Tips & Tricks for the Intel Pentium 4 Processor", Khang Nguyen suggested the following:

  • Using lea Instructions:

mov edx,ecx
sal edx,3
Faster:

lea edx, [ecx + ecx]
add edx, edx
add edx, edx

While I suppose

XOR EDX,EDX

SHRD EDX,ECX,29

has the same issues as the first fragment, what about

LEA EDX,[ECX*8] (seven bytes)

or, if you have a spare register that already holds zero, say EBX, a denser encoding would be

LEA EDX,[EBX+ECX*8] (three bytes)

Would these not be faster, with the further advantage of changing no flags? /Roy Sykes
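
For readers following along, here is a minimal C sketch that models the four fragments above as 32-bit register arithmetic and checks that they all produce ECX*8 (modulo 2^32). The function names are hypothetical, chosen only for this illustration. The model covers the computed value only; it says nothing about timing, encoding size, or flag effects, which are the actual points under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* mov edx,ecx / sal edx,3 */
uint32_t mul8_shift(uint32_t ecx) {
    uint32_t edx = ecx;
    edx <<= 3;
    return edx;
}

/* lea edx,[ecx+ecx] / add edx,edx / add edx,edx */
uint32_t mul8_lea_add(uint32_t ecx) {
    uint32_t edx = ecx + ecx;   /* 2 * ecx */
    edx += edx;                 /* 4 * ecx */
    edx += edx;                 /* 8 * ecx */
    return edx;
}

/* xor edx,edx / shrd edx,ecx,29 */
uint32_t mul8_shrd(uint32_t ecx) {
    uint32_t edx = 0;
    /* edx moves right by 29; the low 29 bits of ecx enter from the top */
    edx = (edx >> 29) | (ecx << 3);
    return edx;
}

/* lea edx,[ebx+ecx*8] with ebx == 0 (same value as lea edx,[ecx*8]) */
uint32_t mul8_lea_scaled(uint32_t ecx) {
    uint32_t ebx = 0;
    return ebx + ecx * 8u;
}

/* Assert that all four models agree for one input value. */
void check_mul8(uint32_t ecx) {
    uint32_t expected = mul8_shift(ecx);
    assert(mul8_lea_add(ecx) == expected);
    assert(mul8_shrd(ecx) == expected);
    assert(mul8_lea_scaled(ecx) == expected);
}
```

Calling check_mul8 over a range of inputs, including values whose top three bits are set, should confirm that the fragments differ only in cost and side effects, not in result.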

We will forward your question to the author and let you know what response we receive.

For those following along, here is a link to the article.

Regards,

Lexi S.

Intel Software Network Support

http://www.intel.com/software

Message Edited by intel.software.network.support on 12-09-2005 10:48 AM

The author responded as follows:

The combination of lea and add is faster because it avoids the shift instruction.

Let's look at the statement:

LEA EDX,[ECX*8] (seven bytes)

This operation involves multiplication (shifting), which is known to be slow.

The following statement is even worse:

LEA EDX,[EBX+ECX*8] (three bytes)

This operation also involves multiplication, plus an extra instruction to set the register EBX to zero.

Hope this helps!

==

Regards,

Lexi S.

Message Edited by intel.software.network.support on 12-02-2005 08:50 PM

This is for Lexi

The "LEA EDX,[ECX*8]" instruction, although it appears textually to involve multiplication (*), does not. It also does not involve a shift operation. If either were true, your processor wizards would need to go back to school. Simple shift-like operations for *1, *2, *4, and *8 are so common that they are hardwired into the architecture. All permutations are always present; a multiplexer selects the desired result.

Jim Dempsey
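
As a rough illustration of the point above, here is a small C sketch of an effective-address calculation in which all four scaled forms of the index exist at once and a 2-bit field picks one of them, in the manner of a multiplexer. The function name lea_model and its parameter names are hypothetical. This is a software model of the description in this post, not a statement about how any particular processor implements address generation.

```c
#include <stdint.h>

/*
 * Model of base + index*scale + displacement.
 * scale_bits is the 2-bit scale selector: 0 -> *1, 1 -> *2, 2 -> *4, 3 -> *8.
 */
uint32_t lea_model(uint32_t base, uint32_t index,
                   unsigned scale_bits, uint32_t disp) {
    /* All four candidates are formed; each is just the index with its bits rewired. */
    const uint32_t scaled[4] = { index, index << 1, index << 2, index << 3 };

    /* The selector picks one candidate; no general multiplier is involved. */
    return base + scaled[scale_bits & 3u] + disp;
}
```

In this model, LEA EDX,[EBX+ECX*8] with EBX holding zero corresponds to lea_model(0, ecx, 3, 0), and LEA EDX,[ECX*8] corresponds to the same call with the zero supplied as the displacement instead of the base.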
