About the x64 stack Alignment

About the x64 stack Alignment

Hi all!

I'm learning 64bit assembler,understand the x64 stack should be16-byte alignment, but I dont'tunderstandwhy?

Thanksfor youranswers!

6 Beiträge / 0 neu
Letzter Beitrag
Nähere Informationen zur Compiler-Optimierung finden Sie in unserem Optimierungshinweis.
Bild des Benutzers Michael Klemm (Intel)


Please correct me I am mistakten, but I guess that alignment stems from the time when cache lines in the caches where 16 bytes long. Having the stack aligned to 16 bytes as well provides a better alignment of the stack to the caches.

Today, the cache line size is a multiple of 16 bytes.

Oh, and with SSE, which uses 128bit registers, the 16-byte aligment is the most natural one, too.




this is due to a calling convention in x64 which requires the stack to be 16 bytes aligned before any call instruction. This is not (to my knwoledge) a hardware requirement but a software one. This provides a way to be sure that when entering a function (that is, after a call instruction), the value of the stack pointer is always 8 modulo 16. Thus permitting simple data alignement and storage/reads from aligned location in stack.



Thank you very much!

In particular, 16-byte stack alignment avoids the need to insert conditional code to align SSE objects, both when allocating stack, and when entering SSE loops. This avoids the run-time failures seen on 32-bit systems when a gcc compiled function is called by one compiled by another compiler. Also, it avoids the alignment-dependent numerical differences incurred by differing loop alignment adjustments.

32-bit linux in the past provided 8-byte alignment (pointers set to 8 modulo 16) so as to handle 64-bit objects efficiently. Certain 32-bit compilers for Windows "optimized" for varying alignment by avoiding the use of 64- and 128-bit moves.

x64 compilers can assume the presence of SSE registers, which on Windows have a calling convention associated with them (XMM6-15 are nonvolatile, aka callee-save). By maintaining a known stack alignment at the entry of functions, the compiler can safely use the more efficientmovdqa to save the nonvolatile registers rather than using the unaligned movdqu version.

Melden Sie sich an, um einen Kommentar zu hinterlassen.