Compilation of atomic reads into 3 identical loads

This is more out of curiosity than anything else. When looking at the generated assembly for a tight loop that polls an atomic state member until it has a certain value, I see that the compiler (gcc 5.2.0, x64) translates the read of the atomic variable into 3 identical loads (as shown by the assembly view in VTune). So:

while (m_state == TS_BUSY_WAITING) { ASM_PAUSE; }

turns into

Block 7:
movl  0x3c(%rdi), %eax
movl  0x3c(%rdi), %eax
movl  0x3c(%rdi), %eax
cmp $0x3, %eax
jz 0x1b5af88 <Block 7>

Notice the 3 identical movl operations.
What is the cause behind this translation? I see the same pattern in other places where tbb::atomic is used.
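
For anyone who wants to compare, here is a minimal sketch of the same polling pattern written with std::atomic instead of tbb::atomic. It is not the original code: the Worker type, the wait_while_busy function, the state values other than TS_BUSY_WAITING, and the definition of ASM_PAUSE are all assumptions made for illustration.

```cpp
#include <atomic>

// Assumed definition of ASM_PAUSE; the original post does not show the macro.
#if defined(__x86_64__)
#include <immintrin.h>
#define ASM_PAUSE _mm_pause()
#else
#define ASM_PAUSE ((void)0)
#endif

// Hypothetical state values. Only TS_BUSY_WAITING is named in the post;
// the value 3 is chosen to match the `cmp $0x3` in the listing.
enum ThreadState { TS_IDLE = 0, TS_RUNNING = 1, TS_DONE = 2, TS_BUSY_WAITING = 3 };

// Hypothetical owner of the polled state member.
struct Worker {
    std::atomic<int> m_state{TS_BUSY_WAITING};

    // Polls m_state until it no longer equals TS_BUSY_WAITING,
    // then returns the value that ended the loop.
    int wait_while_busy() {
        int seen;
        while ((seen = m_state.load(std::memory_order_acquire)) == TS_BUSY_WAITING) {
            ASM_PAUSE;
        }
        return seen;
    }
};
```

Compiling this with g++ -O2 -S and looking at the loop body gives a baseline to compare against the tbb::atomic version above.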

Hi Stephan,

Thank you for the report. After investigation it looks like a GCC issue. I have created Bug 84151 in GCC Bugzilla.


And here I was thinking this would be some kind of cache-line voodoo to improve performance :-) I'm glad I asked.

Thanks for taking care of this!

