How do I avoid XMM bookkeeping code around __asm blocks in 64-bit code?

Jean J.

Greetings,

When I write x86_64 assembly blocks, I see that the compiler generates bookkeeping code to preserve the values of XMM8 to XMM15. So I tend to avoid using them, but sometimes we really need all 16 XMM registers. The problem is that the bookkeeping code is a fixed cost that could be avoided, and sometimes it cancels out our optimizations.

Is there any way to avoid preserving those registers? Any calling convention to do this?

Many thanks,

Guillaume Piolat

Sergey Kostrov

- Is it possible to provide a code example ( C/C++ and assembler ) that demonstrates the issue?

- Could you use Intel® Software Development Emulator ( Intel® SDE ) to verify that you don't have SSE-to-AVX and AVX-to-SSE transitions ( this is only my suggestion and I could be wrong ).

Jean J.

Hi Sergey, here is some sample code that demonstrates the issue.

void test()
{
    __asm
    {
        pxor xmm8, xmm8  // could be whatever using xmm8
    }
}

int main(int argc, char* argv[])
{
    #pragma noinline
    test();
}

The generated code for the test function is:

sub rsp, 56
movaps XMMWORD PTR [32+rsp], xmm8
pxor xmm8, xmm8
movaps xmm8, XMMWORD PTR [32+rsp]
add rsp, 56
ret

The compiler is able to see that we don't modify rbx, rbp, rsi, rdi, r12, r13, r14, r15, xmm6, xmm7, xmm9, xmm10, xmm11, xmm12, xmm13, xmm14 and xmm15, so it doesn't generate bookkeeping code for them, but that's still a lot of registers we are not allowed to use without penalty!

As for SSE-to-AVX transitions, I don't think we use any AVX code.

Sergey Kostrov

>>...The compiler is able to see that we don't modify rbx, rbp, rsi, rdi...

Your problem could be turned into a good feature request, something like:

#pragma donotsavexmmregs

Did you consider a pure-assembler implementation of some functions you need?

Jean J.

> Did you consider a pure-assembler implementation of some functions you need?

Not yet. I think the "problem" is that the compiler keeps register values to conform with the calling convention, yet I'm not sure which other calling conventions would allow the use of more registers.
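
For reference on the calling-convention point, here is a minimal sketch of which XMM registers a function may clobber freely. It assumes the Microsoft x64 convention (XMM6 to XMM15 are callee-saved) and, for comparison, the System V AMD64 convention (no XMM register is callee-saved). The names `Abi`, `is_callee_saved_xmm` and `scratch_xmm_count` are hypothetical and exist only for illustration; nothing here changes what the compiler emits.

```cpp
#include <cassert>

// Hypothetical names, for illustration only.
enum class Abi { MicrosoftX64, SystemV };

// True if a function must restore xmm<reg> to its entry value before
// returning. Assumption: Microsoft x64 treats xmm6..xmm15 as callee-saved;
// System V AMD64 treats every XMM register as caller-saved.
inline bool is_callee_saved_xmm(Abi abi, int reg) {
    assert(reg >= 0 && reg <= 15);
    return abi == Abi::MicrosoftX64 && reg >= 6;
}

// Number of XMM registers usable without any save/restore code.
inline int scratch_xmm_count(Abi abi) {
    int count = 0;
    for (int reg = 0; reg < 16; ++reg) {
        if (!is_callee_saved_xmm(abi, reg)) {
            ++count;
        }
    }
    return count;
}
```

Under these assumptions, `is_callee_saved_xmm(Abi::MicrosoftX64, 8)` is true, which is consistent with the save/restore pair around `pxor xmm8, xmm8` in the listing above.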

Sergey Kostrov

>>...Not yet. I think the "problem" is that the compiler keeps register values to conform with the calling convention...

I think this is the only solution at the moment ( I mean a pure-assembler implementation ). For example, the TBB library has the same issue, and a small set of its functions is implemented in pure assembler.

Note: the __declspec( naked ) directive would not help either, because it is not supported on 64-bit platforms.

jimdempseyatthecove

I suggest you write the function as a separate function in C++, compile it with an assembler listing, and edit the listing to remove the code you think is unnecessary. Then remove the .cpp file from the project and add the .asm file in its place (you may need to adjust the solution or makefile to generate the .obj from the .asm). I've had to rely on this myself when inline assembler did not support the code I wanted.

Jim Dempsey

www.quickthreadprogramming.com
