movq doesn't work, it never moves the data


I am writing a function in assembly and assembling it with MASM. The code is as follows:

mcasm_j_sse2_10bits proc syscall ;u8 *ref_y, u8 *curr_y, s32 ystepz, s32 curr_wh


mov ebp, esp

sub esp, 32*24+80+4

and esp, 0fffffff0h

mov dword ptr[esp+32*24+80], ebp

mov esi, edx ;back of the curr_y

mov edx, esp

push ebx ;curr_wh

push eax ;ystep

call mcasm_vext_sse2_10bits

mov edx, esi
add esi, 8

mov ecx, esp


movq xmm0, mmword ptr[ecx+ 0]

movq xmm1, mmword ptr[ecx+ 32]

movq xmm2, mmword ptr[ecx+32*2]

movq xmm3, mmword ptr[ecx+32*3]

movq xmm4, mmword ptr[ecx+32*4]

movq xmm5, mmword ptr[ecx+32*5]

outputmmy ecx, 32

outputxmmx xmm0

outputxmmx xmm1

outputxmmx xmm2

outputxmmx xmm3

outputxmmx xmm4

The function first reserves a large buffer on the stack, then calls mcasm_vext_sse2_10bits to write data into that buffer. Next, it reads the buffer into the xmm registers using movq. But I found that the movq fails: the xmm registers are zero. I printed the buffer at ecx and the registers xmm0-xmm4 to the screen. The data at ecx is right, but xmm0-xmm4 are all zero, so the movq failed.
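To check the load in isolation from the rest of the routine, the same operation can be reproduced with SSE2 intrinsics. The sketch below assumes an x86 or x86-64 compiler with `<emmintrin.h>`. The helper names `load_low_qword`, `dump_lanes` and `demo_movq_load` are made up for illustration and are not part of the code above. `_mm_loadl_epi64` corresponds to `movq xmm, m64`: it loads 8 bytes into the low half of the register and zeroes the upper half.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (x86 / x86-64 only)
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical helper: load 8 bytes the way `movq xmm, m64` does.
// The low 64 bits receive the data; the upper 64 bits are zeroed.
inline __m128i load_low_qword(const void* src) {
    return _mm_loadl_epi64(static_cast<const __m128i*>(src));
}

// Hypothetical helper: copy all 128 bits out to memory so they can be
// inspected without depending on a register surviving a later call.
inline void dump_lanes(__m128i v, std::uint16_t out[8]) {
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), v);
}

// Small demonstration: fill a buffer, load its first 8 bytes, print the lanes.
inline void demo_movq_load() {
    std::uint16_t buffer[16];
    for (int i = 0; i < 16; ++i) {
        buffer[i] = static_cast<std::uint16_t>(0x100 + i);
    }

    std::uint16_t lanes[8];
    dump_lanes(load_low_qword(buffer), lanes);

    for (int i = 0; i < 8; ++i) {
        std::printf("%4x ", static_cast<unsigned>(lanes[i]));
    }
    std::printf("\n");

    assert(lanes[0] == 0x100 && lanes[3] == 0x103);  // low 64 bits hold the data
    assert(lanes[4] == 0 && lanes[7] == 0);          // upper 64 bits are zero
}
```

Calling `demo_movq_load()` from `main` prints the eight 16-bit lanes. If the lanes look right here but wrong in the assembly version, the load itself is probably not the problem, and the code that runs between the load and the print is worth a closer look.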
The macro outputxmmx prints the data in an xmm register.
The macro outputmmy x,y prints the buffer data to the screen. It is defined as follows:

outputmmy macro xxx, yyy

push eax

push edx

push ecx

lea ecx, [xxx]

mov edx, yyy

call @printmemory@8

pop ecx

pop edx

pop eax

endm

The @printmemory@8 is defined in a .cpp file as follows:

extern "C" __declspec(noinline) void FstCllCvsn printmemory(s16 *psrc, int len)
{
    printf("srcaddr: %10x len %6d\n", psrc, len);

    for(int i=0;i<len;i++)
    {
        printf("%8x ", psrc[i]);
    }
}






I found something even stranger. When the asm code accesses the stack, some other xmm registers are set to zero automatically. For example:

outputxmmx xmm5

outputxmmx xmm6

pmaddwd xmm1, xmmword ptr[eax+16] ;mul (1,-5)

outputxmmx xmm5

xmm5 is originally 0x25660000eb650000c3ec0000f8660000, but after the instruction "pmaddwd xmm1, xmmword ptr[eax+16] ;mul (1,-5)" it becomes zero. eax points to memory on the stack.
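For reference, pmaddwd writes only its destination operand, so on its own it should not touch xmm5. The sketch below shows the same operation through the `_mm_madd_epi16` intrinsic. It assumes the comment "mul (1,-5)" refers to a repeating (1, -5) coefficient pair, which is a guess. The helper name `madd_1_minus5` is made up for illustration.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (x86 / x86-64 only)
#include <cassert>
#include <cstdint>

// Hypothetical helper: multiply eight 16-bit samples by repeating (1, -5)
// coefficient pairs and add each adjacent pair of products, as pmaddwd does.
// Only `sums` is written; no other value is modified.
inline void madd_1_minus5(const std::int16_t samples[8], std::int32_t sums[4]) {
    // _mm_set_epi16 lists lanes from highest to lowest, so lane 0 is 1
    // and lane 1 is -5 (assumed layout, not taken from the original code).
    const __m128i coeff = _mm_set_epi16(-5, 1, -5, 1, -5, 1, -5, 1);
    const __m128i data =
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(samples));
    const __m128i result = _mm_madd_epi16(data, coeff);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(sums), result);
}
```

Each output is `samples[2*i] * 1 + samples[2*i + 1] * -5`. Since the instruction has a single destination, a register that changes around it is more likely being overwritten by whatever runs between the two printouts.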

Best Reply

Can you post the outputxmmx macro?
Keep in mind that xmm registers xmm0-xmm5 are volatile (caller-saved), so their values might change during your function call to printf.

EXTRN ?print128u8@@YAXT__m128i@@@Z:PROC

outputxmmx macro xx

sub esp, 16

movdqu [esp], xmm0

push eax

push ecx

push edx

movdqa xmm0, xx

call ?print128u8@@YAXT__m128i@@@Z

pop edx

pop ecx

pop eax

movdqu xmm0, [esp]

add esp, 16

endm


The ?print128u8@@YAXT__m128i@@@Z is defined in a .cpp file as follows:

void print128u8(__m128i p)
{
    u16 ds[8];

    u8 *pp=(u8*)ds;

    _mm_storeu_si128((__m128i*)ds, p);

    for(int i=0;i<16;i++)
    {
        printf(" %4x", pp[i]);
    }
}
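To test the dump routine separately from the assembly, a variant that returns the text instead of printing it can be compared against a known value. This is only a sketch: `format128u8` is a hypothetical name, and it assumes an x86 or x86-64 compiler with SSE2 support. It uses the same " %4x" field per byte as print128u8.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (x86 / x86-64 only)
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical variant of print128u8: formats the 16 bytes of `p`
// (lowest byte first) as " %4x" fields and returns the text.
inline std::string format128u8(__m128i p) {
    unsigned char bytes[16];
    // Spill the register to memory first; everything after this line
    // works on the memory copy, not on the register.
    _mm_storeu_si128(reinterpret_cast<__m128i*>(bytes), p);

    std::string text;
    char field[8];
    for (int i = 0; i < 16; ++i) {
        std::snprintf(field, sizeof field, " %4x",
                      static_cast<unsigned>(bytes[i]));
        text += field;
    }
    return text;
}
```

For example, `format128u8(_mm_setzero_si128())` yields sixteen fields that each contain a single 0. Because the value is copied to memory before any formatting call, the result does not depend on which xmm registers the C runtime overwrites.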


