I am trying to "port" my Java special functions class to pure x86 assembly. In my project I use the SSE and SSE2 instruction sets, operating on floating-point REAL4 values. I would like to use the movaps instruction because of timing (lower CPI than movups), but my program crashes with an "access violation" error. While debugging I found that the error is caused by a movaps instruction trying to access stack values local to the procedure (addressed by ebp-n); the offsets subtracted from ebp are multiples of 16. When I use movups the problem is absent. I tried adding an align 16 directive, but it does not work, so I am stuck with the less efficient instruction.
Here is my code snippet, which calculates the first few terms of the Taylor expansion of e^x.
movaps xmm0,one            ;movaps works perfectly while accessing memory
addps  xmm0,argument       ;1+x xmm0 accumulator
mov    eax,OFFSET coef1
movaps xmm1,[eax]
rcpps  xmm2,xmm1           ;1/coef1
movaps xmm3,argument
mulps  xmm3,xmm3           ;x^2
movups [ebp-16],xmm3       ;store x^2 ;here movaps crashes program
mulps  xmm2,xmm3
addps  xmm0,xmm2           ;1+x+x^2/2! xmm0 accumulator
mov    eax,OFFSET coef2
movups xmm1,[eax]
rcpps  xmm2,xmm1           ;1/coef2
movups xmm7,argument
movups xmm3,[ebp-16]
mulps  xmm3,xmm7           ;x^3
movups [ebp-32],xmm3       ;store x^3
mulps  xmm2,xmm3
addps  xmm0,xmm2           ;1+x+x^2/2!+x^3/3! xmm0 accumulator
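
For what it is worth, movaps requires its memory operand to be 16-byte aligned, and subtracting a multiple of 16 from ebp only gives an aligned address if ebp itself is aligned. A minimal sketch of how the address of the local could be checked just before the store, assuming MASM syntax; the label `misaligned` is hypothetical and not part of my code:

```asm
lea    eax,[ebp-16]        ;address of the local where movaps crashes
test   eax,0Fh             ;low four bits are zero only for a 16-byte aligned address
jnz    misaligned          ;hypothetical label: taken when the address is NOT aligned
```

If the jump is taken, the value of ebp on entry to the procedure is not a multiple of 16, which would be consistent with movups working and movaps faulting.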