Consecutive load operations results problem

I am doing consecutive load operations using _mm_loadu_si128() in my application. The two loads use addresses of the form m1+len+h. The first load is xm1=_mm_loadu_si128(m1+16-1), and the second is xm2=_mm_loadu_si128(m1+16+0). I expect xm1 and xm2 to be identical, except for m128i_i8[15], once xm1 is shifted left by 1. But the result is something else: none of the 8-bit elements are the same between xm1 and xm2. Is this related to memory address alignment? _mm_loadu_si128() is supposed to work for unaligned addresses as well.

Hoping for some quick suggestions on this.




Is m1 a pointer to __m128i? If so, pointer arithmetic is scaled by the size of the pointed-to type, so adding 15 to m1 points to an address offset from m1 by sizeof(__m128i)*15 = 16*15 bytes, not 15 bytes.

I think that the following is what you want:

xm1 = _mm_loadu_si128((__m128i*) (((char*)m1)+16-1));
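
To check the idea, here is a minimal sketch, assuming m1 refers to a buffer of at least 32 readable bytes. The helper names load_at_byte_offset and count_overlap_matches are made up for illustration and are not part of any library. The offset is applied to a byte pointer first, and only then is the result cast to __m128i*.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_loadu_si128, _mm_storeu_si128 */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: unaligned 16-byte load that starts byte_offset
 * BYTES past base. The addition is done on a byte pointer, so it is not
 * scaled by sizeof(__m128i). */
static __m128i load_at_byte_offset(const void *base, size_t byte_offset)
{
    const uint8_t *p = (const uint8_t *)base + byte_offset;
    return _mm_loadu_si128((const __m128i *)p);
}

/* Hypothetical check: loads 16 bytes at offsets 15 and 16 and counts how
 * many of the 15 overlapping bytes agree (byte i+1 of xm1 against byte i
 * of xm2). buf must have at least 32 readable bytes. */
static int count_overlap_matches(const uint8_t *buf)
{
    uint8_t a[16], b[16];
    int i, matches = 0;

    __m128i xm1 = load_at_byte_offset(buf, 16 - 1);
    __m128i xm2 = load_at_byte_offset(buf, 16 + 0);

    /* Store to plain byte arrays so the lanes can be compared portably. */
    _mm_storeu_si128((__m128i *)a, xm1);
    _mm_storeu_si128((__m128i *)b, xm2);

    for (i = 0; i < 15; ++i)
        matches += (a[i + 1] == b[i]);
    return matches;
}
```

With byte offsets, the two loads cover bytes 15..30 and 16..31 of the buffer, so all 15 overlapping bytes should agree. With m1 typed as __m128i*, the original expressions would instead read from byte offsets 240 and 256, which do not overlap at all.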

Kind regards


Hello Thomas,

Thanks a lot for that insight.

With Regards,

