I am trying to learn my way around the SSE compiler intrinsics, but I am getting some errors I can't figure out. Hopefully someone can point me in the right direction.
I have two arrays of 16-bit integers that I am trying to multiply element by element. I want to compare the speed of a standard for loop to a loop written with intrinsics, so here is how I am declaring the arrays:
short *array1 = new ( _mm_malloc( sizeof(short[length1/2]), 16)) short;
where length1 is the byte length of the file I will be reading the array from. I am declaring two more arrays identically...
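For comparison, here is a minimal sketch of a more conventional way to get a 16-byte-aligned buffer of shorts, without placement new. The helper names alloc_aligned_shorts and free_aligned_shorts are made up for illustration, and it assumes GCC or Clang on x86, where <emmintrin.h> also brings in _mm_malloc and _mm_free:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics; on GCC/Clang this also declares _mm_malloc/_mm_free

// Hypothetical helper: allocate room for `count` shorts on a 16-byte boundary.
// Returns nullptr if the allocation fails.
short* alloc_aligned_shorts(std::size_t count) {
    void* raw = _mm_malloc(count * sizeof(short), 16);
    // Sanity check: a successful allocation must be 16-byte aligned.
    assert(raw == nullptr || reinterpret_cast<std::uintptr_t>(raw) % 16 == 0);
    return static_cast<short*>(raw);
}

// Hypothetical helper: memory from _mm_malloc must be released with _mm_free,
// not with delete or free.
void free_aligned_shorts(short* p) {
    _mm_free(p);
}
```

With the declaration above, this would be used as something like short *array1 = alloc_aligned_shorts(length1 / 2); followed later by free_aligned_shorts(array1);.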
When it comes time to do the SSE math, here is my code:
for (int i = 0; i < max_index; i=i+8)
{
    a = _mm_load_si128( (__m128 *) array1 + i*2);
    b = _mm_load_si128( (__m128 *) array2 + i*2);
    c = _mm_mullo_epi16( a, b);
    _mm_store_si128( (__m128 *) out2 + i*2, c);
}
Every time I try to read or write the value of a, b, or c, it points to the variable and gives the error:
no suitable constructor exists to convert from "__m128" to "__m128"
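For reference, here is a minimal sketch of how a loop like this is usually written with the integer vector type. The SSE2 integer intrinsics (_mm_load_si128, _mm_mullo_epi16, _mm_store_si128) take and return __m128i, whereas __m128 is the four-float type. The function name mul_epi16_sse2 and its parameters are made up for illustration, and the sketch assumes all three buffers are 16-byte aligned and the element count is a multiple of 8. Note also that in (__m128 *) array1 + i*2 the cast is applied before the addition, so the offset is counted in 16-byte vectors rather than in shorts; the sketch offsets the short pointer first and then casts.

```cpp
#include <cassert>
#include <cstddef>
#include <emmintrin.h>  // SSE2: __m128i, _mm_load_si128, _mm_mullo_epi16, _mm_store_si128

// Hypothetical helper: out[i] = low 16 bits of a[i] * b[i] for i in [0, n).
// Assumes a, b and out are 16-byte aligned and n is a multiple of 8.
void mul_epi16_sse2(const short* a, const short* b, short* out, std::size_t n) {
    assert(n % 8 == 0);
    for (std::size_t i = 0; i < n; i += 8) {
        // Offset in units of short first, then cast to the integer vector type.
        __m128i va = _mm_load_si128(reinterpret_cast<const __m128i*>(a + i));
        __m128i vb = _mm_load_si128(reinterpret_cast<const __m128i*>(b + i));
        // Eight 16-bit multiplies, keeping the low 16 bits of each product.
        __m128i vc = _mm_mullo_epi16(va, vb);
        _mm_store_si128(reinterpret_cast<__m128i*>(out + i), vc);
    }
}
```

With the arrays from the question this would be called as mul_epi16_sse2(array1, array2, out2, max_index), again assuming max_index is a multiple of 8.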