errors with SSE data types

psheph

I am trying to learn my way around the SSE compiler intrinsics, but I am getting some errors I can't figure out. Hopefully someone can point me in the right direction.

I have two arrays of 16-bit integers that I am trying to multiply element by element. I want to compare the speed of a standard for loop to a loop written with intrinsics, so here is how I am declaring the arrays:

short *array1 = new ( _mm_malloc( sizeof(short[length1/2]), 16)) short;

where length1 is the byte length of the file I will be reading the array from. I am declaring two more arrays identically...

When it comes time to do the SSE math, here is my code:

__m128 a,b,c;

for (int i = 0; i < max_index; i=i+8)
{
a = _mm_load_si128( (__m128 *) array1 + i*2);
b = _mm_load_si128( (__m128 *) array2 + i*2);

c = _mm_mullo_epi16( a, b);

_mm_store_si128( (__m128 *) out2 + i*2, c);
}

Every time I try to read or write the value of a, b, or c, the compiler points to the variable and gives the error:

no suitable constructor exists to convert from "__m128" to "__m128"

Any suggestions?

Thanks,
Paul

lionelk

Hello psheph,
Did you #include "xmmintrin.h"? According to the information I received, you need this header to use the intrinsics (I only use SSE, though, so you might need a different one for MMX). Also, see my post "usage of _mm_prefetch(...)" on page 3. Although it deals with prefetching, it has a couple of lines of SSE intrinsic code that seem to do something similar to what you are trying to do, and I know they work.

Are you sure you should have that "i*2" at the end of your "a" and "b" intrinsic data type declarations?
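
To make the "i*2" question concrete, here is a minimal sketch of where the two ways of writing the address end up. The function names are hypothetical, and the sketch uses __m128i, although __m128 is also 16 bytes wide and gives the same offsets. Casting first and then adding moves in steps of 16 bytes, while adding to the short pointer first moves in steps of 2 bytes.

```cpp
#include <emmintrin.h>  // SSE2 header, declares __m128i
#include <cstddef>

// Byte offset reached by the expression used in the original loop:
//   (__m128i *) array + i*2
// The cast binds first, so the addition advances in 16-byte steps.
std::ptrdiff_t offset_cast_then_add(const short* array, int i)
{
    const __m128i* p = reinterpret_cast<const __m128i*>(array) + i * 2;
    return reinterpret_cast<const char*>(p) - reinterpret_cast<const char*>(array);
}

// Byte offset reached by offsetting in shorts first and casting afterwards:
//   (__m128i *) (array + i)
std::ptrdiff_t offset_add_then_cast(const short* array, int i)
{
    const __m128i* p = reinterpret_cast<const __m128i*>(array + i);
    return reinterpret_cast<const char*>(p) - reinterpret_cast<const char*>(array);
}
```

For i = 8 (the second loop iteration) the first form lands 256 bytes into the array and the second lands 16 bytes in, which is where the next group of eight shorts starts.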

bronx

At first sight your code looks strange in a few places. For example, the new that calls _mm_malloc: _mm_malloc is not an allocator to use with placement new; it already returns a pointer. Just do something like:

short *array1 = (short *)_mm_malloc(sizeof(short)*length,16);

The prototype for "_mm_load_si128" uses "__m128i" for its parameters, so including <emmintrin.h> and replacing __m128 with __m128i should at least solve your compilation issues.
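
Putting these suggestions together, a minimal sketch of the multiply loop might look like the following. It is not the original poster's final code: the function names multiply_sse2 and multiply_scalar are made up for illustration, and the sketch assumes all three buffers are 16-byte aligned (for example, allocated with _mm_malloc as above) and that the element count is a multiple of 8.

```cpp
#include <emmintrin.h>  // SSE2: __m128i, _mm_load_si128, _mm_mullo_epi16, _mm_store_si128
#include <cassert>
#include <cstddef>

// Element-wise multiply of two arrays of 16-bit integers, 8 elements per step.
// Assumption: in1, in2 and out are 16-byte aligned and n is a multiple of 8.
void multiply_sse2(const short* in1, const short* in2, short* out, std::size_t n)
{
    assert(n % 8 == 0);
    for (std::size_t i = 0; i < n; i += 8) {
        // Offset in shorts first, then cast: one __m128i holds 8 shorts.
        __m128i a = _mm_load_si128(reinterpret_cast<const __m128i*>(in1 + i));
        __m128i b = _mm_load_si128(reinterpret_cast<const __m128i*>(in2 + i));
        __m128i c = _mm_mullo_epi16(a, b);  // keeps the low 16 bits of each product
        _mm_store_si128(reinterpret_cast<__m128i*>(out + i), c);
    }
}

// Plain loop for the speed comparison; also keeps only the low 16 bits.
void multiply_scalar(const short* in1, const short* in2, short* out, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<short>(in1[i] * in2[i]);
}
```

Both functions should produce the same output for the same input, so the scalar version can double as a correctness check before timing the two.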

Yaroslav Morkovnikov (Intel)

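Try
typedef union {
int in[2];
__m128i out; // __m128i, because _mm_load_si128 works on integer vectors
} cvt;

cvt convert_tmp;

convert_tmp.in[0] = ...
convert_tmp.in[1] = ...
a = _mm_load_si128(&convert_tmp.out); // takes a pointer; a must be declared as __m128i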
