IF-ELSE condition using MMX technology intrinsics

I need help: _mm_mullo_pi16() cannot multiply large numbers. Any suggestions on what I should do? How do I multiply two 1-D arrays that hold large values (positive and negative)? x0, b, b1, s0 and s1 are vectors (arrays); _f, Adjust[] and _scale are scalar integers.

// C++ code

int x0 = s0 + s1;

if (x0 < 0)
    b[j] = (short)(-((((-x0) * Adjust[_qm]) + _f) >> _scale));
else
    b[j] = (short)(((x0 * Adjust[_qm]) + _f) >> _scale);

// MMX intrinsic code

__m64*b1 = (__m64*)b;

__m64 s0,s1,s2,s3,x0;

j=0;

__m64 r0,r1,t0,t1,t2,p0,p1;

r0 =_mm_set_pi16(Adjust[_qm],Adjust[_qm],Adjust[_qm],Adjust[_qm]);

r1 =_mm_set_pi16(_f,_f,_f,_f);

x0 =_mm_add_pi16(s0,s1);

t1 =_mm_cmpgt_pi16(_mm_set1_pi16(0),x0);

t2 = _mm_mullo_pi16((_mm_sub_pi16(_mm_setzero_si64(),x0)),r0);

t0 = _mm_mullo_pi16(x0,r0);

p0 =_mm_srai_pi16(_mm_add_pi16(t0,r1),_scale );

p1 =_mm_srai_pi16(_mm_add_pi16(t2,r1),_scale );

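// t1 is all ones where x0 < 0: take -p1 in those lanes and p0 elsewhere, as in the C++ code
b1[j] =_mm_or_si64(_mm_and_si64(t1,_mm_sub_pi16(_mm_setzero_si64(),p1)),_mm_andnot_si64(t1,p0));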

What exactly are you trying to do (in C)?
There is also a mulhi (_mm_mulhi_pi16) for the upper 16 bits of each product.

x0 = {21000, 23000, -19000, -14000}
r0 = {11912, 11912, 11912, 11912}

What I want is to multiply x0 and r0 using the __m64 data type, i.e. using _mm_mullo_pi16() and _mm_mulhi_pi16().
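
To see why the low half alone is not enough for these values, here is a minimal scalar sketch in C of what the two multiply intrinsics keep in each 16-bit lane. The helper names mul_lo16, mul_hi16 and combine32 are made up for illustration and do not come from any intrinsic header; the code assumes the usual two's-complement conversions.

```c
#include <stdint.h>

/* Low 16 bits of the 32-bit product: the part _mm_mullo_pi16 keeps per lane. */
static uint16_t mul_lo16(int16_t a, int16_t b)
{
    return (uint16_t)((uint32_t)((int32_t)a * (int32_t)b) & 0xFFFFu);
}

/* High 16 bits of the 32-bit product: the part _mm_mulhi_pi16 keeps per lane. */
static uint16_t mul_hi16(int16_t a, int16_t b)
{
    return (uint16_t)((uint32_t)((int32_t)a * (int32_t)b) >> 16);
}

/* Reassemble the full signed 32-bit product from its two halves
   (hypothetical helper; relies on two's-complement wraparound). */
static int32_t combine32(uint16_t lo, uint16_t hi)
{
    return (int32_t)(((uint32_t)hi << 16) | lo);
}
```

For 21000 and 11912 the true product is 250152000, far outside the 16-bit range, so neither half is meaningful on its own; combine32(mul_lo16(21000, 11912), mul_hi16(21000, 11912)) recovers the full value.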

Each result of your computation requires 32 bits. If you compute the lower and the upper bits separately, this requires twice the number of multiplications. Instead, you can use SSE, where you can do four 32-bit multiplications in one operation:

__m128i x1 = _mm_loadl_epi64((const __m128i*)&x0); // loads only the 8 bytes that are used (four 16-bit values)
__m128i r1 = _mm_loadl_epi64((const __m128i*)&r0); // loads only the 8 bytes that are used (four 16-bit values)
__m128i x_sse = _mm_cvtepi16_epi32(x1); // convert lower 4 16-bit values to 32-bit values with sign-extension
__m128i r_sse = _mm_cvtepi16_epi32(r1); // convert lower 4 16-bit values to 32-bit values with sign-extension
__m128i res_sse = _mm_mullo_epi32(x_sse, r_sse); // multiply 4 signed 32-bit values, keeping the low 32 bits of each product

It seems MS Visual Studio 2008 does not recognise _mm_mullo_epi32(), only _mm_mullo_epi16(). Likewise, _mm_cvtepi16_epi32() is not recognised for __m128i data types. How do I proceed?

Make sure smmintrin.h is included. These are SSE4.1 instructions. MS VS 2008 supports them.

Thanks for all the help. My processor supports only MMX, SSE, SSE2, SSE3, SSSE3 and EM64T instructions; SSE4.x is not supported. How can I do the multiplication with the available instructions?

You can compute the lower 16 bits and the upper 16 bits of the 32-bit results separately. Afterwards, you will need to interleave them in order to get the full 32-bit results. Something like this should work:

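__m128i hi = _mm_mulhi_epi16(a, b);   // upper 16 bits of each signed product
__m128i lo = _mm_mullo_epi16(a, b);   // lower 16 bits of each product
__m128i r0 = _mm_unpacklo_epi16(lo, hi); // interleave: products of elements 0..3
__m128i r1 = _mm_unpackhi_epi16(lo, hi); // interleave: products of elements 4..7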

a and b contain 8 16-bit values that you would like to multiply. r0 contains the first 4 32-bit results; r1 contains the remaining 4 32-bit results. These instructions come with the SSE2 instruction set.
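
Putting the pieces of this thread together, the following is a rough SSE2-only sketch of the original IF-ELSE computation for eight values at a time. It is one possible approach, not a tested drop-in: the function name quantize8_sse2 is hypothetical, and it assumes that the Adjust value is positive and fits in 16 bits, that no input equals -32768, and that saturating the result to 16 bits is acceptable (the scalar code truncates with a (short) cast instead).

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: out[i] = sign(x[i]) * ((|x[i]| * adjust + f) >> scale)
   for 8 lanes. Assumes adjust > 0 and x[i] != -32768. */
static void quantize8_sse2(const int16_t *x, int16_t adjust, int32_t f,
                           int scale, int16_t *out)
{
    __m128i vx = _mm_loadu_si128((const __m128i *)x);
    __m128i va = _mm_set1_epi16(adjust);
    __m128i vf = _mm_set1_epi32(f);

    /* Mask that is all ones in lanes where x < 0 (the "if" condition). */
    __m128i neg = _mm_cmpgt_epi16(_mm_setzero_si128(), vx);
    /* Branch-free absolute value: (x ^ mask) - mask. */
    __m128i ax = _mm_sub_epi16(_mm_xor_si128(vx, neg), neg);

    /* Full 32-bit products from the low and high 16-bit halves. */
    __m128i lo = _mm_mullo_epi16(ax, va);
    __m128i hi = _mm_mulhi_epi16(ax, va);
    __m128i p0 = _mm_unpacklo_epi16(lo, hi);  /* products 0..3 */
    __m128i p1 = _mm_unpackhi_epi16(lo, hi);  /* products 4..7 */

    /* Add the rounding term and shift right in 32-bit precision. */
    __m128i cnt = _mm_cvtsi32_si128(scale);
    p0 = _mm_sra_epi32(_mm_add_epi32(p0, vf), cnt);
    p1 = _mm_sra_epi32(_mm_add_epi32(p1, vf), cnt);

    /* Narrow back to 16 bits (saturating), then restore the sign. */
    __m128i r = _mm_packs_epi32(p0, p1);
    r = _mm_sub_epi16(_mm_xor_si128(r, neg), neg);
    _mm_storeu_si128((__m128i *)out, r);
}
```

With the values from this thread (x = 21000, 23000, -19000, -14000 and Adjust = 11912) and made-up parameters f = 32768 and scale = 16, the first four outputs are 3817, 4181, -3453 and -2545. Taking the absolute value first mirrors the -x0 branch of the scalar code, so a single multiply path serves both cases and no blend of two results is needed.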
