Intrinsic guide 2.6 error in documentation

gilgil:

In the documentation for the intrinsic _mm_mulhrs_epi16, the right shift should be 15, not 14.

Patrick Konsor (Intel):

14 bits is correct. See the Instruction Set Reference in the Software Developer's Manual: http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html

PMULHRSW (with 128-bit operand)

temp0[31:0] = INT32 ((DEST[15:0] * SRC[15:0]) >>14) + 1;
temp1[31:0] = INT32 ((DEST[31:16] * SRC[31:16]) >>14) + 1;
temp2[31:0] = INT32 ((DEST[47:32] * SRC[47:32]) >>14) + 1;
temp3[31:0] = INT32 ((DEST[63:48] * SRC[63:48]) >>14) + 1;
temp4[31:0] = INT32 ((DEST[79:64] * SRC[79:64]) >>14) + 1;
temp5[31:0] = INT32 ((DEST[95:80] * SRC[95:80]) >>14) + 1;
temp6[31:0] = INT32 ((DEST[111:96] * SRC[111:96]) >>14) + 1;
temp7[31:0] = INT32 ((DEST[127:112] * SRC[127:112]) >>14) + 1;

DEST[15:0] = temp0[16:1];
DEST[31:16] = temp1[16:1];
DEST[47:32] = temp2[16:1];
DEST[63:48] = temp3[16:1];
DEST[79:64] = temp4[16:1];
DEST[95:80] = temp5[16:1];
DEST[111:96] = temp6[16:1];
DEST[127:112] = temp7[16:1];

gilgil:

I still do not understand...

I tried the following piece of code:
float factor = 1.f;
__m128i vFactor = _mm_set1_epi16((short)(factor * (1 << 14))); // fixed point with 14 fractional bits
__m128i inputVec = _mm_set_epi16(32, 54, 124, 75, 35, 235, 244, 36);
__m128i resultVec = _mm_mulhrs_epi16(inputVec, vFactor);

By your explanation I should get resultVec = inputVec, but the result elements are actually half the original values.

sirrida:

If you carefully read the documentation you will notice an additional hidden shift by 1.
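The temp*[16:1] can be read as (temp*[31:0]>>1)[15:0], so the total right shift is 15 and the +1 before it rounds the result. A factor of 1<<14 therefore represents 0.5, which is why your results are halved.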

It might make sense to make the documentation more explicit about this.

gilgil:

I agree, the documentation for this function could be clearer.
