Intrinsic guide 2.6 error in documentation

gilgil's picture

In the documentation for the intrinsic _mm_mulhrs_epi16, the right shift should be 15, not 14.

Patrick Konsor (Intel)'s picture

14 bits is correct. See the Instruction Set Reference in the Software Developer's Manual: http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html

PMULHRSW (with 128-bit operand)

temp0[31:0] = INT32 ((DEST[15:0] * SRC[15:0]) >>14) + 1;
temp1[31:0] = INT32 ((DEST[31:16] * SRC[31:16]) >>14) + 1;
temp2[31:0] = INT32 ((DEST[47:32] * SRC[47:32]) >>14) + 1;
temp3[31:0] = INT32 ((DEST[63:48] * SRC[63:48]) >>14) + 1;
temp4[31:0] = INT32 ((DEST[79:64] * SRC[79:64]) >>14) + 1;
temp5[31:0] = INT32 ((DEST[95:80] * SRC[95:80]) >>14) + 1;
temp6[31:0] = INT32 ((DEST[111:96] * SRC[111:96]) >>14) + 1;
temp7[31:0] = INT32 ((DEST[127:112] * SRC[127:112]) >>14) + 1;

DEST[15:0] = temp0[16:1];
DEST[31:16] = temp1[16:1];
DEST[47:32] = temp2[16:1];
DEST[63:48] = temp3[16:1];
DEST[79:64] = temp4[16:1];
DEST[95:80] = temp5[16:1];
DEST[111:96] = temp6[16:1];
DEST[127:112] = temp7[16:1];
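
For readers who want to check individual values by hand, the pseudocode above can be sketched as a scalar C model of a single 16-bit lane. This is only an illustration, not Intel's reference code: the function name mulhrs_lane is made up here, and the sketch assumes that right-shifting a negative signed integer is an arithmetic shift and that narrowing to int16_t wraps, which holds on mainstream compilers but is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit lane of PMULHRSW,
 * written to follow the manual's pseudocode step by step.
 * Assumes arithmetic right shift for negative values. */
static int16_t mulhrs_lane(int16_t dest, int16_t src)
{
    /* Full signed 32-bit product of the two 16-bit inputs. */
    int32_t product = (int32_t)dest * (int32_t)src;

    /* temp[31:0] = (product >> 14) + 1 */
    int32_t temp = (product >> 14) + 1;

    /* DEST = temp[16:1]: drop bit 0, then keep the next 16 bits. */
    return (int16_t)(temp >> 1);
}
```

With this model, mulhrs_lane(32, 1 << 14) evaluates to 16, because the final bit selection discards one more bit after the shift by 14.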

gilgil's picture

I still do not understand...

I tried the following code:
float factor = 1.f;
__m128i vFactor = _mm_set1_epi16((short)(factor * (1 << 14))); // fixed point, scaled by 2^14
__m128i inputVec = _mm_set_epi16(32, 54, 124, 75, 35, 235, 244, 36);
__m128i resultVec = _mm_mulhrs_epi16(inputVec, vFactor);

By your explanation I should get resultVec = inputVec, but the result elements are actually half the original values.

sirrida's picture

If you carefully read the documentation you will notice an additional hidden shift by 1.
The temp*[16:1] can be read as (temp*[31:0]>>1)[15:0].

It might make sense to make the documentation more explicit about this.
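
To make the hidden shift concrete, here is a minimal scalar sketch in C. It is an illustration only: the names mulhrs_two_step, mulhrs_one_step and to_q15 are hypothetical, and the code assumes arithmetic right shift of negative values and wrapping on narrowing to int16_t. The two-step form mirrors the manual's pseudocode, the one-step form rounds and shifts by 15 in a single expression, and to_q15 shows the consequence for the scale factor: the multiplier is effectively a Q15 value, so 1 << 14 represents 0.5 and the closest representable value to 1.0 is 32767.

```c
#include <stdint.h>

/* Two-step form, as in the manual: shift by 14, add 1, take bits [16:1]. */
static int16_t mulhrs_two_step(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;
    return (int16_t)(((p >> 14) + 1) >> 1);
}

/* One-step form: add half of 2^15 for rounding, then shift by 15. */
static int16_t mulhrs_one_step(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;
    return (int16_t)((p + 0x4000) >> 15);
}

/* Hypothetical helper: convert a factor to Q15, saturating at the
 * int16_t limits because 1.0 * 2^15 does not fit in 16 signed bits. */
static int16_t to_q15(float factor)
{
    float scaled = factor * 32768.0f;
    if (scaled >= 32767.0f) return INT16_MAX;
    if (scaled <= -32768.0f) return INT16_MIN;
    /* Round to nearest before truncating to an integer. */
    return (int16_t)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
}
```

Under these assumptions, a factor built with to_q15(1.0f) leaves the small inputs from the example above unchanged, while a factor of 1 << 14 halves them.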

gilgil's picture

I agree, the documentation for this intrinsic is not the best.
