Bugs in Intrinsics Guide


I'm confused by the operation of _mm512_i32extscatter_epi32.

If you use conv=_MM_DOWNCONV_EPI32_NONE and hint=_MM_HINT_NONE, this intrinsic should be equivalent to _mm512_i32scatter_epi32.

For example, when using _MM_DOWNCONV_EPI32_UINT8, take j=15; then i=480 and n=120, so addr[127:120] := UInt32ToUInt8(v1[511:480]). Are we really using 128-bit addresses? The operation of _mm512_i32scatter_epi32 makes a lot more sense (both operations are quoted below).

Can someone please explain how the operation of _mm512_i32extscatter_epi32 should be read?

Regards, Henk-Jan.

---
void _mm512_i32extscatter_epi32 (void * mv, __m512i index, __m512i v1, _MM_DOWNCONV_EPI32_ENUM conv, int scale, int hint)
Operation:

FOR j := 0 to 15
    addr := MEM[mv + index[j] * scale]
    i := j*32
    CASE conv OF 
        _MM_DOWNCONV_EPI32_NONE: 
            addr[i+31:i] := v1[i+31:i]
        _MM_DOWNCONV_EPI32_UINT8: 
            n := j*8 
            addr[n+7:n] := UInt32ToUInt8(v1[i+31:i])
        _MM_DOWNCONV_EPI32_SINT8:
            n := j*8
            addr[n+7:n] := SInt32ToSInt8(v1[i+31:i])
        _MM_DOWNCONV_EPI32_UINT16:
            n := j*16 
            addr[n+15:n] := UInt32ToUInt16(v1[i+31:i]) 
        _MM_DOWNCONV_EPI32_SINT16: 
            n := j*16 
            addr[n+15:n] := SInt32ToSInt16(v1[n+15:n]) 
    ESAC 
ENDFOR 

---
void _mm512_i32scatter_epi32 (void* base_addr, __m512i vindex, __m512i a, int scale)
Operation: 

FOR j := 0 to 15
    i := j*32 
    MEM[base_addr + SignExtend(vindex[i+31:i])*scale] := a[i+31:i] 
ENDFOR
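
One plausible reading, sketched here as a scalar model in plain C rather than with the intrinsic itself: each element j is converted and then stored at mv + index[j]*scale, and conv only changes how many bytes are written there. Under that reading the bit ranges on addr would not be meant as offsets into an address. The names model_conv and model_i32extscatter_epi32 are hypothetical, and the conversion is assumed to be plain truncation, which the guide's pseudocode does not specify (the real conversion may saturate).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for some of the _MM_DOWNCONV_EPI32_* values. */
enum model_conv { MODEL_CONV_NONE, MODEL_CONV_UINT8, MODEL_CONV_UINT16 };

/* Scalar model (an assumption, not the documented behaviour):
   element j is converted and stored at mv + index[j] * scale.
   Only the width of the store depends on conv. */
static void model_i32extscatter_epi32(void *mv, const int32_t index[16],
                                      const uint32_t v1[16],
                                      enum model_conv conv, int scale)
{
    for (int j = 0; j < 16; j++) {
        unsigned char *dst = (unsigned char *)mv + (ptrdiff_t)index[j] * scale;
        switch (conv) {
        case MODEL_CONV_NONE: {
            /* Full 32-bit store, same as the plain scatter. */
            memcpy(dst, &v1[j], sizeof v1[j]);
            break;
        }
        case MODEL_CONV_UINT8: {
            uint8_t b = (uint8_t)v1[j];   /* assumption: truncation */
            memcpy(dst, &b, sizeof b);    /* 1-byte store */
            break;
        }
        case MODEL_CONV_UINT16: {
            uint16_t h = (uint16_t)v1[j]; /* assumption: truncation */
            memcpy(dst, &h, sizeof h);    /* 2-byte store */
            break;
        }
        }
    }
}
```

With MODEL_CONV_NONE this model does the same thing as the _mm512_i32scatter_epi32 operation quoted above, which matches the expectation that the two intrinsics agree when no conversion is requested.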

When the AVX512_4FMAPS instruction set is selected, one intrinsic is missing: _mm512_4fmadd_ps

That intrinsic is in the tool; it is just missing from the filtered list AVX512_4FMAPS.

The entire set of instructions that use masks as input, such as kmov, kshift, kand and so on, is absent from the guide. I have had to use them several times, and their absence makes that harder.

Quote:

Eden S. (Intel) wrote:

The entire set of instructions that use masks as input, such as kmov, kshift, kand and so on, is absent from the guide.

The guide does describe some intrinsics like _mm512_kmov, _mm512_kand, etc. but they mostly deal with 16-bit masks and are not extracted to a separate category, which would be useful for searching.
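
For anyone searching in the meantime, the 16-bit mask operations are plain bitwise operations on a 16-bit value. A minimal scalar sketch in C, where model_mask16 and the model_k* helpers are hypothetical stand-ins for __mmask16 and the corresponding _mm512_k* intrinsics, not the intrinsics themselves:

```c
#include <stdint.h>

/* Hypothetical stand-in for __mmask16: one bit per 32-bit lane. */
typedef uint16_t model_mask16;

/* Bitwise AND of two masks (models _mm512_kand). */
static model_mask16 model_kand(model_mask16 a, model_mask16 b)
{
    return (model_mask16)(a & b);
}

/* Bitwise OR of two masks (models _mm512_kor). */
static model_mask16 model_kor(model_mask16 a, model_mask16 b)
{
    return (model_mask16)(a | b);
}

/* Bitwise XOR of two masks (models _mm512_kxor). */
static model_mask16 model_kxor(model_mask16 a, model_mask16 b)
{
    return (model_mask16)(a ^ b);
}

/* Bitwise NOT of a mask (models _mm512_knot). */
static model_mask16 model_knot(model_mask16 a)
{
    return (model_mask16)~a;
}

/* NOT of the first mask, then AND with the second (models _mm512_kandn). */
static model_mask16 model_kandn(model_mask16 a, model_mask16 b)
{
    return (model_mask16)(~a & b);
}
```

The casts back to model_mask16 matter because C promotes the operands to int before applying ~, so the upper bits have to be dropped again.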
