Manually control MIC SIMD operations.

I want to manage my code's SIMD operations on MIC manually, so I wrote the intrinsics below:

_k_mask = _mm512_int2mask(0x7ff); // 0000 0111 1111 1111
_tempux2_512 = _mm512_mask_loadunpacklo_ps(_tempux2_512,_k_mask, &u_x[POSITION_INDEX_X(k,j,i-5)]);
_tempux2_512 = _mm512_mask_loadunpackhi_ps(_tempux2_512,_k_mask, &u_x[POSITION_INDEX_X(k,j,i-5)]+16);

The compiler, icpc, gives these error messages:

test.cpp:574: undefined reference to `_mm512_mask_extloadunpacklo_ps'
test.cpp:575: undefined reference to `_mm512_mask_extloadunpackhi_ps'

It compiles fine if I use _mm512_mask_load_ps instead, but my memory cannot be 64-byte aligned, so _mm512_mask_load_ps causes a runtime error.

Then I tried writing an inline asm block by hand, like this:

MOV rax,0x7ff
KMOV k1,rax
VMOVAPS zmm1 {k1}, [data_512_1]
VMOVAPS zmm2 {k1}, [data_512_2]
VMULPS	zmm3 {k1}, zmm2 zmm1
VMOVAPS [data_512_3] {k1}, zmm3

And icpc reports errors again:

test_simd.cpp(30): (col. 10) error: Unknown opcode KMOV in asm instruction .
test_simd.cpp(33): (col. 10) error: Syntax error ZMM1 in asm instruction vmulps.

I'm a beginner in assembly language. I would be really grateful if anyone could tell me why icpc can't find the reference and how to fix it, or could recommend some learning material. (I've read the Intel® Xeon Phi™ Coprocessor Instruction Set Architecture Reference Manual but still don't know how to write this.)

Thanks a lot.

_mm512_mask_extloadunpackhi_ps is not available on KNC. If you can't align your data, use _mm512_i32extgather_ps to load it into a register. I think it would be better to align your data, though: otherwise the processor has to fetch more than one cache line to fill a single register. Alignment is always possible if you pad the memory layout.
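To illustrate the padding idea, here is a minimal sketch in portable C++17. It is not taken from your code: the names padded_row_length, alloc_padded_grid and is_vector_aligned are hypothetical, and it assumes your array is a grid of float rows where every row should start on a 64-byte boundary. Each row is rounded up to a multiple of 16 floats (one 512-bit register), so a 64-byte-aligned base pointer keeps every row start aligned as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// One 512-bit register holds 16 floats, which is exactly one 64-byte cache line.
constexpr std::size_t kAlignBytes = 64;
constexpr std::size_t kFloatsPerVector = kAlignBytes / sizeof(float);

// Round a row length (counted in floats) up to the next multiple of 16.
// Hypothetical helper, named for this sketch only.
constexpr std::size_t padded_row_length(std::size_t n) {
    return (n + kFloatsPerVector - 1) / kFloatsPerVector * kFloatsPerVector;
}

// Allocate rows * padded_row_length(cols) floats on a 64-byte boundary.
// The byte count is a multiple of 64, as std::aligned_alloc requires.
// Release the memory with std::free.
inline float* alloc_padded_grid(std::size_t rows, std::size_t cols) {
    assert(rows > 0 && cols > 0);  // a zero-size request is implementation-defined
    const std::size_t bytes = rows * padded_row_length(cols) * sizeof(float);
    return static_cast<float*>(std::aligned_alloc(kAlignBytes, bytes));
}

// True if p can be used with an aligned 512-bit load or store.
inline bool is_vector_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kAlignBytes == 0;
}
```

With this layout, row r starts at grid + r * padded_row_length(cols), and the padding floats at the end of each row are simply never read back. With icpc, _mm_malloc(bytes, 64) and _mm_free should serve the same purpose as std::aligned_alloc and std::free here. Note that this only aligns row starts; an access at an arbitrary offset such as i-5 inside a row is still unaligned unless the loop is restructured around 16-float blocks.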

