How to use scatter/gather operations on MIC

Hi,

I want to use scatter/gather operations on MIC, but I could not find any example that shows how to use them. I wrote a sample program that scatters an array. If I comment out the "_mm512_i32scatter_ps" call, the program runs without any problem. If I keep the call, I get the error "offload error: process on the device 0 was terminated by signal 11 (SIGSEGV)". Could someone please help me out?

#include <stdio.h>
#include <stdlib.h>
#include <immintrin.h>

int main()
{
    printf("CPU\n");

    #pragma offload target(mic)
    {
        /* Source values 0..15, 64-byte aligned for the 512-bit load. */
        int* a = (int*)_mm_malloc(sizeof(int)*16, 64);
        for (int i = 0; i < 16; i++) {
            a[i] = i;
        }

        /* Scatter indices in reverse order: 15, 14, ..., 0. */
        int* index = (int*)_mm_malloc(sizeof(int)*16, 64);
        for (int i = 0; i < 16; i++) {
            index[i] = 15-i;
        }

        printf("xeonphi\n");

        /* Destination buffer for the scatter. */
        int* b = (int*)_mm_malloc(sizeof(int)*16, 64);
        __m512i index1 = _mm512_load_epi32((void*)index);
        __m512 v1 = _mm512_load_ps((void*)a);
        _mm512_i32scatter_ps((void*)b, index1, v1, 1);
    }
}

Best Reply

Hi,

Please try to define indexes that are aligned to the boundaries of the data type in use. Note that the first argument in

extern void __cdecl _mm512_i32scatter_ps(void* mv, __m512i index, __m512 v1, int scale)

is a byte address, and each element is written to mv + index * scale. With scale set to 1, as in your code, the values in "index" are byte offsets, so they must be multiples of the element size. Try changing

index[i] = 15-i;

to

index[i] = (15-i)*sizeof(int);

Best,
Leo.
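To make the addressing rule concrete, here is a minimal scalar sketch in plain C of how a 16-lane, 32-bit scatter resolves its destinations. It is not the intrinsic itself: the helper names `scatter_model` and `offsets_aligned` are hypothetical, and the model assumes only that lane i is stored at base + index[i] * scale bytes, with scale restricted to 1, 2, 4 or 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LANES 16

/* Hypothetical scalar model of a 16-lane scatter of 32-bit floats:
 * lane i is stored at (char *)base + index[i] * scale. */
static void scatter_model(void *base, const int32_t index[LANES],
                          const float v[LANES], int scale)
{
    /* Assumption of this model: scale is 1, 2, 4 or 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    for (int i = 0; i < LANES; i++) {
        ptrdiff_t offset = (ptrdiff_t)index[i] * scale;
        /* memcpy keeps the model well defined even for misaligned
         * offsets; the original program crashed on such offsets. */
        memcpy((char *)base + offset, &v[i], sizeof(float));
    }
}

/* Returns 1 if every lane's byte offset is a multiple of sizeof(float),
 * 0 otherwise. */
static int offsets_aligned(const int32_t index[LANES], int scale)
{
    for (int i = 0; i < LANES; i++) {
        ptrdiff_t offset = (ptrdiff_t)index[i] * scale;
        if (offset % (ptrdiff_t)sizeof(float) != 0)
            return 0;
    }
    return 1;
}
```

With index[i] = 15 - i and a scale of 1, `offsets_aligned` returns 0, because offsets such as 15 and 13 are not multiples of 4. With index[i] = (15 - i) * sizeof(int) it returns 1. In this model the same byte offsets also result from leaving the indices as 15 - i and passing a scale of 4.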

Thank you, Leo.

Your suggestion works.

I actually want to scatter 1024 elements into two groups: elements at even indices in one group and elements at odd indices in the other. For instance, if my input array is {0,1,2,3,4,5,6,7,8,9,10}, the output should be {0,2,4,6,8,10,1,3,5,7,9}, or two separate arrays {0,2,4,6,8,10} and {1,3,5,7,9}.

One possible solution is to create an index array = {0,6,1,7,2,8,3,9,4,10,5}.

But that means doing a strided access. Is there any intrinsic that serves this purpose?
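The index-array idea in this post can be sketched in scalar C for any length. The helper names `even_odd_dest_index` and `scatter_by_index` are hypothetical. The sketch assumes the index array holds, for each source position, the destination element position, with even positions packed first and odd positions after them.

```c
#include <stddef.h>

/* Hypothetical helper: for each source position i in [0, n), compute the
 * destination position that places even-indexed elements first and
 * odd-indexed elements after them. */
static void even_odd_dest_index(int *dest, size_t n)
{
    size_t n_even = (n + 1) / 2;  /* number of even positions 0, 2, 4, ... */
    for (size_t i = 0; i < n; i++) {
        dest[i] = (int)((i % 2 == 0) ? i / 2 : n_even + i / 2);
    }
}

/* Scalar scatter driven by the index array: out[dest[i]] = in[i]. */
static void scatter_by_index(const int *in, int *out,
                             const int *dest, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[dest[i]] = in[i];
    }
}
```

For the 11-element example, `even_odd_dest_index` produces {0,6,1,7,2,8,3,9,4,10,5}, and scattering {0,...,10} with it gives {0,2,4,6,8,10,1,3,5,7,9}. These are element positions, so before being passed to a scatter with a scale of 1 they would still need to be multiplied by the element size, as in the earlier reply.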

Hello Shiva,

Have you tried looking at either the permute instruction (_mm512_mask_permutevar_epi32) or the swizzle instructions (_mm512_swizzle_*)? With proper masks you should be able to get what you want.

For example, load groups of 16 elements of your input array into a register and use permute or swizzle to populate two vector registers: one for the odd subset and another for the even subset.

I hope this helps,

Leo.
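The permute approach suggested above can be illustrated with a scalar sketch in plain C. It does not call the intrinsics: `permute_model` and `deinterleave` are hypothetical names, and the model assumes a 16-lane permute that selects dst[i] = src[idx[i]]. It also assumes the element count is a multiple of 16, which holds for the 1024-element case in the question.

```c
#include <stdint.h>

#define LANES 16

/* Hypothetical scalar model of a 16-lane permute:
 * dst[i] = src[idx[i] & 15]. */
static void permute_model(int32_t dst[LANES], const int32_t src[LANES],
                          const int32_t idx[LANES])
{
    for (int i = 0; i < LANES; i++) {
        dst[i] = src[idx[i] & (LANES - 1)];
    }
}

/* Splits n elements (n assumed to be a multiple of 16) into even[] and
 * odd[], each receiving n / 2 elements. */
static void deinterleave(const int32_t *in, int32_t *even,
                         int32_t *odd, int n)
{
    /* Lanes 0..7 select the even positions, lanes 8..15 the odd ones. */
    static const int32_t idx[LANES] = {0, 2, 4, 6, 8, 10, 12, 14,
                                       1, 3, 5, 7, 9, 11, 13, 15};
    int32_t tmp[LANES];

    for (int base = 0; base < n; base += LANES) {
        permute_model(tmp, in + base, idx);
        for (int k = 0; k < LANES / 2; k++) {
            even[base / 2 + k] = tmp[k];             /* low half: evens */
            odd[base / 2 + k]  = tmp[LANES / 2 + k]; /* high half: odds */
        }
    }
}
```

Each 16-element group yields 8 even-position and 8 odd-position values, so two permuted groups are needed to fill one full register of each kind. The sequential access pattern is the reason to consider this over a scatter with a strided index array.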
