Adding vectors with MIC

Is the following code sample for adding vectors correct? Can I make it even faster using vectorized operations?

void vectorAdd(float *a, float *b, float *r, int size)
{
   #pragma offload target(mic) in(a:length(size)) in(b:length(size)) inout(r:length(size))
   #pragma omp parallel for shared(a,b,r)
   for (int i = 0; i < size; ++i)   /* i is declared in the loop, so it is already private */
   {
      r[i] = a[i] + b[i];
   }
}

You probably need to declare the pointers as float *restrict a, float *restrict b, float *restrict r (or use one of the ivdep pragmas) to get auto-vectorization. Alignment would help if you make all the OpenMP chunks a multiple of 32.
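
As a rough sketch of the restrict suggestion, the function from the question might look like the following. The name vectorAddRestrict is made up for illustration, the offload clauses are copied unchanged from the question, and the __INTEL_OFFLOAD guard assumes the Intel compiler defines that macro when offload is enabled, so that other C99 compilers simply skip the pragma.

```c
#include <assert.h>

/* Hypothetical variant of vectorAdd from the question.
   restrict promises the compiler that a, b and r never overlap,
   which removes the aliasing doubt that can block auto-vectorization. */
void vectorAddRestrict(float *restrict a, float *restrict b,
                       float *restrict r, int size)
{
    assert(size >= 0);

#ifdef __INTEL_OFFLOAD
    /* Assumption: only seen by an Intel compiler with offload enabled. */
    #pragma offload target(mic) in(a:length(size)) in(b:length(size)) inout(r:length(size))
#endif
#ifdef _OPENMP
    /* The loop index is declared in the for statement, so it is private. */
    #pragma omp parallel for
#endif
    for (int i = 0; i < size; ++i) {
        r[i] = a[i] + b[i];
    }
}
```

Whether the loop actually vectorizes still depends on the compiler and its options, so it is worth checking the vectorization report rather than assuming it.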

A single offloaded vector operation like this would spend a majority of the time on data transfer.
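
To put a number on that, here is a small back-of-the-envelope sketch that counts the bytes moved by the clauses in the question against the useful work done. The helper names offloadBytes and addsPerByte are hypothetical, and the count assumes each in() array is copied once and the inout() array is copied in both directions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes transferred for n elements with the question's clauses:
   in(a) and in(b) are copied to the coprocessor, inout(r) is copied
   there and back, giving four array-sized copies in total. */
size_t offloadBytes(size_t n)
{
    assert(n <= SIZE_MAX / (4 * sizeof(float)));  /* guard against overflow */
    return 4 * n * sizeof(float);
}

/* Floating-point additions performed per byte transferred. */
double addsPerByte(size_t n)
{
    return n == 0 ? 0.0 : (double)n / (double)offloadBytes(n);
}
```

With 4-byte floats this works out to one addition for every 16 bytes transferred, whatever the array size, which is why the copy tends to dominate. Since r is only written, declaring it with out() instead of inout() should drop one of the four copies.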
