Parallel computing

Errors when running a program using _Cilk

My program mostly uses _Cilk_shared for structs, global variables, and functions. When I run it, I get the following errors:

HOST--ERROR:myoiScifGetRecvID: Call recv() Header Failed! for source: 1, errno = 104

HOST--ERROR:myoiScifSend: Call send() Failed! errno = 104

HOST--ERROR:myoiSend: Fail to send message!

HOST--ERROR:_myoiWatchdogDaemon: could not send to target: 1

offload error: process on device 0 was terminated by signal 11 (SIGSEGV)

Warnings while compiling

When I compile my programs, I get the following warnings:

ipo: warning #11010: *MIC* file format not recognized for /usr/lib64/libm.so

The same warning appears for libpthread.so.0, libc.so.6, ld-linux-x86-64.so.2, and libdl.so,

and

x86_64-k1om-linux-ld: skipping incompatible /usr/lib64/libm.so when searching for -lm

same for -lpthread, -lc, -ldl

Why do I get these? Thanks.

Assembler support for the MIC when porting compilers

I am about to port the Vector Pascal compiler to target the MIC. While a quick port supporting automatic multi-core parallelism should be easy, supporting the SIMD instructions is harder. I need to know whether the GNU assembler distributed with the system has been extended to recognise the SIMD opcodes of the new instruction set. If it has not, I will need to go through the much more laborious, but still feasible, process of generating the new instructions as assembler macros. I get my system later this week.

How to convert __m512 to float

Is there an easy way to extract component 0 from an __m512 vector?

Looking at the assembly generated for _mm512_reduce_gmin_ps, it actually computes a full __m512 (of course), which is then passed to scalar operations.

I tried the following:

static inline float _mm512_get_first_ps(__m512 v)
{
    return v.__m512_f32[0];
}

but this does not work.
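One portable way to read lane 0, sketched here under assumptions rather than as the intended answer, is to store the vector into an aligned buffer and read the first float back. The helper names `first_lane_ps` and `first_lane_demo` are hypothetical. The sketch assumes the compiler defines `__MIC__` or `__AVX512F__` when 512-bit intrinsics are available, and it falls back to a plain array otherwise so that the file still builds on other hosts.

```c
#include <stddef.h>

#if defined(__MIC__) || defined(__AVX512F__)
#include <immintrin.h>

/* Spill the 16 lanes to a 64-byte-aligned buffer and return lane 0. */
static inline float first_lane_ps(__m512 v)
{
    float lanes[16] __attribute__((aligned(64)));
    _mm512_store_ps(lanes, v);   /* aligned store of all 16 floats */
    return lanes[0];
}

/* Hypothetical demo: broadcast x to every lane, then read lane 0 back. */
float first_lane_demo(float x)
{
    return first_lane_ps(_mm512_set1_ps(x));
}
#else
/* Fallback without 512-bit intrinsics: model the vector as a plain array. */
float first_lane_demo(float x)
{
    float lanes[16];
    for (size_t i = 0; i < 16; ++i)
        lanes[i] = x;
    return lanes[0];
}
#endif
```

Going through memory costs a store and a load, so this is a correctness sketch and not a tuned solution; whether the compiler keeps the value in registers depends on the optimizer.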

Using a jagged array on host and coprocessor

I have the following code:

// test2.c
#pragma offload_attribute (target(mic))
int phi()
{
 return data[0][0]+data[0][1];
}

int main()
{
 int i,j;
 int **data;
 data=(int**)malloc(3*sizeof(int*));
 data[0]=(int*)malloc(2*sizeof(int));
 data[1]=(int*)malloc(4*sizeof(int));
 data[2]=(int*)malloc(8*sizeof(int));
 for(i=0; i<2; i++) data[0][i]=10;
...
// done filling data
 #pragma offload target(mic)
 j=phi();
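A common workaround for jagged arrays, sketched here as an assumption and not as the poster's solution, is to pack all rows into one contiguous buffer plus an offsets table, since flat buffers with known lengths are much easier to move between host and coprocessor than an array of separately allocated row pointers. The names `flat_jagged`, `flatten`, `flat_at`, and `phi_flat` are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container: every row packed into one buffer. */
typedef struct {
    int    *values;   /* concatenated row contents */
    size_t *offsets;  /* offsets[r] = start of row r; offsets[rows] = total */
    size_t  rows;
} flat_jagged;

/* Pack a jagged array into a flat buffer.
   Returns 0 on success, -1 if an allocation fails. */
int flatten(int **data, const size_t *row_len, size_t rows, flat_jagged *out)
{
    size_t total = 0;

    out->rows = rows;
    out->values = NULL;
    out->offsets = malloc((rows + 1) * sizeof *out->offsets);
    if (!out->offsets)
        return -1;

    /* Prefix sum of the row lengths gives each row's start index. */
    for (size_t r = 0; r < rows; ++r) {
        out->offsets[r] = total;
        total += row_len[r];
    }
    out->offsets[rows] = total;

    out->values = malloc((total ? total : 1) * sizeof *out->values);
    if (!out->values) {
        free(out->offsets);
        out->offsets = NULL;
        return -1;
    }

    for (size_t r = 0; r < rows; ++r)
        memcpy(out->values + out->offsets[r], data[r],
               row_len[r] * sizeof(int));
    return 0;
}

/* Read element (row, col) from the packed representation. */
int flat_at(const flat_jagged *f, size_t row, size_t col)
{
    return f->values[f->offsets[row] + col];
}

/* Same computation as phi() in the post, on the packed data. */
int phi_flat(const flat_jagged *f)
{
    return flat_at(f, 0, 0) + flat_at(f, 0, 1);
}
```

With the data in this form, `values` and `offsets` are two ordinary arrays of known length, so they could be named in the data clauses of an offload pragma (for example an `in(... : length(...))` clause, assuming the Intel compiler's offload syntax) instead of relying on a pointer-to-pointer being followed on the device.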

Compiling a simple program

I have the following programs:

// test.f90
program test
implicit none
integer::i
integer,dimension(100)::arr
!dir$ attributes offload:mic :: calc
real,external::calc
!dir$ omp offload target(mic)
!$omp parallel do
do i=1,100
    arr(i)=calc(i)
end do
!$omp end parallel do
end program test

//test.c
#include <offload.h>
#pragma offload_attribute (target(mic))
float calc_(int *data)
{
 return *data+1;

Why is Cilk™ Plus not speeding up my program? (Part 1)

In this article, I discuss some common performance pitfalls in Cilk™ Plus programs that prevent users from seeing speedups in their code, and describe some techniques for avoiding these pitfalls.