OpenCL on Xeon Phi

I am having problems compiling some OpenCL examples on the Xeon Phi. When I use the CCFLAG option -mmic, I get the following build error:

x86_64-k1om-linux-ld: cannot find -lOpenCL

However, when I remove the -mmic option the code builds. So do I need to use the -mmic flag to build OpenCL code that runs efficiently on the Xeon Phi?

Also, is there a webpage that describes how to build and run efficient OpenCL code on the Xeon Phi?

Thanks, David

Hi David,

The -mmic option creates an application that runs natively on the Xeon Phi, so you don't need to specify it when building an OpenCL application.

The starting point for OpenCL on Xeon Phi is:
http://software.intel.com/en-us/vcsource/tools/opencl-sdk-xe

Specifically,
user guide: http://software.intel.com/sites/products/documentation/ioclsdk/2013XE/UG...
optimization guide: http://software.intel.com/sites/products/documentation/ioclsdk/2013XE/OG...
support forum: http://software.intel.com/en-us/forums/intel-opencl-sdk/

Thanks,
Yuri

Hello,

I am having some problems vectorizing (float16) the prefix sum kernel using OpenCL on the Intel Xeon Phi.

I was able to get it working for the float data type, but the profiling numbers seem pretty high.

Could anybody suggest an example of this?

Regards

Rishab Goel

KNC doesn't have adequate native support for float16, to my knowledge, so it seems academic to attempt vectorization. Jim Dempsey posted suggestions for vectorization with native data types. It seems simpler to me to settle for the roughly 50% speedup over the plain sequential implementation, which can be obtained with a sort of unroll and jam with the recursion penalty taken only every 4th element. It's certainly not something which shows KNC in a good light.
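
For readers unfamiliar with the idea, here is a minimal sketch of what "taking the recursion penalty only every 4th element" could look like. It is plain C rather than an OpenCL kernel, it is not the code from this thread, and the function names scan_seq and scan_unroll4 are hypothetical. The partial sums inside each group of four do not depend on the running total, so the loop-carried dependency is paid once per group. Note that regrouping the additions can change floating-point rounding slightly compared with the sequential version, and no particular speedup is implied by this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Plain sequential inclusive prefix sum: each output waits on the previous one. */
void scan_seq(const float *in, float *out, size_t n) {
    assert(in != NULL && out != NULL);
    float run = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        run += in[i];
        out[i] = run;
    }
}

/* Hypothetical unrolled variant: the running total 'run' is updated
   once per group of 4 elements instead of once per element. */
void scan_unroll4(const float *in, float *out, size_t n) {
    assert(in != NULL && out != NULL);
    float run = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        /* Partial sums within the group; none of these depend on 'run'. */
        float s01   = in[i] + in[i + 1];
        float s23   = in[i + 2] + in[i + 3];
        float s012  = s01 + in[i + 2];
        float s0123 = s01 + s23;

        out[i]     = run + in[i];
        out[i + 1] = run + s01;
        out[i + 2] = run + s012;
        out[i + 3] = run + s0123;

        run += s0123;  /* the only loop-carried update */
    }
    /* Remainder elements, handled sequentially. */
    for (; i < n; ++i) {
        run += in[i];
        out[i] = run;
    }
}
```

Both functions compute the same inclusive scan; for inputs whose sums are exactly representable (small integers, for example) the results match bit for bit.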

Hello Tim,

In your experience, how much gain could we get on such a parallel prefix sum kernel?

Instead of using float16, could I use for loops and rely on the compiler to vectorize?

Regards

Rishab Goel
