OpenCL on Xeon Phi

I am having problems compiling some OpenCL examples on the Xeon Phi. When I use the CCFLAG option -mmic, I get the following build error:

x86_64-k1om-linux-ld: cannot find -lOpenCL

However, when I remove the -mmic option, the code builds. Do I need to use the -mmic flag to build OpenCL code that runs efficiently on the Xeon Phi?

Also, is there a webpage that describes how to build and run efficient OpenCL code on the Xeon Phi?

Thanks,
David

Hi David,

The -mmic option creates an application that runs natively on the Xeon Phi, so you don't need to specify it when building an OpenCL application.

The starting point for OpenCL on Xeon Phi is:
http://software.intel.com/en-us/vcsource/tools/opencl-sdk-xe

Specifically,
user guide: http://software.intel.com/sites/products/documentation/ioclsdk/2013XE/UG...
optimization guide: http://software.intel.com/sites/products/documentation/ioclsdk/2013XE/OG...
support forum: http://software.intel.com/en-us/forums/intel-opencl-sdk/

Thanks,
Yuri

Hello,

I am having some problems vectorizing (with float16) the prefix sum kernel using OpenCL on the Intel Xeon Phi.

I am able to get it working for the float data type, but the profiling numbers seem pretty high.

Could anybody please suggest an example for this?

Regards

Rishab Goel

KNC doesn't have adequate native support for float16, to my knowledge, so it seems academic to attempt vectorization. Jim Dempsey posted suggestions for vectorization with native data types. It seems simpler to me to settle for the roughly 50% speedup over the plain sequential implementation, which can be obtained with a sort of unroll and jam, with the recursion penalty taken only every 4th element. It's certainly not something which shows KNC in a good light.
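
For readers following along, here is a minimal sketch in plain C of one way to read the "recursion penalty taken only every 4th element" idea. It is not code from this thread: the function names scan_seq and scan_by4 are made up for illustration, and the block size of 4 is simply taken from the description above. The partial sums inside each group of four do not depend on the running total, so the loop-carried dependency is updated once per group instead of once per element. Note that this reassociates the floating-point additions, so results can differ from the sequential version in the last bits.

```c
#include <stddef.h>

/* Plain sequential inclusive prefix sum: out[i] = in[0] + ... + in[i]. */
void scan_seq(const float *in, float *out, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        acc += in[i];
        out[i] = acc;
    }
}

/* Hypothetical blocked variant: the running total (carry) is the only
 * loop-carried value, and it is updated once per group of 4 elements. */
void scan_by4(const float *in, float *out, size_t n)
{
    float carry = 0.0f;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        /* Local partial sums; none of these depend on carry. */
        float s0 = in[i];
        float s1 = s0 + in[i + 1];
        float s2 = s1 + in[i + 2];
        float s3 = s1 + (in[i + 2] + in[i + 3]);

        /* Four independent additions of the same carry. */
        out[i]     = carry + s0;
        out[i + 1] = carry + s1;
        out[i + 2] = carry + s2;
        out[i + 3] = carry + s3;

        /* The recursion step, taken once per 4 elements. */
        carry += s3;
    }

    /* Remainder (fewer than 4 elements left): fall back to the simple loop. */
    for (; i < n; ++i) {
        carry += in[i];
        out[i] = carry;
    }
}
```

Whether this gives anything like the speedup mentioned above depends on the compiler and hardware; it would need to be measured on the target.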

Hello Tim,

In your experience, how much gain could we get on such a parallel prefix sum kernel?

Instead of using float16, could I use for loops and rely on the compiler to vectorize?

Regards

Rishab Goel
