Compiling OpenCL 2.0 atomics


I am trying to compile a simple kernel that uses OpenCL 2.0 atomics, with exactly the device, driver, and kernel described in:

However, I cannot even get the kernel to compile, as it does not seem to recognize the atomic types and functions. My error log (along with some environment info) is:

Change image format of image object in OpenCL

I have written a short piece of code to create an image object in OpenCL, as shown below:

img_fmt.image_channel_order = CL_R;
img_fmt.image_channel_data_type = CL_UNSIGNED_INT8;      
memobj_in_luma = clCreateImage2D(p->context, CL_MEM_READ_ONLY, &img_fmt, p->width, p->height, 0, NULL, &ret);

After creating this object I want to change the image format to CL_RGBA. Is there any way to do this?

HD4400 bitwise and operation on uchar2 data

We are seeing different results when implementing a "bitwise and" operation in an OpenCL kernel working on uchar2 data. OpenCL kernel code like this:
uchar2 val1;
uchar2 val2;
uchar2 res;
res = val1 & val2;

produces wrong results, while code like this:

res = (uchar2)(val1.x & val2.x, val1.y & val2.y);
produces the correct result.

BTW, the same behaviour was detected for bitwise or/xor and for uchar3/uchar4 data, although the attached test case was prepared only for "bitwise and" on uchar2 data.
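For comparing device output against a known-good value, a host-side reference can help. The following is only a minimal sketch in plain C: `host_uchar2`, `and_uchar2`, and `matches_reference` are hypothetical names introduced here and are not part of the attached test case. It relies on the fact that OpenCL C applies `&` to vector operands component by component.

```c
/* Hypothetical host-side stand-in for the OpenCL C uchar2 type. */
typedef struct {
    unsigned char x;
    unsigned char y;
} host_uchar2;

/* Component-wise AND: the result expected from `val1 & val2` on uchar2. */
static host_uchar2 and_uchar2(host_uchar2 a, host_uchar2 b)
{
    host_uchar2 r;
    r.x = (unsigned char)(a.x & b.x);
    r.y = (unsigned char)(a.y & b.y);
    return r;
}

/* Returns 1 if a value read back from the device equals the host reference,
   0 otherwise. */
static int matches_reference(host_uchar2 a, host_uchar2 b,
                             host_uchar2 device_result)
{
    host_uchar2 ref = and_uchar2(a, b);
    return ref.x == device_result.x && ref.y == device_result.y;
}
```

Running every element read back from the device through `matches_reference` would show which inputs diverge, which may help narrow the problem down for a bug report.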

What Resources Are Available to Developers Using Intel® INDE?

New to Intel® Integrated Native Developer Experience (also called Intel® INDE)? Are you confused as to what is available and how it can help you? INDE is a cross-platform productivity suite that provides developers with many resources and tools to develop, debug, and optimize their applications while achieving native performance and look-and-feel. In this article, we will introduce the components of INDE and show how developers can use them to create new applications and optimize existing ones. To start with, Intel® INDE provides support for IDE integration.

setting work_group_size crashes OpenCL on Intel CPU


I am porting the reduction kernel from the AMD APP SDK.

It requires setting the work-group size when you execute clEnqueueNDRangeKernel. With a local_work_size different from 8, it crashes directly in TBB on Intel OpenCL for Intel CPU, even though clEnqueueNDRangeKernel itself launches the kernel successfully.

When you request the work-group size from the device, it returns 8192 (it should be 8 in this case), and the kernel work-group size is 2048. It crashes with both settings.

It works only when local_work_size is set to the number of cores.

I have an Intel Haswell 4770K.
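As a side note on choosing local_work_size: in OpenCL 1.x the global work size must be evenly divisible by the local work size, and the local size must not exceed the value reported for the kernel by clGetKernelWorkGroupInfo with CL_KERNEL_WORK_GROUP_SIZE. A minimal sketch of one way to pick such a value is below. `pick_local_size` is a hypothetical helper, not something from the AMD sample, and it does not explain the crash itself.

```c
#include <stddef.h>

/* Hypothetical helper: returns the largest local size that evenly divides
   global_size and does not exceed max_wg_size (for example the value of
   CL_KERNEL_WORK_GROUP_SIZE). Returns 0 if either argument is 0. */
static size_t pick_local_size(size_t global_size, size_t max_wg_size)
{
    size_t local;

    if (global_size == 0 || max_wg_size == 0) {
        return 0;
    }

    /* Start from the smaller of the two limits and walk down to a divisor.
       The loop always terminates because 1 divides every global size. */
    local = (max_wg_size < global_size) ? max_wg_size : global_size;
    while (global_size % local != 0) {
        --local;
    }
    return local;
}
```

With the kernel limit of 2048 mentioned above and a global size of 4096, this would select 2048. A reduction kernel may additionally assume a power-of-two local size, which this sketch does not enforce.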

HD4400 clEnqueueCopyBufferRect issue?


We've detected suspicious behaviour in the clEnqueueWriteBufferRect/clEnqueueCopyBufferRect functions, demonstrated by the simple test case attached. The test case depends on the OpenCL API only. It works correctly on AMD Tahiti but not on Intel HD4400 or HD4600.


The problem is in copying a rectangle of interest, with some specific parameters, from the whole image, which is kept in a cl buffer.

A short description of the test case:

1. Create an OpenCL buffer for the whole image (not initialized)
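To check what a rectangle copy should produce, a host-side reference can be compared against the buffer read back from the device. The sketch below is plain C and is not taken from the attached test case; `copy_rect_ref` is a hypothetical name. It follows the addressing rule the OpenCL specification gives for clEnqueueCopyBufferRect: the byte offset of a region is origin[2] * slice_pitch + origin[1] * row_pitch + origin[0], with region[0] given in bytes, and a pitch of 0 meaning "tightly packed".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical host reference for a buffer-rect copy. Origins and region use
   the (bytes, rows, slices) layout of the OpenCL rect-copy calls. */
static void copy_rect_ref(const unsigned char *src, unsigned char *dst,
                          const size_t src_origin[3],
                          const size_t dst_origin[3],
                          const size_t region[3],
                          size_t src_row_pitch, size_t src_slice_pitch,
                          size_t dst_row_pitch, size_t dst_slice_pitch)
{
    size_t z, y;

    /* A pitch of 0 defaults to the tightly packed value. */
    if (src_row_pitch == 0)   src_row_pitch = region[0];
    if (dst_row_pitch == 0)   dst_row_pitch = region[0];
    if (src_slice_pitch == 0) src_slice_pitch = region[1] * src_row_pitch;
    if (dst_slice_pitch == 0) dst_slice_pitch = region[1] * dst_row_pitch;

    for (z = 0; z < region[2]; ++z) {
        for (y = 0; y < region[1]; ++y) {
            const unsigned char *s = src
                + (src_origin[2] + z) * src_slice_pitch
                + (src_origin[1] + y) * src_row_pitch
                + src_origin[0];
            unsigned char *d = dst
                + (dst_origin[2] + z) * dst_slice_pitch
                + (dst_origin[1] + y) * dst_row_pitch
                + dst_origin[0];
            memcpy(d, s, region[0]); /* copy one row of the rectangle */
        }
    }
}
```

A byte-by-byte comparison of this result with the device output would show at which row or offset the two start to differ.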

GPU HD4600 opencl kernel problem

Hi, I am compiling a SPIR kernel offline.

When I use it on the HD4600 GPU, I get the following when I invoke clBuildProgram:

error: IGILTargetLowering::Call(): unhandled function call!

Call made to: _Z13get_global_idj()
0x7c53480: i64 = GlobalAddress<i64 (i32)* @_Z13get_global_idj> 0 [ORD=1]
error: midlevel compiler failed build.

The same kernel works fine on an AMD GPU and on the Intel CPU. It also works fine if the kernel is compiled as spir64.
