Writing to a Shared Resource

According to the OpenCL™ specification, commands that change the content of a shared memory object must complete in one command queue before commands in another command queue use that memory object. One way to achieve this is to use events:

cl_event eventGuard;
cl_mem bufferShared = clCreateBuffer(shared_context, CL_MEM_READ_WRITE…);
// Populate the buffer from the host; cpu_queue is a regular in-order queue
clEnqueueWriteBuffer(cpu_queue, bufferShared,…);
// Set the arguments and process the buffer with a kernel
clSetKernelArg(kernel, 0, sizeof(cl_mem), (void*)&bufferShared);
…
clEnqueueNDRangeKernel(cpu_queue, kernel, … &eventGuard);
// Make sure the first device is done
clWaitForEvents(1, &eventGuard);
// Alternatively, you can use clFinish(cpu_queue) if in the same thread
…
// Now use the buffer on the second device
clEnqueueWriteBuffer(gpu_queue, bufferShared,…);
clEnqueueNDRangeKernel(gpu_queue, kernel, … &eventGuard);
…

If you want to write data (or output kernel results) to the same buffer simultaneously on two devices, use properly aligned, non-overlapping sub-buffers.

cl_mem bufferShared = clCreateBuffer(shared_context, CL_MEM_WRITE_ONLY, …);
// Query the base address alignment of the respective devices
cl_uint gpu_align; // CL_DEVICE_MEM_BASE_ADDR_ALIGN is reported as a cl_uint, in bits
clGetDeviceInfo(gpuDeviceId, CL_DEVICE_MEM_BASE_ADDR_ALIGN,…&gpu_align);
gpu_align /= 8; // convert bits to bytes
//make sure that cpuPortion is properly aligned first!
cl_buffer_region cpuBufferRegion = { 0, cpuPortion};
cl_buffer_region gpuBufferRegion = { cpuPortion, theRest};
cl_mem subbufferCPU = clCreateSubBuffer(bufferShared, 0,
        CL_BUFFER_CREATE_TYPE_REGION, &cpuBufferRegion, &err);
cl_mem subbufferGPU = clCreateSubBuffer(bufferShared, 0,
        CL_BUFFER_CREATE_TYPE_REGION, &gpuBufferRegion, &err);
// Now work with 2 sub-buffers on 2 devices simultaneously (refer to the previous section)
…
//the sub-resources should be released properly
clReleaseMemObject(subbufferCPU);
clReleaseMemObject(subbufferGPU);
clReleaseMemObject(bufferShared);
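
The listing above assumes that cpuPortion is already a multiple of the device alignment. The following is a minimal sketch of one way to compute such a split on the host. It is an illustration, not part of the OpenCL API: the names region_t, align_up, and split_buffer are hypothetical, and region_t is only a stand-in with the same two fields as cl_buffer_region so that the sketch compiles without the OpenCL headers. It also assumes that the caller passes the stricter of the alignment values reported for the devices involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for cl_buffer_region (origin and size, in bytes). */
typedef struct {
    size_t origin;
    size_t size;
} region_t;

/* Round value up to the next multiple of alignment (alignment must be > 0). */
size_t align_up(size_t value, size_t alignment)
{
    assert(alignment > 0);
    return ((value + alignment - 1) / alignment) * alignment;
}

/* Split a buffer of total_size bytes into a CPU region that starts at 0 and
 * a GPU region that starts at an aligned offset.
 * align_bits is the value reported for CL_DEVICE_MEM_BASE_ADDR_ALIGN, which
 * is expressed in bits.
 * Returns 0 on success, or -1 if the aligned split leaves either region empty. */
int split_buffer(size_t total_size, size_t cpu_request, unsigned align_bits,
                 region_t *cpu, region_t *gpu)
{
    size_t align_bytes = align_bits / 8;   /* bits to bytes */
    if (align_bytes == 0)
        return -1;

    size_t cpu_portion = align_up(cpu_request, align_bytes);
    if (cpu_portion == 0 || cpu_portion >= total_size)
        return -1;

    cpu->origin = 0;
    cpu->size   = cpu_portion;
    gpu->origin = cpu_portion;              /* aligned start of the GPU part */
    gpu->size   = total_size - cpu_portion; /* the rest of the buffer */
    return 0;
}
```

For example, for a 4096-byte buffer, a requested CPU share of 1000 bytes, and an alignment of 1024 bits (128 bytes), the sketch yields the regions { 0, 1024 } and { 1024, 3072 }, which could then be used to fill cpuBufferRegion and gpuBufferRegion.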

See Also

The OpenCL™ 1.2 Specification at http://www.khronos.org/registry/cl/specs/opencl-1.2.pdf
