I've found a bug in the 2013 beta Linux build: clSetKernelArg fails for an int3 argument when sizeof(cl_int3) is passed as the size. It accepts 3 * sizeof(cl_int) instead, but the CL spec is clear that the API type corresponding to int3 is cl_int3, not cl_int[3] (see the table at the start of section 6.1.2 of the CL 1.1 spec). The two sizes differ because cl_int3 is padded to 4 * sizeof(cl_int).
Both the NVIDIA and AMD OpenCL implementations expect the size to be sizeof(cl_int3).
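
To make the size mismatch concrete, here is a minimal sketch that builds without the OpenCL headers. The names my_cl_int, my_cl_int3, and the three helper functions are hypothetical stand-ins, and the union layout assumes what the report describes: cl_int3 occupies four cl_int slots. In a real program the call in question would be clSetKernelArg(kernel, arg_index, sizeof(cl_int3), &value).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for cl_int (a 32-bit signed integer). */
typedef int32_t my_cl_int;

/* Hypothetical stand-in for cl_int3. Assumption: the host type is
 * padded to four components, as described in the report above. */
typedef union {
    my_cl_int s[4];
} my_cl_int3;

/* Size a spec-conforming caller passes for an int3 kernel argument. */
static size_t int3_arg_size_padded(void)
{
    return sizeof(my_cl_int3);
}

/* Size the affected build accepts instead. */
static size_t int3_arg_size_packed(void)
{
    return 3 * sizeof(my_cl_int);
}

/* Sanity check: the two sizes differ by exactly one component. */
static void check_int3_sizes(void)
{
    assert(int3_arg_size_padded() == 4 * sizeof(my_cl_int));
    assert(int3_arg_size_padded() - int3_arg_size_packed() == sizeof(my_cl_int));
}
```

Calling check_int3_sizes() from main passes silently under these assumptions. A host program that passes the packed size would work on the affected build but be rejected by implementations that follow the spec.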



