Intel OpenCL SDK: clGetKernelWorkGroupInfo return value

Hi 

1. I have a global work size of 1024 by 1024.

2. I set the local work size to 16 by 16. 

3. My CPU OpenCL device has a maximum work-group size of 8192.

4. I call clEnqueueNDRangeKernel with the desired local work size (along with all other necessary parameters).

5. I call:

      a. clGetKernelWorkGroupInfo(kernel, device, CL_KERNEL_WORK_GROUP_SIZE, sizeof(size_t), (void*)&workGroupSizeUsed, NULL);

      b. clGetKernelWorkGroupInfo(kernel, device, CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE, sizeof(size_t), (void*)&workGroupSizeUsed, NULL);

6. Both calls return 8192. How is this possible?

My expectation is 16 - the value that I passed to it. 

Any help?


Neither query reports the local work size you passed to clEnqueueNDRangeKernel. Querying clGetKernelWorkGroupInfo with CL_KERNEL_WORK_GROUP_SIZE returns the maximum work-group size supported for that kernel on that device, as determined by its resource utilization (e.g., private and local memory) or the kernel attribute max_work_group_size. CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE, on the other hand, is a performance hint that often maps to the device's underlying SIMD architecture. The two may return the same value, and CPUs seem likely to benefit more from large work-group sizes than from small ones.
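To make the relationship concrete, here is a minimal sketch in plain C (no OpenCL headers needed) of the arithmetic involved. The helper names work_items_per_group, work_group_count and local_size_fits are hypothetical and not part of the OpenCL API. It assumes uniform work-groups, i.e. the global size is divisible by the local size in every dimension. The point is that a 16 by 16 local size means 256 work-items per work-group, and the value returned for CL_KERNEL_WORK_GROUP_SIZE (8192 in the question) is only the upper bound that this product is checked against.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: work-items in one work-group, i.e. the product
 * of the local sizes over all dimensions. */
size_t work_items_per_group(size_t work_dim, const size_t *local)
{
    size_t n = 1;
    for (size_t i = 0; i < work_dim; ++i)
        n *= local[i];
    return n;
}

/* Hypothetical helper: total number of work-groups in the NDRange.
 * Assumes uniform work-groups (global divisible by local per dimension). */
size_t work_group_count(size_t work_dim, const size_t *global,
                        const size_t *local)
{
    size_t n = 1;
    for (size_t i = 0; i < work_dim; ++i) {
        assert(local[i] != 0 && global[i] % local[i] == 0);
        n *= global[i] / local[i];
    }
    return n;
}

/* Hypothetical helper: returns 1 if the chosen local size stays within
 * kernel_max, the value queried with CL_KERNEL_WORK_GROUP_SIZE. */
int local_size_fits(size_t work_dim, const size_t *local, size_t kernel_max)
{
    return work_items_per_group(work_dim, local) <= kernel_max;
}
```

With the numbers from the question, a global size of {1024, 1024} and a local size of {16, 16} give 256 work-items per group and 4096 work-groups, which fits comfortably under 8192. If you need the local size later, keep the array you passed to clEnqueueNDRangeKernel on the host side, since these queries will not hand it back.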
