I wanted to clarify how work groups are mapped to HD 4000 hardware. My understanding is that a workgroup maps to a single EU, and each EU can run multiple workgroups in parallel.
Is this correct?
A workgroup is executed within a half-slice (a collection of EUs), and multiple workgroups can be executed on the same half-slice. So your assumption holds only in some cases: a workgroup is not tied to a single EU and can be spread over several of them.
The work items within a workgroup are distributed in this order:
- first within a SIMD unit
- then spread across EUs
- then spread across threads
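To make that ordering concrete, here is a minimal sketch in C of a toy model that follows it: fill a SIMD unit first, then move to the next EU, then to the next thread slot. The names hw_config, placement and place_work_item are hypothetical, and the SIMD width and EU count are parameters you supply, not values confirmed here. The real dispatcher may not place work this regularly, so treat it as an illustration only.

```c
#include <assert.h>

/* Hypothetical hardware parameters. These are assumptions for illustration,
   not values confirmed for any particular GPU. */
typedef struct {
    unsigned simd_width;          /* work items packed into one SIMD unit (assumed) */
    unsigned eus_per_half_slice;  /* EUs in one half-slice (assumed) */
} hw_config;

/* Where a work item lands in this toy model. */
typedef struct {
    unsigned lane;    /* position inside the SIMD unit */
    unsigned eu;      /* EU index within the half-slice */
    unsigned thread;  /* thread slot on that EU */
} placement;

/* Toy model of the order described above:
   SIMD unit first, then across EUs, then across threads. */
placement place_work_item(unsigned local_id, hw_config cfg)
{
    assert(cfg.simd_width > 0 && cfg.eus_per_half_slice > 0);

    placement p;
    unsigned simd_group = local_id / cfg.simd_width;   /* which SIMD-sized chunk */

    p.lane   = local_id % cfg.simd_width;              /* fills a SIMD unit first */
    p.eu     = simd_group % cfg.eus_per_half_slice;    /* next chunk goes to the next EU */
    p.thread = simd_group / cfg.eus_per_half_slice;    /* wraps to the next thread slot */
    return p;
}
```

For example, with an assumed simd_width of 16 and 8 EUs per half-slice, work item 17 would land on lane 1 of EU 1 in thread slot 0, and work item 130 would wrap around to lane 2 of EU 0 in thread slot 1. In this model a workgroup larger than one SIMD unit already touches more than one EU.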
Hope that makes sense.
For more information, please attend our webinar about Writing Efficient Code for OpenCL Applications on 3rd Generation Intel Core Processors.
Interesting. It is not 100% clear to me right now, but I will look at the docs again and attend the webinar and potentially come back to this question later if it is still not clear :)