I have two questions about the automatic vectorization process in OpenCL on Xeon Phi.
First, what happens when you use a work-group size of 1 with an automatically vectorized kernel? Will it run at the same speed as it would without vectorization?
Second, is it possible to determine whether the automatic vectorization was done by merging multiple work-items into one, or within a single work-item?
I hope my questions are clear enough, but let me know if something is unclear.