Sandy Bridge CPU & Native vector width


I originally tried the SDK on Linux on a dual-socket Harpertown, where CL reports the preferred & native vector width for all datatypes as 16 bytes, i.e. from 16 chars down to 2 doubles. That's what fits inside an XMM register, which is what I expected.

But after checking the values on a Sandy Bridge CPU (i5-2400), I get the same preferred & native sizes. This seems strange to me, as my understanding of the architecture is that if the dataset is large enough and floating-point, one should go for AVX instead of SSE. There is very little support for integer stuff in YMM registers, so I understand that char/short/int/long are still 16/8/4/2, but shouldn't float/double be 8/4 rather than 4/2?
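To spell out the arithmetic behind those numbers, here is a minimal sketch in plain C. It only divides the register size by the element size; it does not query the device. The helper names `lanes` and `print_widths` are made up for illustration and are not part of OpenCL, and the element sizes are the fixed OpenCL sizes (long and double are 8 bytes), not whatever the host C compiler uses.

```c
#include <stddef.h>
#include <stdio.h>

/* Register widths in bytes: XMM (SSE) is 128-bit, YMM (AVX) is 256-bit. */
enum { XMM_BYTES = 16, YMM_BYTES = 32 };

/* Hypothetical helper: how many elements of elem_bytes fit in one register.
   Returns 0 for a zero element size instead of dividing by zero. */
unsigned lanes(unsigned register_bytes, size_t elem_bytes)
{
    return elem_bytes ? (unsigned)(register_bytes / elem_bytes) : 0u;
}

/* Hypothetical helper: print the element counts for the types in the post. */
void print_widths(void)
{
    /* OpenCL scalar sizes in bytes (fixed by the OpenCL C type definitions). */
    static const struct { const char *name; size_t bytes; } types[] = {
        {"char", 1}, {"short", 2}, {"int", 4},
        {"long", 8}, {"float", 4}, {"double", 8},
    };
    size_t i;

    for (i = 0; i < sizeof types / sizeof types[0]; ++i)
        printf("%-6s  XMM: %2u  YMM: %2u\n", types[i].name,
               lanes(XMM_BYTES, types[i].bytes),
               lanes(YMM_BYTES, types[i].bytes));
}
```

Calling `print_widths()` from a `main` lists both columns side by side. The values the runtime actually reports come from `clGetDeviceInfo` with `CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT`, `CL_DEVICE_NATIVE_VECTOR_WIDTH_FLOAT` and the matching constants for the other types.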

Is this deliberate, and if so, why? Or is it just a case of "we haven't had time to implement it yet"?


Best Reply

Thanks for pointing that out.

The behavior you see on the Sandy Bridge CPU is as expected with this Beta.

As the installed base of Sandy Bridge grows and matures into the domains where OpenCL-based floating-point applications are in use, we will extend our support.

So yes, the wider float/double vector widths are not yet implemented and will be added in future versions.