I am trying to do some OpenCL performance optimization for Intel devices. I want to use vectorization and pick the vector data type with the optimal length for a given device. I called clGetDeviceInfo(.., CL_DEVICE_PREFERRED_VECTOR_WIDTH_<type>, ..), but the values it returns do not look optimal at all:
uchar - 1
short - 1
int   - 1
float - 1
I checked this on an Intel HD 4600 GPU and an Intel Core i5-4570 CPU.
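For reference, this is roughly how I query the values (a minimal sketch for the first device of the first platform; error checking is omitted and the device selection is simplified):

    #include <stdio.h>
    #include <CL/cl.h>

    int main(void)
    {
        cl_platform_id platform;
        cl_device_id device;

        /* Take the first platform and its first device (simplified). */
        clGetPlatformIDs(1, &platform, NULL);
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 1, &device, NULL);

        /* Query the preferred vector width for each element type. */
        cl_uint w_char, w_short, w_int, w_float;
        clGetDeviceInfo(device, CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR,
                        sizeof(w_char), &w_char, NULL);
        clGetDeviceInfo(device, CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT,
                        sizeof(w_short), &w_short, NULL);
        clGetDeviceInfo(device, CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT,
                        sizeof(w_int), &w_int, NULL);
        clGetDeviceInfo(device, CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT,
                        sizeof(w_float), &w_float, NULL);

        printf("uchar - %u\nshort - %u\nint - %u\nfloat - %u\n",
               w_char, w_short, w_int, w_float);
        return 0;
    }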
I then tried to find the optimal vector length for my problem experimentally and got the following values:
uchar - 16
short - 8
int   - 1
float - 1
If I use uchar16 instead of uchar, I get a 3x speedup; the sketch below shows the kind of change I made.
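My real kernel is more complex, but the change is essentially of this kind (a simplified element-wise example; the buffer names and the operation are just placeholders for illustration):

    // Scalar version: one uchar element per work-item.
    __kernel void add_scalar(__global const uchar *a,
                             __global const uchar *b,
                             __global uchar *out)
    {
        size_t i = get_global_id(0);
        out[i] = a[i] + b[i];
    }

    // Vectorized version: one uchar16 per work-item, i.e. 16 elements
    // at once. The global work size must be 1/16 of the element count.
    __kernel void add_vector(__global const uchar16 *a,
                             __global const uchar16 *b,
                             __global uchar16 *out)
    {
        size_t i = get_global_id(0);
        out[i] = a[i] + b[i];
    }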
I have two questions:
1. Why does clGetDeviceInfo(.., CL_DEVICE_PREFERRED_VECTOR_WIDTH_<type>, ..) return these values?
2. Is it possible that these values will be changed in future driver releases? That would make cross-platform optimization possible.